All posts by Don

2008 Linguapax Prize winner: Neville Alexander

The recipient of the Linguapax Prize for 2008 is Dr. Neville Alexander of South Africa. The prize is awarded annually (since 2000) in recognition of contributions to linguistic diversity and multilingual education.

Although the Linguapax site does not at this writing have updated information, the website of the UNESCO Centre of Catalonia (which is connected with Linguapax) has this press release dated 22.02.2008:

The South African linguist Neville Alexander will receive the Linguapax Award today in Barcelona, on the occasion of the Mother Language Day. The ceremony is framed in the Intercultural Week organised by the Ramon Llull University. Alexander, who coordinates the Project for the Study of Alternative Education in South Africa has devoted more than twenty years of his professional life to defend and preserve multilingualism in the post-apartheid South Africa and has become one of the major advocates of linguistic diversity.

There is various material online about Dr. Alexander including:

I don’t want to be negative about the Linguapax Institute’s efforts, but publicity about this really has been lacking. An email request to Linguapax for more information received no reply. I hope to have more information about Linguapax and its important work in a future posting.


2008 Year of the Frog

[Image: Logo of the Year of the Frog]

Anyone can declare a year of something, and several conservation organizations have combined to declare 2008 the Year of the Frog. They’re calling attention to the importance of amphibians (apparently about half the species in the world are threatened or endangered), and to an “Amphibian Conservation Action Plan,” as part of which something called an “Amphibian Ark” is planned, “in which select species that would otherwise go extinct will be maintained in captivity until they can be secured in the wild.”

Oh, and this being “leap day,” some have apparently jumped at the chance to call it International Day of the Frog.


International Year of the Potato

[Image: IYP logo, with permission]

The UN has given 2008 several designations, and the International Year of the Potato (IYP) is another of them.* The reason for IYP is given as follows:

The celebration of the International Year of the Potato (IYP) will raise awareness of the importance of the potato – and of agriculture in general – in addressing issues of global concern, including hunger, poverty and threats to the environment.

The origin of IYP was apparently a proposal by Peru within the UN Food and Agriculture Organization (FAO), which eventually resulted in a UN General Assembly resolution.

The IYP website is nicely organized, with information in the 6 UN languages, including activities for children.

* I’ve previously commented on the International Year of Planet Earth, and for the International Year of Languages, I have commented briefly and devoted a section of this website to it.


Reflecting on “Computing’s Final Frontier”

In the March issue of PC Magazine, John Dvorak comments on four areas of computer technology in his column entitled “Computing’s Final Frontier”: voice recognition; machine translation (MT); optical character recognition (OCR); and spell-checkers. Basically he’s decrying how little progress has been made on these in recent years relative to the vast improvements in computer capacities.

I’d like to comment briefly on all four. Two of those – voice recognition, or actually speech recognition, and MT – are areas that I think have particular importance and potential for non-dominant languages (what I’ve referred to elsewhere as “MINELs,” for minority, indigenous, national, endangered or ethnic, and local languages) including African languages on which I’ve been focusing. OCR is key to such work as getting out-of-print books in MINELs online. And spell-checkers are fundamental.

Voice recognition. Dvorak seems to see the glass half empty. I can’t claim to know the technology as he does, and maybe my expectations are too low, but from what I’ve seen of Dragon NaturallySpeaking, the accuracy of speech recognition in that specific task environment is quite excellent. We may do well to separate out two kinds of expectations: one, the ability of software to act as an accurate and dutiful (though at times perhaps a bit dense) scribe, and the other as something that can really analyze the language. For some kinds of production, the former is already useful. I’ll come back to the topic of software and language analysis towards the end of this post.

Machine translation. I’ve had a lot of conversations with people about MT, and a fair amount of experience with some uses of it. I’m convinced of its utility even today with its imperfections. It’s all too easy, however, to point out the flaws and express skepticism. Of course anyone who has used MT even moderately has encountered some hilarious results (mine include English to Portuguese “discussion on fonts” becoming the equivalent of “quarrels in baptismal sinks,” and the only Dutch to English MT I ever did which yielded “butt zen” from what I think was a town name). But apart from such absurdities, MT can do a lot – I’ll enjoy the laughs MT occasionally provides and take advantage of the glass half full here too.

But some problems with MT results are not just inadequacies of the programs. From my experience using MT, I’ve come to appreciate the fact that the quality of writing actually makes a huge difference in MT output. Run-on sentences, awkward phrasing, poor punctuation and simple spelling errors can confuse people, so how can MT be expected to do better?

Dvorak also takes a cheap shot when he considers it a “good gag” to translate with MT through a bunch of languages back to the original. Well, you can get the same effect with the old grapevine game of whispering a message through a line of people and seeing what you get at the end – in the same language! At my son’s school they did a variant of this with a simple drawing seen and resketched one student at a time until it got through the class. Even if MT got closer to human accuracy, you’d still have such corruption of information.

A particularly critical role I see for MT is in streamlining the translation of various materials into MINELs and among related MINELs, using work systems that involve perhaps different kinds of MT software as well as people to refine the products and feed back into improvements. In my book, “smart money” would take this approach. MT may never replace the human translator, but it can do a lot that people can’t.

Optical character recognition. Dvorak finds fault with OCR, but I have to say that I’ve been quite impressed with what I’ve seen. The main problems I’ve had have been with extended Latin characters and limited dictionaries – and both of those are because I’m using scanners at commercial locations, not on machines where I can make modifications. In other words I’d be doing better than 99% accuracy for a lot of material if I had my own scanners.

On the other hand, when there are extraneous marks – even minor ones – in the text, the OCR might come up with the kind of example Dvorak gives of symbols mixed up with letters. If you look at the amazing work that has been done with Google Patent Search, you’ll notice on older patents a fair amount of misrecognized character strings (words). So I’d agree that it seems like one ought to be able to program the software to be able to sort out characters and extraneous marks through some systematic analysis (a series of algorithms?) – picking form out of noise, referencing memory of texts in the language, etc.
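To make the idea of “picking form out of noise” a little more concrete, here is a minimal sketch in Python of one very simple approach: strip stray symbols from a scanned token and then look for the closest word in a word list for the language. This is only an illustration under stated assumptions, not how any actual OCR product works. The word list, the function name `correct_token`, and the similarity cutoff are all hypothetical, and a real system would need a full dictionary and context as well.

```python
import difflib

# Hypothetical mini word list standing in for a real dictionary of the language.
WORDLIST = ["patent", "machine", "improved", "for", "cutting", "paper"]

def correct_token(token, wordlist=WORDLIST, cutoff=0.7):
    """Return the closest word-list entry for a noisy OCR token.

    If nothing in the word list is similar enough, the token is
    returned unchanged rather than guessed at.
    """
    # Drop extraneous marks: keep only letters (this includes extended
    # Latin characters, since str.isalpha() is Unicode-aware).
    cleaned = "".join(ch for ch in token.lower() if ch.isalpha())
    if cleaned in wordlist:
        return cleaned
    # Fall back to the most similar known word, if any passes the cutoff.
    matches = difflib.get_close_matches(cleaned, wordlist, n=1, cutoff=cutoff)
    return matches[0] if matches else token

print(correct_token("pa~tent"))   # stray mark inside a known word
print(correct_token("rnachine"))  # "m" misread as "rn"
print(correct_token("xyz"))       # nothing similar, left alone
```

The limitation is the same one I ran into at commercial scanners: the result is only as good as the word list, which is exactly what is often missing for MINELs.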

In any event, enhancing OCR would help considerably with more digitization, especially as we get to digitizing publications in extended Latin scripts on stenciled pages and poor quality print of various sorts too often used for materials in MINELs.

Spell-checkers. For someone like me concerned with less-resourced languages, the issues with spell-checkers are different and more basic – so let me get that out of the way first. For many languages it is necessary to get a dictionary together first, and that may have complications like issues of standard orthographies and spellings, variant forms, and even dictionary resources being copyrighted.

In the context of a super-resourced language like English, Dvorak raises a very valid criticism here regarding how the wrong word correctly spelled is not caught by the checker. However, it seems to me that the problem would be appropriately addressed by a grammar-checker, which should spot words out of context.

This leads to the question of why we don’t have better grammar-checkers. I recall colleagues raving in the mid-90s about the then new WordPerfect Grammatik, but it didn’t impress me then (nevertheless, one article in 2005 found it was further along than Word’s grammar checker). The difference is more than semantic – grammar checkers rely on analysis of language, which is a different matter than checking character strings against dictionary entries (i.e., spell-checkers).
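A toy illustration in Python may help show the difference between the two kinds of checking. This is a rough sketch under my own assumptions, not a description of how Grammatik, Word, or any real checker works. The tiny dictionary, the set of word pairs, and the function names are all hypothetical stand-ins for what would in practice be a full lexicon and some real analysis of the language.

```python
# Hypothetical tiny dictionary; a real spell-checker would use a full lexicon.
DICTIONARY = {"i", "want", "to", "go", "over", "their", "there",
              "house", "is", "big"}

# Hypothetical word pairs "seen before", standing in for language analysis.
KNOWN_PAIRS = {("i", "want"), ("want", "to"), ("to", "go"), ("go", "over"),
               ("over", "there"), ("their", "house"), ("house", "is"),
               ("is", "big")}

def spell_check(words):
    """Flag words that are not in the dictionary (string lookup only)."""
    return [w for w in words if w not in DICTIONARY]

def context_check(words):
    """Flag correctly spelled words that fit with neither neighbor."""
    flagged = []
    for i, w in enumerate(words):
        if w not in DICTIONARY:
            continue  # misspellings are the spell-checker's job
        left_ok = i > 0 and (words[i - 1], w) in KNOWN_PAIRS
        right_ok = i < len(words) - 1 and (w, words[i + 1]) in KNOWN_PAIRS
        if not left_ok and not right_ok:
            flagged.append(w)
    return flagged

sentence = "i want to go over their".split()
print(spell_check(sentence))    # every word is spelled correctly
print(context_check(sentence))  # but one word is out of place
```

The spell-checker passes the sentence because every string is in the dictionary, while the context check singles out “their” because it has never been seen next to its neighbor. Real grammar checking is of course far harder than counting word pairs, which may be part of the answer to the question above.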

Although this is not my area of expertise, it seems that the real issue beneath all of the shortcomings Dvorak discusses is the application of language analysis in computing (human language technology). Thus some of the solutions could be related – algorithms for grammar checking could spot properly-spelled words out of place and also be used in OCR to analyze a sentence with an ambiguous word/character string. These may in turn relate to the quality of speech recognition. The problems in MT are more daunting but in some ways related. So, a question is, are the experts in each area approaching these with reference to the others, or as discrete and separate problems?

A final thought is that this “final frontier” – what I have sometimes referred to as “cutting edge” technologies – is particularly important for speakers of less-resourced languages in multilingual societies. MT can save costs and make people laugh in the North, but it has the potential to help save languages and make various kinds of information available to people who wouldn’t have it otherwise. Speech recognition is useful in the North, but in theory could facilitate the production of a lot of material in diverse languages that might not happen otherwise (it’s a bit more complex than that, but I’ll come back to it another time). OCR adds increments to what is available in well-resourced languages, but can make a huge difference in available materials for some less-resourced languages, for which older publications are otherwise locked away in distant libraries.

So, improvement and application of these cutting-edge technologies is vitally important for people and markets not even addressed by PC Magazine. I took issue with some of what Dvorak wrote in this column, but ultimately his main point is spot on in ways he might not have been thinking of.


International Year of Planet Earth

[Image: IYPE logo, registered trademark]

The International Year of Planet Earth (IYPE) is another of the several “Year” observances declared by the U.N. for 2008 (I previously mentioned the International Year of Languages [IYL], and will come to the others later). It actually runs from 2007 through 2009, but had its formal launch at UNESCO on 12-13 February 2008.

The aim of IYPE is given as: “to capture people’s imagination with the exciting knowledge we possess about our planet, and to see that knowledge used to make the Earth a safer, healthier and wealthier place for our children and grandchildren.” It is intended to “support research projects within defined themes focusing on Earth Sciences in the service of society.”

IYPE is described as “a joint initiative by UNESCO and the International Union of Geological Sciences (IUGS)” which involves “[t]welve Founding Partners, 26 Associate Partners and a growing number of International Partner organisations from all continents and representing all major geoscientific communities in the world,” as well as about 70 national committees.

The approach is explained on the IYPE website as:

The International Year of Planet Earth aims to ensure greater and more effective use by society of the knowledge accumulated by the world’s 400,000 Earth scientists. The Year’s ultimate goal of helping to build safer, healthier and wealthier societies around the globe is expressed in the Year’s subtitle ‘Earth science for Society’.

The International Year runs from January 2007 to December 2009, the central year of the triennium (2008) having been proclaimed by the UN General Assembly as the UN Year. The UN sees the Year as a contribution to their sustainable development targets as it promotes wise (sustainable) use of Earth materials and encourages better planning and management to reduce risks for the world’s inhabitants.

This is clearly a substantial and well-organized effort, with important potential benefits in terms of public awareness, organizational networking, and longer-term outcomes.

When considering IYPE and IYL, it is tempting to contrast the resources and planning, but without going into that, the differences seem to derive mainly from IYPE having had a kind of consortium in place fairly early in the process. I think this is an important lesson for the success of any “Year” observance: to have a dedicated organization that can help coordinate observance and activities. I’ll return to this topic later.

I’m also tempted to see potential connections between IYPE and IYL – how can the two themes be linked in specific ways to enrich the impact of each?


International Mother Language Day & International Year of Languages

[Image: IMLD 2008 logo]

Today is the ninth annual International Mother Language Day (IMLD), and the date of the official launch of the International Year of Languages (IYL). UNESCO also has a portal page for more info on IYL.

I’ve posted various information about IMLD and IYL in a special section of this site on Support for IYL 2008. It includes links to pages relating to IYL and IMLD on the UNESCO site, as well as a lot of other items and links.

The IYL offers the opportunity to make the case for various initiatives on language and linguistic diversity. One of the things I’m hoping for is progress towards a more effective “civil society” network linking organizations and initiatives with diverse but complementary purposes. More on that later.
