Tongue twisters - digitizing human languages
© Copyright 1994-2002, Rishab Aiyer Ghosh. All rights reserved.
Electric Dreams #29

Humans do not follow an ISO standard; nor do their hundreds of languages. Unlike computers, which have - rational creatures that they are - well-defined protocols governing their interactions with each other and well-defined techniques for otherwise incompatible machines to communicate, humans are chaotic. Knowledge produced by people has no uniformity. Limited by the particular language in which it is expressed, knowledge cannot be universally accessed.

As long as agriculture ruled the world, this limitation of human communication was not important. Rice looks like rice, and three gold coins remain three in German as in Swahili. Industrial economies brought complications. Their somewhat greater dependence on trade over large distances required translators who understood buyer and seller. But the end-user of Japanese cars does not need to speak Japanese, though those on the factory floor may speak nothing else. Industrial economies are based on a number of intermediaries, with the effect that differences in language between the producer and consumer become minor. Besides, cars have nothing to do with grammar.

An information economy is drastically different. Though it has a greater dependence on global trade than does industry, it also has the close ties between actual producers and consumers found in traditional (pre-industrial) agricultural economies. Current mainstream sources of information, such as the press, use a model of extensive filters and editorial control between the real sources and the readers. More suited to an information economy is the increasingly common newsgroup model on the Internet, where producers and consumers of information interact directly, often changing places. Unlike cars, the goods being traded are usually extremely dependent on grammar, vocabulary and the various nuances of linguistic expression.

Moreover, unlike trade in industrial economies, a true information economy requires that the final consumer and the original producer understand each other. This is difficult in today's world, unless they speak the same language. This is one reason why a huge majority of cyberspace works in English; while there are significant populations of users in France, Germany and the Scandinavian countries, they tend to form their own communities separate from each other and the larger English-speaking one. If the language barriers to trade in information remain, then these communities will develop into independent economies, along with those in Japan, East Asia and perhaps a dozen in India itself. So much for the global village.

Naturally, believers in the Information Dream don't accept this view of the future. An information society is most appealing when it transcends boundaries of nationality and community. This can only happen when it also transcends the boundaries of language.

One absurd solution - what I call the 'world government' approach - is that everyone should speak the same language. This is impractical, and definitely against the spirit of free expression.

More sensible would be a technological approach. Computers may be a long way from true understanding of natural language - but they can translate despite that. To begin with, a simple, automatic translator could just use a dictionary of word equivalents - like the amazing Japanese photocopying machine that, given a sheet of English, outputs a crude translation in Japanese. Dictionaries can be replaced by phrase books, then limited grammars and so on. The point is that everything should be automatic and as transparent as possible; that the linguistic divide, even if still present, should be reduced from a wall to a fence.
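
The progression from dictionary to phrase book can be made concrete with a toy program. What follows is only a sketch under stated assumptions: the function name translate, the tables WORD_DICTIONARY and PHRASE_BOOK, and the handful of English-to-German entries are all invented for illustration, and no real translation system is this simple. It substitutes word for word, and, when given a phrase book, prefers a whole-phrase match over its individual words.

```python
# Toy word-for-word translator with an optional phrase book.
# Every name and vocabulary entry here is a hypothetical illustration,
# not part of any real translation system or lexicon.

WORD_DICTIONARY = {
    "rice": "Reis",
    "three": "drei",
    "gold": "Gold",
    "coins": "Münzen",
}

PHRASE_BOOK = {
    ("gold", "coins"): "Goldmünzen",
}


def translate(text, words=WORD_DICTIONARY, phrases=None):
    """Translate token by token, preferring the longest phrase-book match."""
    phrases = phrases or {}
    tokens = text.lower().split()
    # Longest phrase (in words) that the phrase book could match.
    longest = max((len(key) for key in phrases), default=1)
    output = []
    i = 0
    while i < len(tokens):
        # Try multi-word phrases first, longest to shortest.
        for size in range(min(longest, len(tokens) - i), 1, -1):
            key = tuple(tokens[i:i + size])
            if key in phrases:
                output.append(phrases[key])
                i += size
                break
        else:
            # No phrase matched: fall back to the word dictionary,
            # leaving unknown words untranslated.
            output.append(words.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(output)
```

With the dictionary alone, translate("three gold coins") gives the crude "drei Gold Münzen"; passing phrases=PHRASE_BOOK gives "drei Goldmünzen". A word missing from both tables simply passes through unchanged, which is the fence rather than the wall: the reader still has to climb, but not as far.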

The Internet exemplifies the basic rights of an information society - to express any idea, to receive ideas, and to find and communicate with those with similar ideas. When technology provides convenient ways of communicating across a language barrier, then perhaps knowledge will truly be universal.

