How can World Englishes benefit from crowdsourcing?

Crowdsourcing is one of the biggest Internet buzzwords today. Believed to have been coined by a US journalist in 2006, the word refers to the practice of accomplishing complex tasks by enlisting the help of a large number of people. Using the Internet to harness the collective intelligence of a crowd has proved successful in a wide variety of contexts, from software design and genetic research to documentary film-making and crime investigation.

The OED: crowdsourcing since 1857

The word crowdsourcing may be new, but the idea behind it is not, at least not in lexicography. In fact, the entry for crowdsourcing in Wikipedia (itself a stellar example of an effective crowdsourcing model) gives the Oxford English Dictionary (OED) as one of the earliest predecessors of today’s largely Internet-based crowdsourcing projects.

Much of the historical and lexical information contained in the OED is based on the evidence of millions of quotations collected from English texts through the dictionary’s Reading Programme. Through this programme, the OED recruits voluntary and paid readers to gather quotations that illustrate the usage of words.

The OED Reading Programme started in 1857, when volunteer readers began to collect quotations for the British Philological Society’s planned New English Dictionary. Two decades later, the dictionary’s new editor, James A. H. Murray, launched a broader Reading Programme by publishing an appeal for volunteer readers, not only in Britain, but also in America and the British Colonies. “Anyone can help,” Murray wrote in his 1879 appeal, and soon after, he began receiving thousands of quotations from hundreds of volunteers, most of whom were interested laypeople rather than language specialists.

The impact of technology

The advent of electronic technology and the Internet has greatly facilitated user participation in dictionary making. The OED Reading Programme is still in existence, except that now, instead of being written on slips of paper and mailed to Oxford, quotations are keyed directly into a database where they can be quickly accessed by OED editors. A number of dictionaries of English and of other languages have also begun to experiment with different ways to involve the public in content creation. User participation can be direct, as when a dictionary asks people to add new entries or make modifications to existing entries. Users can also contribute indirectly to dictionaries by giving feedback via email or web forms, or providing lexical evidence that dictionary editors can use, as people continue to do through the OED Reading Programme and the OED Appeals website, where the public can submit earlier records of particular words being researched by the dictionary’s editors.

Gathering data for World Englishes

One group of language varieties that can potentially benefit from crowdsourced lexicography is World Englishes. A major stumbling block to the balanced and comprehensive coverage of less widely used varieties of English in general-purpose dictionaries is the lack of textual evidence for them: published material in postcolonial English-speaking countries such as India, Singapore, and the Philippines still follows British or American standards, and there are as yet very few language corpora for these varieties that are large enough for lexicographical research. The crowdsourcing model can help solve this problem by making it possible to gather word data for World Englishes directly from their speakers.

Such a model is currently being used for the new Pinoy English Community Dictionary (PECD), an OUP-sponsored project to collect examples of Philippine English usage. The project will create a public, online glossary of the variety of English spoken in the Philippines, as well as help OUP identify Philippine words and expressions that can be included in Oxford’s dictionaries: not only the OED but also its dictionaries of current English. The Pinoy English Community Dictionary, a preliminary version of which is now accessible online, is an Internet platform where users of Philippine English can suggest new words and senses, and even create entries that will be searchable and viewable by the general public.

The Pinoy English Community Dictionary is also supported by a dedicated Facebook page and Twitter feed, where calls are regularly put out to the public to provide evidence for certain expressions or semantic fields. This evidence can be in multimedia form, such as photographs, audio, or video showing Philippine English as it is used by native users in local contexts.

Engaging with the Pinoy English Community Dictionary

By linking its online platform to social media sites, the project will provide two ways in which the community can engage with the dictionary. Those particularly interested in lexicography can participate directly by suggesting, writing, and modifying entries for the dictionary website. Recent additions to the dictionary range from local food borrowings (ube, a native root crop known for its purplish hue; bagoong, a fermented shrimp or fish condiment) to common abbreviations (ref for refrigerator; aircon for air conditioner) and unique word derivations (monthsary, monthly celebration of the start of a relationship; carnapping, the act of stealing a car).

The public can also contribute indirectly by submitting evidence through social media sites that they are already familiar with. Information gathered in this indirect way is no less important than actual user-written entries: for instance, a post on the PECD Facebook page asking followers for their favourite Filipino dessert produced not only a long list of interesting new words, but also information on spelling variants (halo-halo, a dessert of crushed ice and milk mixed with fruits and other sweets, can also be spelled haluhalo, halu-halo, and halo halo; macapuno, a syrupy dessert made from a rare local variety of coconut, can also be spelled with a k).

In its early stages, this collaborative approach seems to be a promising way of combining the expertise of OED editors with the insider knowledge of local informants. It will also be interesting to see whether the community dictionary can do for Filipinos what James Murray was able to do for the British and American public with his 1879 Appeal: demonstrate that lexicography is not an activity limited to Oxford scholars, and that Filipinos themselves, as users of Philippine English, can contribute to making a valuable resource for their language.

