28 million words, one corpus, and thousands of fascinating insights
Were you ever told as a child to ‘stop daydreaming’ and pay attention? Then you will be interested to know that daydreaming is a word that is invariably used in a negative context by adults but in a much more positive sense by children. Examples from the Oxford English Corpus (a vast electronic collection of texts used by Oxford’s lexicographers to craft accurate definitions for their adult dictionary range) show that daydreaming is considered a ‘distraction’. In the Oxford Children’s Corpus, on the other hand, children happily daydream about going to fantastic places and having amazing adventures.
What is the Oxford Children’s Corpus?
The Oxford Children’s Corpus is the first of its kind: a language database of over 28 million words of writing for 5- to 14-year-olds. It is made up of children’s fiction, non-fiction, print and web material, classic and modern texts, and a growing section of writing by children. It is used by lexicographers, publishers, teachers, and scholars, and provides insights into the essential features of language for children in comparison with language for adults. Since 2006, lexicographers at Oxford University Press have used it to help compile dictionaries for children.
Beetles, sweets, blobs, and Mum: examples from fiction
Does language written for children differ substantially from language written for adults? Let us look at some examples gathered by comparing the Oxford Children’s Corpus with the Oxford English Corpus, which is used to create the adult range of dictionaries.
When looking for an illustration of the word blob, an adult might think of a blob of toothpaste, whereas the Oxford Children’s Corpus will provide a description of warm rice pudding with a blob of strawberry jam on top, or talk of a gurgling, squelchy blob. You are at least three times more likely to come across Grandad, Mum, and Dad in children’s fiction than in adult fiction, and at least ten times more likely to find toffee and sweets. Nature words such as headland, beetle, moor, creepers, bog, and twigs are much more frequent in the Oxford Children’s Corpus than in the Oxford English Corpus. Colourful verbs like peeped and spluttered also occur more frequently in children’s fiction, along with adverbs such as crossly and doubtfully. Our lexicographers find a children’s corpus particularly useful for choosing appropriate example sentences, as well as for establishing or refining the list of words to include in children’s dictionaries and thesauruses.
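For readers curious about how a claim such as ‘three times more likely’ can be worked out when two corpora differ in size, here is a minimal sketch in Python. It assumes the usual approach of converting raw counts to occurrences per million words before comparing them; the function names and the counts are invented for illustration and are not figures from either Oxford corpus, nor a description of the tools Oxford’s lexicographers use.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw word count to occurrences per million words."""
    return count * 1_000_000 / corpus_size


def frequency_ratio(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Return how many times more frequent a word is in corpus A than in corpus B.

    count_b must be greater than zero, otherwise the ratio is undefined.
    """
    return per_million(count_a, size_a) / per_million(count_b, size_b)


# Hypothetical counts, for illustration only (not real corpus data):
# a word seen 300 times in 1 million words of children's fiction
# and 1,000 times in 10 million words of adult fiction.
ratio = frequency_ratio(count_a=300, size_a=1_000_000,
                        count_b=1_000, size_b=10_000_000)
print(f"{ratio:.1f} times more frequent in corpus A")
```

Normalising first matters because a larger corpus will contain more occurrences of almost any word simply by virtue of its size.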
Drug barons or Norman barons?
The Oxford Children’s Corpus is vital in helping our lexicographers to reflect changes in language and appropriate contexts in our dictionaries for children. The same word often has different contexts or usages in children’s language and adult language. Take the word baron, for instance. While the Oxford Children’s Corpus will give you wealthy barons and Norman barons, the Oxford English Corpus has an interesting selection of drug barons and media barons. Similarly, the word hand appears in the Oxford Children’s Corpus mostly in its literal sense, as the body part, whereas the Oxford English Corpus has a wider range of extended and figurative uses, e.g. dismiss/reject out of hand, suffer at the hands of, play into someone’s hands, the task/topic at hand.
The Oxford Children’s Corpus is one of a kind in tracking the language of children today and giving us many typical and entertaining examples of language written for children. It highlights how children perceive the world differently from adults. So the next time you find yourself daydreaming, why not take a childlike approach to it and indulge in a dream about fantastic places and amazing adventures, with lashings of toffee, beetles, creepers, moors, and sweets from Grandad!