What is the plural of data?

Your data was corrupted…

Wah! First thought: I’ve lost some work. Second thought: shouldn’t that be ‘…data were corrupted’? In the strictest sense, yes, because it’s all a question of ensuring that you match singular subjects with singular verbs, and ditto plural subjects and verbs, a process called agreement. Easy when it’s a straightforward case, such as: she is in New York; they are in London. However, the situation becomes more complex with certain types of nouns and plurals of words which entered English from other languages, such as Italian or Latin, which is where data comes in.

You might recall that I’ve blogged before about subject-verb agreement, collective nouns, and the formation of plurals of nouns borrowed from other languages. There wasn’t sufficient space for me to cover all the bases within those articles, so I’m going to dip a toe into these grammatical waters yet again.

In particular, the usage of data and media still generates debate: I reckon they’re deserving of a blog all to themselves, so here goes. What do these two nouns have in common? Firstly, they’re Latin plurals which are now firmly established in English; secondly, many people are in a quandary about whether they should be treated as singular:

Overall, the data gives some mixed messages.

The media has a role to play in informing the public debate.

or plural:

The data were gathered from the World Resources Institute.

The media are dependent on major businesses as information sources.

All the above examples were found in edited texts on the 2.5 billion-word Oxford English Corpus (OEC), showing that, even for experienced writers, the situation can be a little bewildering. I hope you’ll find it less so by the end of this piece.

Singular and plural data

This word is the plural of the Latin noun datum, which literally means ‘something given’. The first meaning of datum in English recorded by the historical Oxford English Dictionary (OED), dated 1630, is ‘an item of (chiefly numerical) information’, and the first citation is actually for the plural form, data. The singular form, datum, has always been much rarer than the plural in English: there are only 917 instances of it on the OEC, compared with 542,151 for data.

Datum is treated as an English singular noun and is found predominantly in scientific and technical contexts. Up until about the mid-20th century, data occurred in similar fields of study and was typically treated as a plural (‘…these data…’; ‘…the data are classified…’). OK, so we can all scoot off and devote ourselves to something more rewarding, yes? Well, no. Nowadays, data has two main meanings:

  • facts and statistics collected for reference or analysis: overall, the data gives some mixed messages. 
  • the quantities, characters, or symbols on which operations are performed by a computer: Internet 2 continues to break records for transmitting data. 

The word data in English usage has evolved: a mass noun use, recorded in the OED from the 18th century, has become increasingly common over the past 70 years, particularly in computing and general contexts. Mass nouns are those which can’t be counted (for instance, happiness, concrete, warmth) and they are always accompanied by a singular verb. Data is now treated in the same way as its near-synonym, the mass noun information. This is well established and generally accepted in standard English writing:

Data was collected over a number of years.

However, you should be aware that not everyone concurs with this: some authorities and organizations take a stricter line than others. For instance, the most recent edition of Pocket Fowler’s Guide to Modern English Usage accepts this mass noun use in general and computing contexts. The US grammar expert Bryan Garner, however, writing in Garner’s Dictionary of Legal Usage, states: ‘The Oxford Guide allows the singular use of data in computing and allied disciplines; whether lawyers own computers or not, they should use data as a plural’.

It also depends which sphere you’re active in: in scientific and technical writing, you’re more likely to find datum in the singular and to see data treated as a plural, but in general usage, many people now agree that data can take a singular verb. The best advice is that it’s always prudent to check whether your college or work organization has a style guide which rules on contentious grammar issues such as this before putting pen to paper or fingers to keyboard.

A final point: once you’ve opted for treating data as either singular or plural, check the rest of your writing. It’s good practice not to use data with a singular verb in one place and a plural one elsewhere, and to make sure everything else matches up grammatically too. For instance, are the following sentences good English, and if not, why?

This data were matched to a geographically based model.

There is many data of this type.

The answer is no: if data has a plural verb, as in the first example, then you should use the plural determiner these, not the singular this. Likewise, if you treat data as a mass noun with a singular verb as in the second case, then you can’t say many. Many can only be used with countable nouns (many children; many tables), so you need to use much instead, the correct determiner for mass nouns:

✓ These data were matched to a geographically based model.

✓ There is much data of this type.
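If you’d rather let a script do a first pass, the consistency check described above can be roughly sketched in Python. This is only an illustration, not a real grammar checker: the cue-word lists, the function names (data_agreement, is_consistent), and the choice to look at the two words before data and the one word after it are all assumptions of mine, and the approach will miss or misjudge plenty of real sentences.

```python
import re

# Hypothetical, far-from-exhaustive cue words (an assumption for this sketch).
SINGULAR_CUES = {"is", "was", "has", "gives", "this", "much"}
PLURAL_CUES = {"are", "were", "have", "give", "these", "many"}


def data_agreement(text):
    """Collect singular and plural cue words found next to 'data'.

    Looks at the two words before each 'data' (verb or determiner)
    and the one word after it (verb). Returns (singular, plural) lists.
    """
    words = re.findall(r"[a-z]+", text.lower())
    singular, plural = [], []
    for i, word in enumerate(words):
        if word != "data":
            continue
        neighbours = words[max(i - 2, 0):i] + words[i + 1:i + 2]
        for neighbour in neighbours:
            if neighbour in SINGULAR_CUES:
                singular.append(neighbour)
            elif neighbour in PLURAL_CUES:
                plural.append(neighbour)
    return singular, plural


def is_consistent(text):
    """True unless both singular and plural cues appear around 'data'."""
    singular, plural = data_agreement(text)
    return not (singular and plural)


# The two faulty sentences mix singular and plural cues:
print(is_consistent("This data were matched to a geographically based model."))  # False
print(is_consistent("There is many data of this type."))  # False
# The corrected versions do not:
print(is_consistent("These data were matched to a geographically based model."))  # True
print(is_consistent("There is much data of this type."))  # True
```

Run over a whole document rather than a single sentence, the same function would also flag a text that treats data as singular in one place and plural in another, which is exactly the inconsistency to avoid.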

The development of alternative plural forms: medias and datas

Finally, a slight digression. Are your grammatical hackles raised by the following sentences?

I was looking for technical datas.

My grandfather took a data from his own excellent heart.

I have to admit that I looked askance at these when I first researched them in the OED and elsewhere! However, if you consider data and the related Latin-derived noun media to be singular, I guess there’s a certain logic behind the impulse to use them with the indefinite article a, and to create a plural form with the regular English -s ending.

The OED’s evidence shows that the plurals datas (first recorded in 1645) and medias (appearing in 1927) have a long history and crop up in varied sources, from scientific and academic journals to The Times. This doesn’t mean that they’re accepted by everyone today, though. Some authorities sit very firmly in the ‘anti’ camp, with Pocket Fowler stating ‘above all, never write a media or the medias’ and Bryan Garner remarking: ‘medias, which has recently raised its ugly head, can only be described as illiterate’. So my advice would be to handle these forms with care, unless you know they’re approved in the context within which you’re working.

A handy summary

Data can take either a singular or plural verb in standard English, but be consistent within a piece of writing, always check the style policy of your organization, and make yourself familiar with the grammatical debate that exists around it.

