Data are Beautiful: Data's story in grammar
Photo credit: Flickr:binaryape
A datum is a single piece of information. There are two plural forms of datum. The lesser-known form, datums, is used mostly in technical fields such as surveying and geodesy. The other plural form surrounds us both physically and figuratively. It can be big or small, right or wrong, new or old, dull or interesting. In the end it’s just a long line of 1s and 0s, stored right here or way over there. I’m talking about data.
There are even multiple ways to pronounce it.
Because “data” is the plural of “datum”, it’s technically more correct in English to say “data are”. But in the real world, “data is” is fine, especially since the word is widely treated as a mass noun. Outside of the real world, there is some debate between “data is” and “data are” (here, here or here, for example). But languages evolve, and I’m cool with that. I don’t even care if you say datas, as long as those datas are good.
According to Google’s Ngram Viewer, “data is” and “data are” now occur about equally often, after the strong decline of “data are” in the 80s and 90s. And this data ain’t from chatspeak; it’s text that was published and, hopefully, edited.
Who is more likely to still use the proper phrasing? It’s those pesky British English speakers and writers in red that are holding on. (As compared to American English, in blue.)
In all forms of English, an interesting observation stands out: When starting a sentence, the trend reverses. “The data are” and “Data are” are approximately twice as common as “The data is” and “Data is”.
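If you want to try this kind of count on your own pile of text, here is a minimal sketch. It is not how the charts above were produced. It only assumes that phrases are matched case-sensitively, which is what lets “Data are” at the start of a sentence be told apart from “data are” in the middle of one. The function name `count_phrases` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Phrases to compare; capitalized forms stand in for sentence-initial use.
PHRASES = [
    "data is", "data are",
    "Data is", "Data are",
    "The data is", "The data are",
]


def count_phrases(text, phrases=PHRASES):
    """Count case-sensitive, whole-word occurrences of each phrase in text."""
    counts = Counter()
    for phrase in phrases:
        # \b keeps "data is" from matching inside "metadata is".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, text))
    return counts


# Hypothetical sample text, not real corpus data.
sample = (
    "Data are plural. The data is here. "
    "We think data is fine, and metadata is ignored."
)
counts = count_phrases(sample)
for phrase in PHRASES:
    print(f"{phrase!r}: {counts[phrase]}")
```

Note that the shorter phrases overlap with the longer ones: every “The data is” also counts as a “data is”, just as a two-word phrase would be counted regardless of the word before it.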
I have no idea why, but I’ll list any good guesses here:
Idea 2. ________________________
Data - it’s still more popular than sex, drugs, and rock & roll.
But only in books.
Across the entire internet, sex still wins.
This article is also available on Medium.