Data and the Three/Four/Five Vs
Data discussions these days, especially in the arena of 'Big Data,' tend to fixate on the three Vs: Volume, Velocity, and Variety. In other words, how much data there is, how fast it arrives, and how much it varies in type and form.
Sometimes we add a fourth V, 'Value' (how much benefit the analysis delivers). Given the enthusiasm elsewhere in this forum (and more generally) for Open Data, we clearly need to find a European language in which 'Open' or 'Free' or 'Unencumbered by nasty licences' translates into a word that begins with a 'V'!
Of the original three Vs, most attention typically goes to the first. "I have 10 petabytes of data" is somehow more interesting than "I must deal with 10 readings every nanosecond" or "I have 10 different types of data all coming at me for analysis."
Although the undue emphasis on volume is understandable (the *BIG* in "Big Data" is an assessment of size, after all), it's also unfortunate.
What can we do, in the DAA's discussions and outputs, to ensure that Velocity and Variety receive the consideration they are due?