TLDR: We should always be working to improve our specific data. But let’s keep in mind that the process itself builds our general knowledge of the world. And the real solution is to increase the number of people viewing and contemplating the data.
Nate Silver recently said: “We’re not that much smarter than we used to be, even though we have much more information — and that means the real skill now is learning how to pick out the useful information from all this noise.” How true is that statement, really? We might not have reached our capacity for knowledge or wisdom, but we are smarter, at least by the measures that matter to big data. So while it’s true that “marketers might know less about individual consumers than they think,” maybe that’s a good thing. Maybe understanding the abstract consumer is better, politically, economically, and ethically, than knowing everything about every individual consumer. Not just because of information overload, which is a real concern, but also because we don’t necessarily need that level of accuracy to build general and universal knowledge about our society and species as a whole.
That’s why, although professional concerns, and wanting clients to have the most accurate and up-to-date data they can, remain priorities, we are not necessarily panicking at the continuous flow of anecdotes and surveys indicating we’re in a “bad data epidemic.” Take the recent Deloitte University study of U.S. professionals: a little over a hundred professionals were asked to privately review their own data (that is, the information about them that others have extracted), and 71 percent overall found substantial inaccuracies, with a high of 84 percent inaccuracy for economic data and a low of 41 percent for “home” data. Sure, that’s concerning when the goal is reaching specific customers, but even inaccurate specific data (wrong addresses, an age off by a year, miscalculated profits) may aid in (or at least not overly hinder) the generation of metadata for purposes like consumer analysis, AI, and some kinds of sociological research.
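To make that intuition concrete, here is a toy simulation in Python. Every number in it is invented, and it has nothing to do with the Deloitte methodology; it simply shows how records that are individually “wrong” can still produce aggregate statistics that are nearly exact, which is the property metadata-level analysis depends on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "true" annual incomes for 100,000 hypothetical consumers.
true_income = rng.lognormal(mean=10.8, sigma=0.5, size=100_000)

# Simulate record-level inaccuracy: each stored value carries a random
# multiplicative error, so most individual records end up noticeably wrong.
error = rng.normal(loc=0.0, scale=0.10, size=true_income.size)
recorded_income = true_income * (1 + error)

# Share of records that are off by more than 5 percent.
share_wrong = np.mean(np.abs(recorded_income - true_income) / true_income > 0.05)

print(f"records off by more than 5%: {share_wrong:.0%}")        # roughly 60%
print(f"true mean income:     {true_income.mean():,.0f}")
print(f"recorded mean income: {recorded_income.mean():,.0f}")   # nearly identical
```

The caveat is that this only works when the errors roughly cancel out; systematically skewed data would distort the aggregates as well as the individual records.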
What inaccurate data can’t generate is good choices for those whose primary use of data is personal contact and the maintenance of contact-based relationships. So the issue isn’t so much that data fails because it’s so often inaccurate; it’s that the “art” of metadata and the “science” of work that relies on specific accuracies and the ability to append diverge, at least to a degree.
For those who rely on strict accuracy, the math is sobering: on average, six percent of annual revenue is lost to bad data. That’s bad for consumer-driven work, where small margins can keep a business in business or kick it to the curb. It’s bad for political work, where a candidate needs every extra vote they can get in a tight citywide or district race. In the world of metadata and broad analysis, by contrast, people are more “fungible” than they are in tight races and competitive markets.
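A back-of-the-envelope calculation, with figures invented purely for illustration, shows why a six percent revenue leak can be existential when margins are thin:

```python
# Hypothetical thin-margin business; every figure here is made up.
annual_revenue = 10_000_000     # assume a $10M top line
operating_margin = 0.05         # assume a 5% margin, i.e. $500K of profit
bad_data_loss_rate = 0.06       # the ~6% average revenue loss cited above

profit = annual_revenue * operating_margin
lost_revenue = annual_revenue * bad_data_loss_rate

# If costs are largely fixed, the lost revenue comes straight out of profit.
print(f"profit at full revenue:   ${profit:,.0f}")                  # $500,000
print(f"revenue lost to bad data: ${lost_revenue:,.0f}")            # $600,000
print(f"profit after the leak:    ${profit - lost_revenue:,.0f}")   # a net loss
```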
In fact, since we are always (hopefully) working to improve the specific, objective accuracy of our data, we might also consider how to improve its general insights and usefulness. One way to do this is to further “democratize” data, increasing the number of “eyes” on it, from collection to integration to meta-analysis. Democratized data gives us the best kind of accuracy: accuracy that serves the people.
To determine how democratic data actually is, we can ask questions like: can everyone locate the data? Recently, Wichita State University wanted better measures of student satisfaction, retention, and dropout rates. To get them, a team of analysts created a “data fabric architecture” eliminating “silos across departments” and providing “a single view of university data regardless of its location.” The work also made it easier for everyone to report new data. The result? “WSU got to the root cause of why students leave, deployed mitigation plans and increased the numbers of alumni who graduated. The agile, scalable AI and analytics platform enables further examination of student and operational trends.” The more people who can see the data, particularly people with a stake in the knowledge, the more likely we are to capture both the short-term detail and the long-term wisdom of its conclusions.
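The article doesn’t describe how WSU actually built its platform, but the core idea of a “single view” across silos is easy to sketch. Here is a minimal, hypothetical illustration in Python, with made-up department tables and column names, joining registrar, housing, and advising records on a shared student ID so that cross-department questions become a single query:

```python
import pandas as pd

# Hypothetical department silos; the real WSU schema is not public.
registrar = pd.DataFrame({
    "student_id":       [1, 2, 3],
    "enrolled":         [True, True, False],
    "gpa":              [3.4, 2.1, 3.8],
})
housing = pd.DataFrame({
    "student_id":       [1, 2, 3],
    "on_campus":        [True, False, False],
})
advising = pd.DataFrame({
    "student_id":       [1, 2, 3],
    "visits_last_term": [2, 0, 5],
})

# A "single view": one joined record per student, regardless of which
# department originally held each field.
single_view = (
    registrar
    .merge(housing, on="student_id")
    .merge(advising, on="student_id")
)

# A cross-department question in one query: enrolled students with a low GPA
# and no recent advising contact.
at_risk = single_view[
    single_view["enrolled"]
    & (single_view["gpa"] < 2.5)
    & (single_view["visits_last_term"] == 0)
]
print(at_risk)
```

However the real system is implemented, the design point is the same: once the silos are joined into one view, anyone with access and a stake in the outcome can ask retention questions that previously required three departments and three separate data requests.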