In the past decade, there’s been a steady breakdown of democracy in America.
Between Donald Trump’s demagoguery, ongoing government-sanctioned mass surveillance, and an electoral process dictated by media and money, even the concept of democracy is dangerously open to interpretation.
Data, that lauded harbinger of mathematical accuracy and scientific truth, seems like a simple way to pierce through the deception.
Data analytics, after all, has already changed various facets of the public sector: how police resources are allocated, how public health threats are contained and even how parking tickets are issued. With regulators exposing the inner workings of policy via open government initiatives, it’s just a matter of time until the public regains power in the political process.
Or is it?
Most people seem to hold the view that data is somehow closer to “truth”. This framing is so pervasive that the term “data-driven” has become synonymous – almost implicitly – with objective. Yet any practitioner will tell you that data is always messy, and that a large number of assumptions are baked into every number.
More significantly, however, data is inert.
Many fail to realize that data is merely an ingredient. It needs a statistical model to give it a narrative – and that’s where the real power lies.
When the federal government reports 4.9 percent unemployment and Donald Trump claims “as high as 35 percent,” they’re using vastly different statistical models, but still pointing to data as a proof point.
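Much of the gap between those two figures comes down to which definition of “unemployed” the model uses. The sketch below uses hypothetical labor-force counts (not actual BLS figures) to show how a headline-style rate, a broader underemployment-style rate and a maximally expansive framing can all be computed from the same underlying data:

```python
# Illustrative only: hypothetical labor-force counts in millions, not BLS data.
employed = 151.0
unemployed_searching = 7.8   # actively looked for work recently
marginally_attached = 1.7    # want work but have stopped searching
part_time_economic = 6.0     # part-time workers who want full-time jobs
not_in_labor_force = 86.0    # retirees, students, caregivers, etc.

# Headline-style rate: active searchers over the official labor force.
labor_force = employed + unemployed_searching
headline = unemployed_searching / labor_force

# Broader rate: also count discouraged and underemployed workers.
broad_pool = unemployed_searching + marginally_attached + part_time_economic
broad = broad_pool / (labor_force + marginally_attached)

# Most expansive framing: every adult without a full-time job, including
# people not looking for work at all -- one route to a "35 percent"-style claim.
adults = labor_force + marginally_attached + not_in_labor_force
expansive = (adults - employed) / adults

print(f"headline:  {headline:.1%}")   # roughly 4.9%
print(f"broad:     {broad:.1%}")      # roughly 9.7%
print(f"expansive: {expansive:.1%}")  # roughly 38.7%
```

Same raw counts, three defensible models, three very different headlines.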
That is the biggest problem with data and democracy.
Political narratives are self-serving and incomplete in the best of circumstances, but at least most people are aware of the potential biases and subjectivity. Saying that a political position is “backed by data,” however, lends it an air of credibility.
In reality, many layers of statistical processing may refract the narrative in one direction or another.
People on the left may cherry-pick data and set up analyses that prove the American dream is being crushed under the weight of a systemically unequal playing field. Bernie Sanders’ claim that corporate income tax receipts dropped from 30 percent of federal revenue in the 1950s to 11 percent last year insinuates that corporate tax breaks are the sole factor behind the dip, when in fact reduced corporate profitability and a trend toward different business structures are also responsible.
People on the right might use their own, separate analyses to prove that America’s economic woes mostly stem from over-regulation and an inefficient government.
When Ben Carson said, “last year, there were an additional 81,000 pages of government regulations,” the equivalent of a “three-story building,” he ignored the fact that many of those pages don’t actually contain regulations.
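This kind of cherry-picking is often as simple as choosing a baseline. In the sketch below, the 1950s and last-year endpoints echo the figures quoted above, while the intermediate values are invented for illustration; the same series supports a story of dramatic collapse or modest recovery depending on which start year you pick:

```python
# Hypothetical share of federal revenue (percent) by year. The 1950 and 2015
# endpoints mirror the figures quoted in the article; the rest are made up.
share_by_year = {1950: 30, 1970: 17, 1990: 9, 2000: 10, 2010: 9, 2015: 11}

def change(series, start, end):
    """Percentage-point change between two chosen baseline years."""
    return series[end] - series[start]

# Pick a 1950s baseline and the trend is a dramatic collapse...
print(change(share_by_year, 1950, 2015))  # -19 points

# ...pick a 1990 baseline and the very same data shows a modest rise.
print(change(share_by_year, 1990, 2015))  # +2 points
```

Neither number is false; each baseline simply serves a different narrative.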
Data, by itself, may simply add fuel to the fire of the “he said/she said” game. Policymakers use a specific statistical lens to create an analytical narrative that sells – one that is interesting and different, but not too uncomfortable. That muddies the message in the data.
Ideally, policymakers would look for the truth, but the truth is time-consuming to find, complicated and difficult to communicate. So how do we balance the need for simple communication with the truth in data, and use that data to move forward as a democratic society?
Here’s where to start.
Using data to nudge democracy
As a first, crucial step in achieving greater democracy, policymakers and data scientists need to develop a set of practices and principles for data processing that everyone can agree upon. This will reduce the number of places for people to point fingers and spin narratives.
In the same way that the National Parks Service exists to protect public land and NAFTA governs trade in North America, data, too, should come with a set of industry best practices.
The government now has a chief data scientist, and he is well situated to champion the creation of guidelines around data governance and processing. Ideally these would draw from industry best practices and be built on widely available open source tools.
Policymakers and data scientists should also work together to find new ways to empower citizens to make their own decisions. It is critical that data of the people can be analyzed by the people, so that we can ensure our government is working for the people.
What good is having the most accurate, open and transparent set of data of any government in the world (and we have all these great databases available) if only 0.01 percent of the population can do the statistical modeling necessary to turn that data into something meaningful?
The path we’re on now, in which only a small segment of society can work with data, leads to an undemocratic end result.
If this exponential growth of data simply creates an analyst class who are empowered to understand and shape the data-quantified world, then the chasm between the statistically literate and illiterate becomes ripe for political and economic exploitation.
In a sense, the popular view of data scientists as rare unicorns and saviors, singularly empowered to bring the glories of data to the masses, is misleading and undemocratic. In a world suddenly awash with data, we need to view statistical thinking and data science as the equivalent of literacy, with all that that implies.
Fuel democracy with data analytics
Most people at this point are aware that the amount of data in the world is growing exponentially. We also all know that information is power. But right now, the power of that data is in the hands of those who can analyze it and use it to mold narratives.
The only hope for a democratic society is to make sure the tools and knowledge of basic data science are accessible to all.
The current revolution in computational science, open source data analytics and predictive analytics is giving the average data scientist unprecedented abilities – like helping bring human traffickers to justice and making sure that nearly every medical treatment created in a lab eventually finds a use.
Imagine if even more people could be connected with data and given similarly disruptive abilities.
This isn’t just pie-in-the-sky thinking. It should be an imperative.
As members of a democratic society, it is our duty to pressure legislators to establish best practices around data and make it more accessible to everyone. Local representatives are a good place to start. Most government CIOs lag behind the private sector when it comes to data — 47 percent of CIOs state they “have a long way to go when it comes to enterprise data and governance,” and 49 percent indicate “some progress in developing operating discipline for data” according to NASCIO’s 2015 State CIO Survey.
The sooner these IT departments grow their data governance efforts, the sooner they can help voters benefit from transparency and data accessibility.
It’s also worth following the Electronic Frontier Foundation’s Transparency Project and proposed changes to the Freedom of Information Act, both part of ongoing efforts to stop the federal government from siloing and controlling data in ways that run against the public good.
Another place to apply pressure is the school system.
The 2016 National Education Technology Plan focuses on “future-ready learning,” and exposing students to analytics early on should be a key component of that effort.
Comfort with statistics can be built outside of textbooks, through digital and board games. Districts should apply disruptive thinking to how they teach math, because the math needed today is not the math taught in schools – and tomorrow, the difference will be even more pronounced.
All of this boils down to a single, actionable truth: Data can only save democracy if we let it.
Peter Wang | Peter is the CTO and Co-Founder of Austin-headquartered Continuum Analytics, where he focuses on creating the foundational tools for the future of data processing, analytics, and visualization.