The big data era has created valuable resources for public interest outcomes, like health care. In the last 18 months, the speed with which scientists were able to respond to the covid-19 pandemic, faster than to any other disease in history, demonstrated the benefits of gathering, sharing, and extracting value from data for a wider good. Access to data from 56 million National Health Service (NHS) patients’ medical records enabled public health researchers in the UK to provide some of the strongest data on risk factors for covid mortality and features of long covid, while access to health records sped up the development of lifesaving medical treatments like the messenger-RNA vaccines produced by Moderna and Pfizer.
But balancing the benefits of data sharing with the protection of individual and organizational privacy is a delicate process—and rightly so. Governments and businesses are increasingly collecting vast amounts of data, prompting investigations, concerns around privacy, and calls for stricter regulation. “Data increasingly powers innovation, and it needs to be used for the public good, while individual privacy is protected. This is new and unfamiliar terrain for policymaking, and it requires a careful approach,” wrote David Deming, professor and director of the Malcolm Wiener Center for Social Policy at the Harvard Kennedy School, in a recent New York Times article.
A growing number of startups—some 230 and counting, according to Data Collaboratives—are helping to empower citizens, nonprofit groups, and governments to gain more control over their data. They are adopting legal and institutional structures like data trusts, cooperatives, and stewards to provide people and organizations with a means of securely collecting and effectively using relevant data—and in the process, taking on Big Tech’s control of the data economy.
“The relationship between data and society is fundamentally broken,” says Matt Gee, CEO of Brighthive, which helps networks and organizations set up alternative governance models including data trusts, data commons, and data cooperatives. “We think it should be more collaborative instead of competitive, it should be more open and transparent, it should be more distributed and democratic instead of monopolistic. This is how we make the gains more equitable and reduce harmful biases in data.”
Access and control
As demonstrated by the pandemic, medical research and public health planning can be enriched by access to electronic health records, prescription and medicines data, and epidemiology. But health data are also highly sensitive, with understandable public scrutiny over efforts to share them. So-called “secondary use,” which applies personal health information for uses outside health-care delivery, requires a new governance framework.
Findata is an independent authority in the Finnish Institute for Health and Welfare, established by a government act in May 2019. The agency facilitates researchers’ access to Finnish health data, issuing permits for use or responding to specific statistical requests. In so doing, it aims to protect the interests of citizens while also appreciating the value that their data could offer to medical research, teaching, and health planning.
Prior to the formation of Findata, it was costly and complex for researchers to access this vital research resource. “The purpose of this agency is to streamline and secure the use of health data,” explains Johanna Seppänen, director of Findata. “Before, if you wanted to have data from different registers or hospitals, you had to apply for data separately from each data controller, and there were no standard ways of handling them, no ways to determine prices. It was very time-consuming, difficult, and confusing.”
Findata is the only agency of its kind so far, but it might inspire other countries that want to realize more value from health data in a safe and secure way. The UK’s NHS recently faced pushback from privacy campaigners over reforms to improve data sharing for public health planning, showing the challenges that can come from attempts to change data collection and sharing protocols.
Empowerment and autonomy
Helping disenfranchised individuals and groups has been another focus area for new data governance organizations. Data stewards, which range from community-based collectives to public or private organizations, serve as “both intermediaries and guardians during the exchange of data, thereby supporting individuals and communities to better navigate the data economy and better negotiate on their data rights,” says Suha Mohamed, strategy and partnerships associate at Aapti, an organization working at the intersection of technology and society with a focus on data rights.
One example of where data stewards can prove useful is for individuals in the gig economy, a fast-growing labor market that has been characterized by the prevalence of short-term contracts or freelance work, as opposed to permanent jobs, and has been rife with power inequalities. “Asymmetric control of data is one of the primary levers of power that gig platforms use to manage their workforce and shape the narrative and public policy in the arena that they operate in,” says Hays Witt, co-founder and CEO of Driver’s Seat, a driver-owned data cooperative specializing in ride-hailing. “Very few stakeholders have access to the data they need to engage in productive and constructive ways, starting with gig workers themselves. Our premise [at Driver’s Seat] is: let’s use tech and a data cooperative to empower gig workers to collect, aggregate, and share their data.”
Driver’s Seat has developed a proprietary app through which workers can submit their location, work, and earnings information, which is then aggregated and analyzed. Drivers then receive insights that help them understand their real earnings and performance, informing their choices about where, when, on what platforms, and on what terms to work. Driver’s Seat is developing tools that can tell drivers their average real pay across platforms in their city, compare their pay with averages, and tell them whether their pay is going up or down. All of this could help drivers move to platforms that offer them a better deal, empowering what is an otherwise atomized labor force.
“Our drivers are really excited to be engaged, because their day-to-day experience is seeing metrics, fed back to them by the platforms, that they don’t trust,” says Witt. “They know that the metrics are influential, their day-to-day experience is totally mediated by data. It impacts their earnings and their life, and they know it.” Witt believes that in the future, workers will increasingly be able to contribute to crowdsourced information to develop “collective analyses of their problems, which means they can put forward collective policy solutions or agreements to negotiate with the employment platform.”
Balancing social mission and business models
All data equity organizations, whether government-sanctioned institutions like Findata or entrepreneurial businesses like Driver’s Seat, must balance their mission with operational sustainability. Securing a stable financial footing is a major challenge for nonprofit groups and social impact businesses. For data equity institutions, the funding mix commonly includes community- and membership-driven approaches and philanthropic aid.
But some organizations, like Brighthive, have found win-win models where private sector companies are looking to improve data governance and are willing to pay for it. Brighthive’s Gee describes commercial clients who have “seen what’s happening in the European Union around AI regulation and they want to get ahead of it in the US. They are taking a proactive stance on issues like algorithmic transparency, equity audits, and an alternative governance model for how they use customer data.”
Other data equity platforms have found revenue models in which beneficiary data can be harnessed by third parties in socially positive ways. Hays Witt at Driver’s Seat cites the example of municipal authorities and planning agencies. Both the authorities and ride-hailing drivers have an incentive to reduce “dead time” in which a driver is circulating without earning money, causing emissions and congestion. If appropriate data can be collected, aggregated, and analyzed in a useful way, it can lead to better traffic and mobility decisions and infrastructure interventions. So, all participants benefit.
Witt points out other “neutral” cases where beneficiary data could be valuable to unrelated private sector entities in ways that do not work against the interests of the drivers. He gives the example of commercial real estate developers who are often forced to make decisions about investments and services based on out-of-date traffic and mobility data. Driver’s Seat is exploring opportunities to offer aggregated analytics products to such companies with revenues returned as dividends to gig workers and to help finance the cooperative.
Many data equity startups seeking sustainable revenue must decide where to draw the line on the kinds of work they will take on and the kinds of businesses they will work with. Brighthive’s Matt Gee points to growing investor interest in startups that can help companies navigate the end of “cookies,” which have been critical to third-party advertising but are now being phased out. “Investors are concerned about the death of third-party data and are hungry for companies addressing that,” he says.
But as socially minded startups gain more business from corporate clients, they need to balance their mission for social good with the financial gain of lucrative contracts. “Is being a public benefit corporation more about what you do and how you do it, or who you work with? If we work on a data collaborative that provides transparency and accountability for marketing organizations pooling customer lists, are we actually reducing societal harm? These are questions that our team is constantly grappling with,” says Gee.
Data startups will inevitably face challenges, including balancing social mission, ethics, and business models, but as the data economy continues to grow, they are in a unique position to carve out new ways of responsibly leveraging the insight that data can provide for citizens, organizations, and governments—wresting some of the power over data away from Big Tech.
“Our data economy needs to anchor on creating value for everyone in society, and that requires user control, trusted intermediation, and collective governance to be embedded in innovative data stewardship models,” says Sushant Kumar, principal of responsible technology at social change venture Omidyar Network. “Onboarding a critical mass of users, receiving regulatory support, and achieving financial sustainability will also ensure these designs succeed in disrupting the status quo and injecting fairness into the current paradigm.”
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.