Until the last decade or so, the lack of computing power was the most significant hurdle to addressing complex problems. Even if the required data was available, we did not have the computing power to enable artificial intelligence (AI) to analyze and learn from large data sets. The situation has changed with the general improvement in computing power and the migration to cloud computing. An organization no longer has to make a substantial up-front investment in a mainframe computing system in order to run programs that perform massive numbers of calculations. Today, any organization can run highly complex programs using massive amounts of data on the cloud, paying only for what is actually used.
The COVID data story
There are numerous positive stories of how enhanced data sharing has helped to fight the coronavirus. One need only visit the Johns Hopkins University website that tracks real-time statistics on the spread of COVID-19. The dashboard uses data from a number of sources, including the World Health Organization (WHO), various countries’ centers for disease control, media reports, and health departments around the world. It has been such a useful tool that as of early April the dashboard had already been cited by scholarly journals more than 79 times.
Another example of how opening up data sets has helped the COVID-19 response is the International Nucleotide Sequence Database Collaboration (INSDC). The INSDC is a tripartite collaboration among Japan, the United States and Europe. For more than 30 years, it has been committed to sharing DNA sequence information among scientists free of charge. These databases have allowed researchers worldwide to identify genetic mutations and causes of numerous diseases (and ways to treat or prevent them), including COVID-19. In fact, over 300 genetic variants of SARS-CoV-2 have already been uploaded. Because these datasets are publicly available, scientists from all over the world have been able to start working on a vaccine, research the origins of the disease, and assess whether it is mutating to become more or less virulent. Much of this research is being done using AI. The more data that become available, the better the AI can perform.
These are just two examples of how “open data” is helping to combat a world crisis. There are many other instances where greater access to data could help solve important problems. Could global hunger be significantly reduced by increased sharing of data? Could opening up datasets help fight corruption? The short answer to these questions is “yes.” The problems that additional data sharing could help address are countless. This is why a number of companies and organizations have started initiatives to open up data and to improve the policies that support this cause.
Getting policies right
Incentivizing organizations, governments and even individuals to share data is clearly important. As I have argued before, there are a number of other policy implications that should be considered in order to ensure that everyone benefits from the data revolution. The COVID crisis has brought many of these issues to the forefront. A good example is contact tracing, the process by which public health officials identify persons who have been in contact with individuals known to be infected with a disease. In the case of COVID-19, contact tracing is imperative to slowing the spread, since the disease is transmitted so easily and many individuals who have contracted it are asymptomatic. The concept of contact tracing is simple: when public health officials find someone who tests positive for the disease, they attempt to find everyone that person has been in contact with and alert anyone who may have been exposed.
Cell phone location data can be very helpful in doing this effectively. In other words, massive amounts of data owned by various telephone operators and ISPs are needed to run an effective tracing program. This obviously raises a number of policy concerns, including ownership and usability of the data, as well as privacy. Policymakers are starting to propose solutions, but we are a long way from enactment, much less implementation. The COVID health crisis will hopefully be under control sooner rather than later, but the need to develop policies that ensure we get the most out of data will continue.
Tim Molino is a technology consultant at Peck Madigan Jones in Washington, D.C. He provides policy, strategy, and political advice on technology-related issues, including intellectual property, antitrust, privacy, and cybersecurity.