Transparency, Open Data and New York’s COVID-19 Emergency


It’s front-page news: Governor Andrew Cuomo is under federal investigation for withholding the true number of nursing home residents who died of COVID-19.

This terrible virus has spotlighted the importance and clear potential of open data to dispel myths and lies in the midst of a crisis. New York’s 1970s-era Freedom of Information Law says, “The people’s right to know the process of governmental decision-making and to review the documents and statistics leading to determinations is basic to our society.” If that law were written today, it would require state government to publish open data.

Last week, nine government watchdogs from across the political spectrum sent a letter to New York State leaders urging greater public transparency for all aspects of New York’s COVID-19 emergency and the state’s response. The groups also suggested a list of 121 COVID-19-related datasets that the state should publish as open data – machine-readable, downloadable and available in a publicly usable tabular form. Many of these open datasets are already published by other state and local governments, including New York City. The datasets range from the basic – “Total Persons Tested” – to the specific – “Records of Advice and Expertise by Outside Experts on Retainer.” These 121 datasets are only a start – a more comprehensive review could produce hundreds more.

Many states and localities already provide the kind of open data that New York State lacks. New York City’s COVID-19 page, for example, contains easy-to-find links to open data beneath each available graphic. California’s open data portal provides datasets on COVID-19’s impact on the homeless and on assistance provided to older adults. Washington State’s dashboard links to a single, organized spreadsheet where the most basic information – cases, hospitalizations, and deaths – can be viewed for each individual county. New York State publishes many of these datasets in its portals – but almost never as open data.

Open data transparency builds public trust and allows for independent analysis. Of the 121 datasets listed, New York State already provides at least 70 – but only six as open data. So what can New York do to bring its COVID-19 transparency up to par with that of other states?

Provide COVID-19 data as open data. New York’s Open NY portal places COVID-19 data at the very top of its front page, implying that the state is highly transparent about COVID-19. But of the 121 datasets we listed, only six are provided in an open format (all are related to testing). In many cases, the available data must be typed by hand into a spreadsheet. Other datasets, such as nursing home deaths, can be copied and pasted from a spreadsheet – but this has to be done every single day to reveal trends. In fact, of the fifteen COVID-related datasets available in the portal, only one is from New York State – the rest are from New York City. It’s possible to get COVID-19 open data from outside sources – The Atlantic’s discontinued COVID Tracking Project is one example – but the public shouldn’t have to go to a commentary magazine to get open data that should be available from our own state.

Establish a “one-stop shop” for COVID-19 data. COVID-19 data is spread over at least seven different sources (five of which are in our list):

  1. the Department of Health’s COVID-19 data tracker;
  2. nursing home data PDFs;
  3. the Open NY portal;
  4. the New York Forward Dashboard;
  5. SUNY’s COVID-19 data page;
  6. the Vaccine Tracker; and
  7. Governor Cuomo’s press conferences.

It makes no sense for all of this data to be scattered across so many sources – these datasets should all be in the state’s Open NY portal. Curiously, one of the most frequently referenced datasets – deaths tracked by date – is not in any of the data portals. It is only available in Governor Cuomo’s press conferences.

Make COVID-19 trends easy to understand. Although the state provides at least 70 datasets, most of the available data does not allow users to view COVID-19 trends. While users can view testing by county and date, the portal does not show trends for hospitalizations, fatalities, or nursing home deaths. This information void hampers New Yorkers’ ability to see how the COVID-19 crisis has evolved and how it may be affecting them. Spreading visualizations out over at least seven different sources makes the system even less user-friendly.
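
To illustrate why the format matters, here is a minimal sketch of how anyone could compute a trend once daily figures are published as a machine-readable table. Everything in it is an assumption made for illustration: the column names (date, county, deaths), the county rows, and the numbers are made up and are not taken from any New York State dataset. The same few lines would work on any downloadable CSV with a similar layout.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical sample rows with made-up numbers, for illustration only.
# A real open dataset would be downloaded from a portal as a CSV file.
SAMPLE_CSV = """date,county,deaths
2021-02-01,Albany,2
2021-02-01,Erie,3
2021-02-02,Albany,1
2021-02-02,Erie,4
2021-02-03,Albany,0
2021-02-03,Erie,2
"""


def daily_totals(csv_text):
    """Sum the per-county rows into one statewide total per date."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["date"]] += int(row["deaths"])
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(totals.items())


def rolling_average(series, window=7):
    """Average each day with up to (window - 1) preceding days."""
    recent = deque(maxlen=window)
    averaged = []
    for date, value in series:
        recent.append(value)
        averaged.append((date, sum(recent) / len(recent)))
    return averaged


if __name__ == "__main__":
    for date, average in rolling_average(daily_totals(SAMPLE_CSV)):
        print(date, round(average, 2))
```

With open data, this kind of analysis takes seconds and can be repeated every day without retyping or pasting anything. With figures locked in PDFs, dashboards, or press conferences, it cannot be done without manual transcription first.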

New York already provides a great deal of COVID-19 data – the key is to make that data open and accessible. The nursing home data scandal has tested confidence in our institutions, and full data transparency is one of the best ways state leaders can restore civic trust.