New York City’s Open Data Law, passed in 2012, was a landmark for sharing government data with the public in a machine-readable and reusable format. The law itself is fairly simple because it leaves the details to a Technical Standards Manual written by NYC’s Department of Information Technology and Telecommunications (DoITT). The Technical Standards Manual was first published in September 2012, and has recently been updated for the first time.
Oddly, the new Manual does not reflect how much the public and city government have learned over the last three years about what has and has not worked in implementing the world’s first open data law. This is hard to explain, since insightful and detailed public feedback on the City’s open data efforts has been presented at half a dozen City Council hearings and at another dozen or so hackathons and public events. Also strange is that the Manual refers to only one of the seven new amendments to the Open Data Law that the City Council has passed in the last year.
Reinvent Albany and the NYC Transparency Working Group were unaware that an update to the Manual was about to be published, and we had hoped that it would incorporate the new laws and the voluminous public feedback. Maybe DoITT will begin frequent updates of the Manual and this is the first installment. We shall see.
The new section, §7.4 on Data Dictionaries, reads:
As mandated by Local Law 107 of 2015, all data sets on the Open Data portal must be accompanied by a plain language data dictionary, with the goal of making the data more understandable.
Outlined below are the minimum standards that must be adhered to:
- Agency name, data set name, data set description, and update frequency must all be provided
- Each column name should be listed and defined
- Where applicable and reasonable, terms, acronyms, codes, and units of measure should be defined
- To the extent practical, a range of possible values should be included
- History of modifications to data set format should be documented
Agencies may choose to provide additional information deemed relevant, including but not limited to, method of collection, relationship with or between other data sets, system of record, field lengths, etc.
Data dictionaries can be provided in a file format of an agencies choosing, but must include the above minimum requirements.
While we understand that Local Law 107 of 2015 mandates the creation of a data dictionary, and we welcome the effort to update the open data law and its regulations, this particular section is puzzling. Of the five bullet points, four of them are not mandates but suggestions. Further, the first two points merely repeat the requirements of existing sections of the Technical Standards Manual: §18.104.22.168 and §22.214.171.124, respectively.
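To make the gap between "must" and "should" concrete, here is a minimal sketch in Python of how a data dictionary could be checked against the section quoted above. The field names (`agency_name`, `value_ranges`, and so on) and the function `check_data_dictionary` are our own hypothetical choices; the Manual does not prescribe a schema or file format. The sketch only illustrates that a dictionary can omit every "should" item and still pass the mandatory check.

```python
# Hypothetical field names; the Manual does not define a schema.
# "Must" items come from the first bullet of the quoted section.
REQUIRED_FIELDS = (
    "agency_name",
    "dataset_name",
    "dataset_description",
    "update_frequency",
)
# "Should" items come from the remaining four bullets.
RECOMMENDED_FIELDS = (
    "columns",         # column names and their definitions
    "definitions",     # terms, acronyms, codes, units of measure
    "value_ranges",    # range of possible values
    "format_history",  # history of modifications to the format
)


def check_data_dictionary(dictionary):
    """Report which mandatory and which suggested fields are absent or empty."""
    def is_blank(key):
        value = dictionary.get(key)
        return value is None or value == "" or value == [] or value == {}

    return {
        "missing_required": [k for k in REQUIRED_FIELDS if is_blank(k)],
        "missing_recommended": [k for k in RECOMMENDED_FIELDS if is_blank(k)],
    }


# Placeholder entries for illustration only, not a real data set.
example = {
    "agency_name": "Example Agency",
    "dataset_name": "Example Data Set",
    "dataset_description": "A placeholder description.",
    "update_frequency": "",
    "columns": {"id": "Unique row identifier"},
}

report = check_data_dictionary(example)
print(report)
```

Under this reading, only an entry in `missing_required` would amount to a violation of the Manual; anything in `missing_recommended` is merely discouraged.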