USAID’s Evolving Open Data Culture
I learned last night at the latest Open Data Leaders Meetup in Washington, D.C., that the United States Agency for International Development (USAID) really is serious about “open data.”
Brandon Pustejovsky and Laura Hughes talked about the policy and action steps the agency is taking to make data generated by USAID programs in more than 50 mission countries around the world available for analysis and re-use. Data generated by USAID programs must now, as a contractual requirement, be submitted to USAID’s Development Data Library (DDL) in machine-readable form. The top of the web submission form now says:
- USAID staff, as well as contractors, and recipients of USAID assistance awards (e.g. grants and cooperative agreements) in accordance with the terms and conditions of their awards must submit Datasets to the Development Data Library (DDL).
- Please provide the following to register your dataset with the DDL. Upon submitting this form, you will be contacted by our staff with additional instructions for transmitting the dataset.
- Please do not submit classified data or data containing personally identifiable information, such as social security numbers, home addresses, and dates of birth. Such information must be removed prior to submission.
- Datasets must be submitted in machine-readable, non-proprietary formats such as .csv or .xml.
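To make the last two requirements concrete, here is a minimal sketch of the kind of pre-submission check a data steward might run on a file. It is my own illustration, not anything USAID described: the function names, the list of allowed extensions, and the column-name hints are all assumptions, and matching on header names is only a first pass, not a real review for personally identifiable information.

```python
import csv
import io
from pathlib import PurePath
from typing import List

# Assumption: only the two formats named on the form are accepted here.
ALLOWED_EXTENSIONS = {".csv", ".xml"}

# Assumption: illustrative hints only, based on the examples on the form
# (social security numbers, home addresses, dates of birth).
PII_HINTS = ("ssn", "social_security", "home_address", "date_of_birth", "dob")


def has_allowed_extension(filename: str) -> bool:
    """Return True if the file name ends in one of the allowed extensions."""
    return PurePath(filename).suffix.lower() in ALLOWED_EXTENSIONS


def suspicious_columns(csv_text: str) -> List[str]:
    """Return header names from CSV text that look like they may hold PII."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list if the text has no rows
    flagged = []
    for name in header:
        normalized = name.strip().lower().replace(" ", "_")
        if any(hint in normalized for hint in PII_HINTS):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = "household_id,Date of Birth,region,SSN\n1,1980-01-01,North,123\n"
    print(has_allowed_extension("survey.csv"))
    print(suspicious_columns(sample))
```

A check like this only catches obvious cases by column name. Anything it flags, and plenty that it does not, would still need a person to look at the actual values before submission.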
Behind the scenes there is much work going on: inspecting data files, developing data and metadata standards for different sectors, modifying legacy systems to accommodate new or changed data formats, clarifying data ownership, and modifying contracting and procurement procedures to accommodate the shift. A corps of 100 “data stewards” has been developed throughout USAID locations around the world to coordinate data collection, and the agency’s collaboration and communication infrastructure is being used to explain requirements and share best practices.
My hat’s off to USAID for doing this. You don’t “flip a switch” overnight and turn from receiving reports in .pdf format to building datasets that can be analyzed by many different stakeholder groups, as those involved in Data Act implementation are well aware. In their presentations, Pustejovsky and Hughes just skimmed the surface of the internal deliberations that have been going on, but the results are definitely appearing as the number of available datasets increases.
USAID is also researching how to make the data useful, starting by surveying potential users about what they would like to see and by sponsoring special grants and “hackathons” to promote data usage. After all, if the data are never used again after they are generated and submitted to the DDL, why go to the expense of putting systems and processes in place to make them accessible for reuse and exploitation?
I look forward to keeping up with how USAID works through the process of making its data “useful.” A common deficiency of many initial open data portal efforts is that they provide extensive data files and tools for filtering and visualization, but they don’t necessarily go the “extra mile” by ensuring that data and data context are useful, available, and meaningful. This extends beyond the features of the user interface to include accommodation of the user’s data literacy, the provision of information to help the user interpret the data’s meaning, and (a really important one, in my opinion) information about the stakeholders most concerned with and knowledgeable about the data.
Ultimately, open data efforts need to be managed with the understanding that making data open and available must be part of every program that generates the data, not something that is tacked on after the fact. This means that open data planning needs to start when any data-generating initiative is planned. It appears that USAID is going that route.
Copyright © 2014 by Dennis D. McDonald