As more cities track phones to improve subway service or use cameras to enforce speed limits, questions of privacy (what, exactly, that data will be used for and how long it will be kept) become increasingly important. Unfortunately, researchers from Harvard’s Berkman Klein Center for Internet & Society say many of the international frameworks that exist to protect individual privacy don’t go far enough.
In a new report, “Open Data Privacy Playbook,” the researchers look at the inherent conflict of open data: “granular” data — raw and record-level information — is the most useful for business, policy and research purposes, but it also contains the most detailed personal information, which carries the most risk.
“Just as open data is not valuable unless it is detailed, opening data will not be effective if it necessarily involves risks to individual privacy,” the researchers, led by Ben Green and Susan Crawford, write.
Most privacy frameworks focus on “identifying and removing personally identifiable information.” But because so much data is now available from so many sources, hackers (or others with ill intent) could piece together details from several data sets to identify individuals, even when each data set has been made anonymous. That could be a real problem, especially with something like crime data about sexual assault.
“Crime data is simultaneously one of the most useful and desired municipal datasets and, especially in the case of sexual assault data, one of the most sensitive,” the researchers write. “While open data about sexual assault and domestic violence can be a powerful tool for research and advocacy,” they add, opening that data could lead to victim re-identification.
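The piecemeal approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records, the field names (`zip`, `birth_year`, `gender`) and the helper functions `smallest_group` and `link` are hypothetical and do not come from the report. It shows how a data set with names removed can still be matched against a second, named source when a combination of ordinary attributes is unique, and it counts the smallest group of records sharing those attributes, a measure commonly called k-anonymity (a term the article itself does not use).

```python
from collections import Counter

# Hypothetical "anonymized" release: names are removed, but ordinary
# attributes (ZIP code, birth year, gender) are still present.
released = [
    {"zip": "02138", "birth_year": 1985, "gender": "F", "record": "incident A"},
    {"zip": "02138", "birth_year": 1985, "gender": "M", "record": "incident B"},
    {"zip": "02139", "birth_year": 1990, "gender": "F", "record": "incident C"},
    {"zip": "02139", "birth_year": 1990, "gender": "F", "record": "incident D"},
]

# Hypothetical second source that does include names.
public = [
    {"name": "Person 1", "zip": "02138", "birth_year": 1985, "gender": "F"},
    {"name": "Person 2", "zip": "02139", "birth_year": 1990, "gender": "F"},
]

# Assumed set of attributes an attacker could match on.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(row):
    """Combine the matching attributes of one row into a single tuple."""
    return tuple(row[q] for q in QUASI_IDENTIFIERS)


def smallest_group(rows):
    """Size of the smallest group of rows sharing the same attributes (k)."""
    return min(Counter(key(r) for r in rows).values())


def link(released_rows, public_rows):
    """Pair names with released records whose attribute combination is unique."""
    counts = Counter(key(r) for r in released_rows)
    unique = {key(r): r for r in released_rows if counts[key(r)] == 1}
    return [
        (p["name"], unique[key(p)]["record"])
        for p in public_rows
        if key(p) in unique
    ]


print(smallest_group(released))  # 1: at least one record stands alone
print(link(released, public))    # [('Person 1', 'incident A')]
```

In this toy data, "Person 2" cannot be tied to a single record because two released rows share the same attributes, while "Person 1" can. That is one reason publishers sometimes coarsen or suppress fields before release, at the cost of the granularity that makes the data useful.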
“Neither regulations nor ordinances provide sufficient clarity, as data publishers and consumers are moving faster than lawmakers,” they write. “This leaves open data officials in the position of often serving as de facto privacy arbiters.”
The paper offers a number of recommendations to officials charged with this daunting but important task, including:
- Conducting risk-benefit analyses to inform the design and implementation of open data programs.
- Considering privacy at each stage of the data lifecycle.
- Developing operational structures and processes that codify privacy management widely throughout the city.
- Emphasizing public engagement and public priorities as essential aspects of data management programs.
The report is especially timely as more cities hire chief data officers who specialize in discerning patterns from granular data.
“Today, data is plentiful but insight is far less common,” Jane Wiseman wrote in another paper, on chief data officers, that Next City covered in January. Hopefully, as more data is collected, that insight will extend to privacy protection, too.
Rachel Dovey is an award-winning freelance writer and former USC Annenberg fellow living at the northern tip of California’s Bay Area. She writes about infrastructure, water and climate change and has been published by Bust, Wired, Paste, SF Weekly, the East Bay Express and the North Bay Bohemian.