The Checklist for an Awesome Open Crime Data Feed
Unfortunately, we’ve run into a few issues with the recently unveiled open crime data feed in Durham, NC. Because of this, we’ve created a checklist to help agencies create an awesome open crime data feed.
Differences in data
Most popular datasets in Durham
There are major differences between what Durham has chosen to display on its vendor’s site and what is displayed on the open data portal. We are not sure of the reasoning behind dedicating the money, resources, and time to create separate feeds, or why the city chooses to give a vendor preferential access to information over the public, especially when crime data is the most popular data set on the open data portal.
1. Different feeds - If you compare the spreadsheet the vendor gets with the open data portal the public is required to use, you’ll see that the spreadsheet looks like a clear, plain-language version of the information while the public data set looks incredibly technical.
2. Different locations - The vendor receives an address and maps the crime at the actual address, then later masks the block number (using the SpotCrime convention of replacing the final digits with XX). So 110 Main St. becomes 1XX Main St.
The public Durham portal shows no address and uses an algorithm to move the point to a nearby position. The public portal also lacks the level of detail showing the type of residence (single family home, apartment, etc.).
3. Delay - It looks like the data reaches the public portal a few days late, meaning the vendor is getting more up-to-date access.
4. Time stamps - Finally, the public portal has numerous fields reporting times, with documentation stating that the time stamps are based on UK time (Greenwich Mean Time), but we've found this to be wrong. The times are actually in Durham local time.
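To make the address masking in item 2 concrete, here is a minimal Python sketch of how a block number might be masked. The function name `mask_block_number` is our own invention, and the handling of one- and two-digit or four-digit street numbers is an assumption on our part; only the 110 Main St. to 1XX Main St. case comes from the convention described above.

```python
import re


def mask_block_number(address: str) -> str:
    """Replace the last two digits of a leading street number with 'XX'.

    Hypothetical helper: only the three-digit case (110 -> 1XX) is taken
    from the article. Other lengths follow the same rule by assumption.
    """
    match = re.match(r"\s*(\d+)(.*)", address, flags=re.DOTALL)
    if match is None:
        # No leading street number, so there is nothing to mask.
        return address
    number, rest = match.groups()
    # Keep everything except the final two digits, then append 'XX'.
    # Numbers with one or two digits become just 'XX' (assumption).
    return number[:-2] + "XX" + rest


print(mask_block_number("110 Main St."))
```

A feed that publishes the masked block address alongside lat/long coordinates gives the public the same location detail the vendor displays, without exposing the exact house number.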
The Checklist
Below is the checklist for an awesome open data feed. If your police agency can check off every single box on the list below, give them a pat on the back!
☑Open - meaning the data has no restrictions on collecting, using, and sharing.
☑No preferential access given to others.
☑Up-to-Date. The information is updated on a timely basis (hourly, daily, or weekly).
☑Inclusive. All crime incidents that are considered public information are included. No omissions.
☑Machine readable. See SOCS for acceptable formats.
☑Location. Something that lets users pinpoint the incident (even if it’s only a block address), preferably lat/long coordinates.
☑Date
☑Explanation of the data. Is the data from CAD or RMS? Is there a description that can be included, e.g. residential burglary, a description of a suspect, etc.?
☑Identifier. A case number or call number to help identify the incident.
☑Contact information. The dataset should include good contact information in case anyone has questions about the data. Not only does this help clear up any confusion about the data, but it also opens up a line of communication between the agency and the community.
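For anyone who wants to spot-check a feed against the record-level items on this list, here is a rough Python sketch. The field names (`latitude`, `longitude`, `block_address`, `date`, `case_number`, `description`) are hypothetical and will differ from feed to feed, and the sketch assumes dates arrive in ISO 8601 form. It only covers the items that can be checked per record (location, date, explanation, identifier); openness, timeliness, and contact information have to be judged at the dataset level.

```python
from datetime import datetime


def missing_checklist_items(record: dict) -> list[str]:
    """Return the record-level checklist items this record fails.

    All field names here are hypothetical placeholders, not the schema
    of any real crime data feed.
    """
    problems = []

    # Location: lat/long preferred, but a block address also counts.
    has_coords = (
        record.get("latitude") is not None
        and record.get("longitude") is not None
    )
    if not (has_coords or record.get("block_address")):
        problems.append("location")

    # Date: assumed to be an ISO 8601 string such as 2016-05-01T14:30:00.
    try:
        datetime.fromisoformat(str(record.get("date", "")))
    except ValueError:
        problems.append("date")

    # Identifier: a case number or call number.
    if not record.get("case_number"):
        problems.append("identifier")

    # Explanation: some description of the incident.
    if not record.get("description"):
        problems.append("explanation")

    return problems


example = {
    "case_number": "A1",
    "date": "2016-05-01T14:30:00",
    "block_address": "1XX Main St.",
    "description": "residential burglary",
}
print(missing_checklist_items(example))
```

An empty list means the record passes these four checks; anything else names the checklist items that need attention.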
Lesson to learn from Durham
If a vendor has access to crime data, then the public, news agencies, and residents should have similar access. In the case of Durham, NC, not only are there noticeable differences in the data, but the vendor also gets better and more timely information than the public portal provides. This forces citizens to use the vendor as their single timely source of crime data news. Forcing the public to rely on one news source seems counter to open data and our democracy.