Crime Incidents

This file reflects crimes reported to the City of Portland Police Bureau. Classification of the crime type is based on the Uniform Crime Reporting (UCR) system developed by the FBI and used by law enforcement agencies throughout the United States. Only the 12 months of data preceding the date of download are available.

Comments


Great data! It would be nice to have some additional breakdown of the data, like:
<br><br>
1 - Unique case number so developers can identify each crime and update existing data when updates are made, rather than trying to match cases across all other data fields
<br><br>
2 - X and Y GIS coordinates converted to two additional Latitude and Longitude coordinate fields
<br><br>
3 - Address field split into Address, City, State, and Zip fields
<br><br>
4 - UCR code number for the crime types in addition to the text descriptions

Submitted by yourmapper on March 26, 2010 - 6:02am

<p>I second the need for a unique ID or case #. Right now duplicates are very hard to identify. I attempted to generate unique IDs for the crimes in this dataset and came up with more than 1,000 duplicates. Each ID was a SHA hash generated by combining the date, major offense, address, and x and y coordinates. Combined, those fields should be unique to a single incident.</p>
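<p>A minimal sketch of that kind of fingerprinting, for anyone who wants to reproduce it. The SHA variant and the way the fields were joined are not stated above, so SHA-256, the "|" separator, the function names, and the sample rows are all assumptions made for illustration.</p>

```python
import hashlib
from collections import defaultdict


def incident_fingerprint(date, major_offense, address, x, y):
    """Hash the five fields into one hex digest (SHA-256 is an assumption)."""
    parts = [date, major_offense, address, x, y]
    # Trim and uppercase each field, then join with a separator so that
    # adjacent fields cannot run together and collide by accident.
    normalized = "|".join(str(part).strip().upper() for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_collisions(rows):
    """Group rows by fingerprint and keep only groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[incident_fingerprint(*row)].append(row)
    return {digest: group for digest, group in groups.items() if len(group) > 1}


# Hypothetical rows: (date, major offense, address, x, y).
sample_rows = [
    ("09/01/2010", "Larceny", "100 SW MAIN ST", "7644000", "682000"),
    ("09/01/2010", "Larceny", "100 SW MAIN ST", "7644000", "682000"),
    ("09/02/2010", "Burglary", "200 NE OAK ST", "7650000", "690000"),
]
print(len(find_collisions(sample_rows)), "fingerprint(s) shared by several rows")
```

<p>A shared fingerprint only shows that two rows agree on those five fields. It cannot say whether they describe one incident or two, which is the concern raised next.</p>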

<p>My concern was that the so-called duplicates might not actually be duplicates. For instance, what if two people were charged with the same crime? I assume it's very common for more than one person to be involved in a crime. Does it generate two entries in this dataset if two people are involved in committing a single crime? My guess is that it doesn't (two people involved in one homicide is still one homicide). It's hard to tell without a unique ID.</p>

<p>To expand on the point further: if you're automating the loading of this dataset on a daily basis, it's very important that you can accurately tell which crimes you've already imported. Otherwise you end up with inaccurate data, which isn't useful to the developer or to the public at large using an application built on this data.</p>

<p>Finally, there is a misspelling in the dataset. Y Coordiante should actually be Y Coordinate. I contacted the email address listed in the metadata about a week ago, but haven't received a response.</p>

<p>Thanks for releasing this data. It's a great first step!</p>

Submitted by Caged on September 29, 2010 - 9:19pm

Thank you both for the feedback.<br><br>

The spelling error in the column header has been corrected. What was “Y Coordiante” is now “Y Coordinate”.<br><br>

Here are the potential impacts:<br><br>

* Anyone who has already built a working DTS\SSIS package will have to update it, as the column name no longer matches.<br>
* Any programming reference to the previous column name “Y Coordiante” in your application will have to change as well.<br><br>
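For applications that read the file directly, one way to absorb this kind of rename is to normalize headers on load. The sketch below is illustrative only: the `HEADER_FIXES` mapping and `read_rows` helper are hypothetical names, and the "Address" column in the sample is an assumption.

```python
import csv
import io

# Map retired header spellings to their current names.
HEADER_FIXES = {"Y Coordiante": "Y Coordinate"}


def read_rows(fileobj):
    """Yield each CSV row as a dict keyed by the corrected column names."""
    for row in csv.DictReader(fileobj):
        yield {HEADER_FIXES.get(name, name): value for name, value in row.items()}


# A file saved before the fix and one saved after it load identically.
old_file = io.StringIO("Address,Y Coordiante\n100 SW MAIN ST,682000\n")
new_file = io.StringIO("Address,Y Coordinate\n100 SW MAIN ST,682000\n")
print(list(read_rows(old_file)) == list(read_rows(new_file)))
```

Code written this way keeps working against files downloaded before or after the correction.<br><br>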

With regard to the need for unique IDs, we will make that change in the coming days, and will announce when that is in effect as well.

Submitted by rnixon on October 1, 2010 - 8:15am

<p>@rnixon: Thanks for the quick response! It's very encouraging to know that you are listening to feedback and making the suggested improvements. That makes it very easy for me to get back to work on my project.</p>

<p>Kudos to everyone involved in releasing this data!</p>

<p>@yourmapper: You can easily convert the State Plane coordinates in this file to WGS 84 lat/lon coordinates using a project like GDAL. Specifically, you're looking for gdaltransform - http://www.gdal.org/gdaltransform.html. Proj4js is also capable of making these conversions - http://proj4js.org.</p>

Submitted by Caged on October 1, 2010 - 1:55pm

We've now added a new field “Record ID” at the beginning. It’s an 8-digit unique record identifier for each offense/incident. It’s consistent in all the files out there for the same offense/incident. The changes have been applied to all files and metadata documentation.
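With a stable identifier in place, a daily import can be keyed on it. The following is a rough sketch under assumptions, not a description of how any existing application works: the in-memory `store` dict, the `upsert_incidents` helper, and the sample field values are hypothetical, and only the “Record ID” column name comes from the announcement above.

```python
def upsert_incidents(store, rows, key="Record ID"):
    """Insert new incidents and overwrite changed ones, keyed by Record ID.

    Returns a tuple (inserted, updated).
    """
    inserted = 0
    updated = 0
    for row in rows:
        record_id = row[key]
        if record_id not in store:
            inserted += 1
        elif store[record_id] != row:
            updated += 1
        # Store a copy so later changes to the caller's dict don't leak in.
        store[record_id] = dict(row)
    return inserted, updated


# Two hypothetical daily downloads that overlap on one record.
store = {}
day_one = [{"Record ID": "10000001", "Major Offense Type": "Larceny"}]
day_two = [
    {"Record ID": "10000001", "Major Offense Type": "Burglary"},  # corrected
    {"Record ID": "10000002", "Major Offense Type": "Assault"},   # new
]
print(upsert_incidents(store, day_one))
print(upsert_incidents(store, day_two))
```

Re-running the same download is harmless in this scheme, because rows that already exist and have not changed are counted as neither inserted nor updated.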

Submitted by rnixon on October 12, 2010 - 3:57pm

<p>I'm getting deeper into this data over at http://portlandcrime.com and I'd like to map neighborhood boundaries and gather statistics for crimes in different neighborhoods.</p>

<p>There is another dataset[1] that has neighborhood boundaries, which is great. The problem is that the names don't match up, and there is no unique ID assigned to each neighborhood that would let you match the correct neighborhood regardless of how it is spelled.</p>

<p>A few examples of how the datasets compare (Crime Incidents left, Neighborhoods right):</p>

<ul>
<li>POWELHST-GILBRT vs. POWELLHURST-GILBERT</li>
<li>CHINA/OLD TOWN vs. OLD TOWN/CHINATOWN</li>
<li>BEAUMONT-WILSHR vs. BEAUMONT-WILSHIRE</li>
<li>BRENTWD-DARLNGT vs. BRENTWOOD-DARLINGTON</li>
</ul>

<p>I propose assigning neighborhoods a unique ID and including the identical unique id in both the Crime Incidents dataset and the Neighborhoods dataset. This would allow you to accurately map crimes to neighborhoods regardless of the spelling differences between the datasets.</p>
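<p>Until shared IDs exist, a hand-maintained crosswalk can stand in for them. This is only a sketch: the numeric IDs and the <code>neighborhood_id</code> helper are made up for illustration, and the four name pairs are the ones listed above.</p>

```python
# Hypothetical IDs; both spellings of a neighborhood map to the same one.
NEIGHBORHOOD_IDS = {
    "POWELHST-GILBRT": 1,
    "POWELLHURST-GILBERT": 1,
    "CHINA/OLD TOWN": 2,
    "OLD TOWN/CHINATOWN": 2,
    "BEAUMONT-WILSHR": 3,
    "BEAUMONT-WILSHIRE": 3,
    "BRENTWD-DARLNGT": 4,
    "BRENTWOOD-DARLINGTON": 4,
}


def neighborhood_id(name):
    """Return the shared ID for a name from either dataset, or None if unknown."""
    return NEIGHBORHOOD_IDS.get(name.strip().upper())


# The crime-data spelling and the boundary-data spelling resolve to one ID.
print(neighborhood_id("CHINA/OLD TOWN") == neighborhood_id("Old Town/Chinatown"))
```

<p>The table has to be extended by hand for every neighborhood, which is exactly the work a published ID in both datasets would remove.</p>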

<p>Another small request would be to stop putting everything in uppercase, if possible. It's very easy to get MCDONALDS from McDonalds, but it's not easy to do the reverse unless your program is aware of the rules of the English language.</p>
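<p>A quick illustration of why uppercasing loses information, using only Python's built-in string methods:</p>

```python
original = "McDonalds"

# Going to uppercase is trivial and always works.
shouted = original.upper()

# Going back requires guessing. title() capitalizes the first letter of each
# word and lowercases the rest, so the interior capital is not restored.
guess = shouted.title()

print(shouted)
print(guess)
print(guess == original)
```

<p>Recovering the original spelling takes a lookup table of known names or some language-aware rules, neither of which the uppercase text provides.</p>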

<p>Thanks a lot!</p>

<p>1: http://www.portlandonline.com/cgis/metadata/viewer/display.cfm?Meta_laye...</p>

Submitted by Caged on November 9, 2010 - 6:18pm

It is great to see that civicapps has created a separate section for public crime incidents. The database is impressive in the amount of detail it includes. The one category I didn't see is incidents and crimes related to tax problems, which seem to be growing day by day, more so than physical crime incidents. So I think civicapps should consider adding content related to tax problems and opening it up for discussion.

Submitted by gangsa on May 9, 2011 - 10:23pm