5

Downloading Police Employment Trends from the FBI Data Explorer

 1 year ago
source link: https://andrewpwheeler.com/2023/07/29/downloading-police-employment-trends-from-the-fbi-data-explorer/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Downloading Police Employment Trends from the FBI Data Explorer

The other day on the IACA forums, an analyst asked about comparing her agencies per-capita rate for sworn/non-sworn compared to other agencies. This is data available via the FBI’s Crime Data Explorer. Specifically they have released a dataset of employment rates, broken down by various agencies, over time.

The Crime Data Explorer to me is a bit difficult to navigate, so this post is going to show using the API to query the data in python (maybe it is easier to get via direct downloads, I am not sure). So first, go to that link above and sign up for a free API key.

Now, in python, first the API works via asking for a specific agencies ORI, as well as date ranges. (You can do a query for national and overall state as well, but I would rarely want those levels of aggregation.) So first we are just going to grab all of the agencies across 50 states. This runs fairly fast, only takes a few minutes:

import pandas as pd
import requests

key = 'Insert your key here'

state_list = ("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
              "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
              "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
              "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI",
              "SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY")

# Looping over states, getting all of the ORIs
fin_data = []
for s in states:
    url = f'https://api.usa.gov/crime/fbi/cde/agency/byStateAbbr/{s}?API_KEY={key}'
    data = requests.get(url)
    fin_data.append(pd.DataFrame(data.json()))

agency = pd.concat(fin_data,axis=0).reset_index(drop=True)

And the agency dataframe has just a few shy of 19k ORI’s listed. Unfortunately this does not have much else associated with the agencies (such as the most recent population). It would be nice if this list had population counts (so if you just wanted to compare yourself to other similar size agencies), but alas it does not. So the second part here – scraping all 18,000+ agencies, takes a bit (let it run overnight).

# Now grabbing the full employment data
ystart = 1960   # some have data going back to 1960
yend = 2022
emp_data = []

# try/catch, as some of these can fail
for i,o in enumerate(agency['ori']):
    print(f'Getting agency {i+1} out of {agency.shape}')
    url = ('https://api.usa.gov/crime/fbi/cde/pe/agency/'
          f'{o}/byYearRange?from={ystart}&to={yend}&API_KEY={key}')
    try:
        data = requests.get(url)
        emp_data.append(pd.DataFrame(data.json()))
    except:
        print(f'Failed to query {o}')

emp_pd = pd.concat(emp_data).reset_index(drop=True)
emp_pd.to_csv('EmployeePoliceData.csv',index=False)

And that will get you 100% of the employee data on the FBI data explorer, including data for 2022.

To plug my consulting firm here, this is something that takes a bit of work. If you have longer running scraping jobs, I paired this code example down to be quite minimial, but you want to periodically save results and have the code be able to run from the last save point. So if you scrape 1000 agencies, your internet goes out, you don’t want to have to start from 0, you want to start from the last point you left off.

If that is something you need, it makes sense to send me an email to see if I can help. For that and more, check out my website, crimede-coder.com:

CrimeDeCoder_Logo.PNG

Recommend

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK