New paper: An Open Source Replication of a Winning Recidivism Prediction Model

Our paper on the NIJ forecasting competition (Gio Circo is the first author) is now out online first in the International Journal of Offender Therapy and Comparative Criminology (Circo & Wheeler, 2022). (Eventually it will be in a special issue on replications and open science organized by Chad Posick, Michael Rocque, and Eric Connolly.)


We ended up doing the same type of biasing as did Mohler and Porter (2021) to satisfy the fairness constraints. Essentially we biased results to say no one was high risk, and this resulted in “fair” predictions. With fairness constraints or penalties you sometimes have to be careful what you wish for. And because not enough students signed up, Gio and I had more winnings distributed to the fairness competition (although we did quite well in the round 2 competition even with the biasing).
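To make the biasing concrete, here is a minimal sketch in Python. It assumes the fairness score has the multiplicative form discussed by Mohler and Porter, (1 - Brier score) times (1 - the gap in false positive rates between groups), with "high risk" meaning a predicted probability of 0.5 or more. The function names, the 0.499 cap, and the toy data are all made up for illustration; this is not the code we submitted.

```python
def brier(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def false_positive_rate(probs, outcomes, threshold=0.5):
    """Share of people who did NOT recidivate but were flagged high risk."""
    negatives = [p for p, y in zip(probs, outcomes) if y == 0]
    if not negatives:
        return 0.0
    return sum(p >= threshold for p in negatives) / len(negatives)

def fair_score(probs, outcomes, groups, threshold=0.5):
    """Assumed form of the penalty: (1 - Brier) * (1 - FPR gap between groups)."""
    fpr = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        fpr[g] = false_positive_rate(
            [probs[i] for i in idx], [outcomes[i] for i in idx], threshold
        )
    gap = max(fpr.values()) - min(fpr.values())
    return (1 - brier(probs, outcomes)) * (1 - gap)

def cap_below_threshold(probs, cap=0.499):
    """Bias predictions so nobody crosses the high risk threshold."""
    return [min(p, cap) for p in probs]

# Toy data (hypothetical): 8 people in two groups
raw_probs = [0.7, 0.6, 0.2, 0.3, 0.8, 0.4, 0.1, 0.55]
outcomes = [1, 0, 0, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

capped_probs = cap_below_threshold(raw_probs)

raw_fair = fair_score(raw_probs, outcomes, groups)
capped_fair = fair_score(capped_probs, outcomes, groups)

print("Brier raw vs capped:", brier(raw_probs, outcomes), brier(capped_probs, outcomes))
print("Fair score raw vs capped:", raw_fair, capped_fair)
```

In this toy example the capped predictions have a worse Brier score, but since no one is flagged high risk the false positive rate is zero in both groups, the penalty term disappears, and the overall fairness score goes up.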

So while that paper is paywalled, we have the NIJ tech paper on CrimRxiv, and our ugly code on GitHub. But you can always email for a copy of the actual published paper as well.

Of course, since I am not an academic anymore, I am not uber focused on potential future work. I would like to learn more about survival type machine learning forecasts and apply them to recidivism data (instead of doing discrete 1, 2, 3 year predictions). But my experience is that machine learning models need very large datasets; even the 20k rows here are on the fringe where regression models are close to equivalent to non-linear and tree based models.

Another potential application is simple models. Cynthia Rudin has quite a bit of recent work on interpretable trees for this (e.g. Liu et al. 2022), and my linked post has examples for simple regression weights. I suspect the simple regression weights will work reasonably well for this data. Likely not well enough to place on the scoreboard of the competition, but well enough in practice they would be totally reasonable to swap out due to the simpler results (Wheeler et al., 2019).
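As a rough illustration of what simple regression weights can look like, the sketch below rescales a set of logistic regression coefficients so the largest maps to 5 points and rounds the rest to integers, giving a score you can add up by hand. The coefficients, the variable names, and the 5 point scale are all hypothetical, not estimates from the NIJ data or from the paper.

```python
def to_integer_weights(coefs, max_points=5):
    """Rescale so the largest absolute coefficient equals max_points, then round."""
    largest = max(abs(c) for c in coefs.values())
    return {name: round(c / largest * max_points) for name, c in coefs.items()}

def risk_points(person, weights):
    """Sum of integer weights times the person's values (missing values count as 0)."""
    return sum(w * person.get(name, 0) for name, w in weights.items())

# Hypothetical coefficients from a fitted logistic regression
coefs = {
    "prior_arrests": 0.42,
    "age_under_25": 0.61,
    "gang_affiliated": 0.85,
    "employed": -0.37,
}

weights = to_integer_weights(coefs)

# Hypothetical person: 3 prior arrests, under 25, no gang affiliation, employed
person = {"prior_arrests": 3, "age_under_25": 1, "gang_affiliated": 0, "employed": 1}
points = risk_points(person, weights)

print(weights)
print(points)
```

The rounded weights lose some accuracy relative to the full model, which is the trade-off described above: likely not competitive on a leaderboard, but much easier to explain and audit.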

But for this paper, the main takeaway Gio and I want to tell folks is that creating a (good) model using open source data is totally within the capabilities of PhD criminal justice researchers and data scientists working for these state agencies. These are quantitative skills I wish more students within our field would pursue, as it makes it easier for me to hire you as a data scientist!

References

Circo, G. M., & Wheeler, A. P. (2022). An Open Source Replication of a Winning Recidivism Prediction Model. International Journal of Offender Therapy and Comparative Criminology, Online First.
Liu, J., Zhong, C., Li, B., Seltzer, M., & Rudin, C. (2022). FasterRisk: Fast and Accurate Interpretable Risk Scores. arXiv preprint.
Mohler, G., & Porter, M. D. (2021). A note on the multiplicative fairness score in the NIJ recidivism forecasting challenge. Crime Science, 10, 17.
Wheeler, A. P., Worden, R. E., & Silver, J. R. (2019). The accuracy of the violent offender identification directive tool to predict future gun violence. Criminal Justice and Behavior, 46(5), 770-788.
