The CIA and Other Security Agencies Are Using Machine Learning to Predict Crime

Crime is a social problem, and therefore not the kind of problem you might expect machine learning to solve, or even improve. Unlike the many problems that machine learning has already cracked, social problems tend to include more anomalies, inconsistencies, and unexplained results.


However, most crime, from the perpetrator’s perspective, is planned and evaluated to some extent. It is this planning that contains the logic which gives rise to patterns in the world of crime, and these patterns are vital if machine learning methods are to make successful predictions. A criminal will choose an optimal time and an optimal location for a specific crime, and the criminal’s own details, such as age, sex, substance abuse, education, and criminal record, are relevant too. When looked at on a larger scale, there is an amazing amount of consistency with regard to crime.


There have been many claims about the CIA using machine learning to predict terror threats, large-scale cyberattacks, and more. The truth is, not surprisingly, that no one really knows the specifics, or to what extent the agency employs machine learning. There is no doubt, though, that machine learning methods are being used to a greater or lesser degree, because the CIA does work closely with companies such as Cylance and Palantir, both of which exist entirely to understand and predict crime and do so by means of machine learning. Neither of them publicly goes into detail about exactly how it operates.


Predicting crime is famously done by the company PredPol, whose tools are now used by over fifty police departments in America. Unfortunately, the reality of these tools is slightly different from what is portrayed in movies. There are some key limitations, which are understandably difficult to overcome.


“Using only three data points – crime type, crime location and crime date/time – PredPol’s powerful software provides each law enforcement agency with customized crime predictions for the places and times that crimes are most likely to occur. PredPol pinpoints small areas, depicted in 500 feet by 500 feet boxes on maps – that are automatically generated for each shift of each day.”


First, the predictions do not say who is going to commit a crime, only where and when one is likely to occur. Second, if the crime in question is on the more serious end of the spectrum, a box of five hundred by five hundred feet will probably not be precise enough to save any victim. But it is definitely better than nothing and, regardless of the limitations, the results are very impressive. Not only does PredPol predict 30% more accurately than crime specialists, whose job was to do exactly what PredPol now does better, but it has actually made a difference in what police set out to do, namely, to decrease crime. Certain departments have seen actual crime fall by 32%, with an average figure of about 14%. This is because police have been more successful in finding and catching criminals. PredPol’s big picture goal is being met.
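To make the three-data-point idea concrete, here is a minimal Python sketch that places past incidents into 500-by-500-foot boxes and ranks the boxes by how often a given crime type occurred during a given shift. It is plain historical counting and should not be read as PredPol's actual model, which is not described in the material quoted above. The function names, the eight-hour shift scheme, the coordinates measured in feet from an arbitrary origin, and the sample incidents are all invented for illustration.

```python
from collections import Counter
from datetime import datetime

CELL_FEET = 500  # box size taken from the quoted description


def shift_of(ts):
    """Map a timestamp to a hypothetical 8-hour shift: 0, 1 or 2."""
    return ts.hour // 8


def cell_of(x_feet, y_feet):
    """Return the (column, row) of the 500 ft box containing a point."""
    return (int(x_feet // CELL_FEET), int(y_feet // CELL_FEET))


def top_cells(events, crime_type, shift, k=3):
    """Rank boxes by the number of past incidents of one type in one shift."""
    counts = Counter(
        cell_of(x, y)
        for kind, x, y, ts in events
        if kind == crime_type and shift_of(ts) == shift
    )
    return counts.most_common(k)


# Made-up incidents: (crime type, x in feet, y in feet, date/time)
events = [
    ("burglary", 120.0, 430.0, datetime(2016, 3, 1, 22, 15)),
    ("burglary", 480.0, 10.0, datetime(2016, 3, 2, 23, 40)),
    ("burglary", 1700.0, 900.0, datetime(2016, 3, 3, 21, 5)),
    ("burglary", 1600.0, 950.0, datetime(2016, 3, 4, 9, 30)),
    ("theft", 300.0, 300.0, datetime(2016, 3, 4, 22, 0)),
]

# Boxes with the most evening-shift burglaries in the sample data
print(top_cells(events, "burglary", shift=2))
```

Even this toy version shows the first limitation: the output is a list of boxes and counts, with nothing in it about who might commit the crime.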


The quality of the data used to learn and predict crime is vital. It is generally difficult to define what data is relevant to a specific social problem, but it is crucial to supply as much information as possible without overloading the system with completely irrelevant data. These days, police collect so much data in so many different places and forms that people end up questioning their privacy. From public CCTV footage to in-depth social media analysis, data is no longer collected manually. Over the next fifteen years, it is said, all of this data is going to form part of the prediction of crime. Although more powerful and complex machine learning methods have not yet been properly implemented in this field, there is no doubt that they will be part of the future, and crime will become increasingly predictable as a result.


At the same time, however, crime is becoming ever more complex…