AI for Good: Bad Guys, Messy Data, & NLP
Artificial intelligence is typically viewed as a tool to help businesses propel themselves in the current digital economy. But the applications of AI are so wide that they can’t be boiled down to a ploy to gain capital. In fact, some seek to use AI and machine learning to provide solutions for the betterment of society.

During ODSC East 2019, Chris Mack, Vice President of Text Analytics at Basis Technologies, presented a variety of situations in which natural language processing can be used for good. Mack began his presentation by detailing the foiled transatlantic liquid bomb plot from 2006. U.S. and British intelligence agencies were able to detect plans for the bombing with the help of artificial intelligence technologies.

Not all situations like the liquid bomb plot pan out this way, unfortunately. However, Mack says that we can use those heroic tactics as a blueprint to help prevent other terrorist attacks and crimes from flying under the radar.

According to a study Mack cited in his presentation, terrorist attacks claimed the lives of just over 11,000 civilians in Europe alone between 1979 and 2017. Mack also said only one percent of all illegal funds and money laundering operations are regularly seized by law enforcement teams.

So how can we make AI a better tool for preventing terrorist attacks and syndicated crime? Mack says there needs to be a greater emphasis on the use of natural language processing in investigations. Additionally, the technology has to keep innovating and handling cases that can be easily routed.

For example, text that undergoes natural language processing is typically scored using keywords and naive rules. The language used to relay potential attacks is sometimes far more nuanced than these rules anticipate, which can lead models to produce false positives or false negatives.

“With those rule-based systems, it’s impossible to process every word from a language properly, so it’s a good starting place, but you’re going to run into some problems,” Mack said.
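
To make the limitation concrete, here is a minimal sketch of keyword scoring in Python. It is not the system Mack described; the keyword list, weights, threshold, and example sentences are all hypothetical and chosen only to show how a naive rule can misfire in both directions.

```python
import re

# Hypothetical keyword weights, invented for illustration (not from the talk).
KEYWORD_WEIGHTS = {"bomb": 3, "attack": 2, "explosive": 3}


def keyword_score(text: str) -> int:
    """Sum the weights of every flagged keyword that appears in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(KEYWORD_WEIGHTS.get(token, 0) for token in tokens)


def is_flagged(text: str, threshold: int = 3) -> bool:
    """Flag the text when its keyword score reaches the (assumed) threshold."""
    return keyword_score(text) >= threshold


# Slang use of a flagged word: the rule fires although nothing is wrong.
benign = "That concert was the bomb, the crowd loved it"

# Coded language with no flagged word: the rule stays silent.
veiled = "The packages for the wedding are ready to be delivered on Tuesday"

print(is_flagged(benign))  # flagged: a false positive
print(is_flagged(veiled))  # not flagged: a potential false negative
```

The rule has no notion of context, so the slang sentence is flagged while the coded one passes, which is the kind of problem Mack points to.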

Most models also fail to account for variants in the data, such as different spellings of a name or words that have different roots. The result can be noisy data containing warning signs that a machine learning model might miss. Mack proposes a cross-lingual semantic model that can handle these cases as though you’re working in one language.
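
The name-variant problem can be illustrated with a small sketch that uses only the Python standard library. This is a simple string-similarity heuristic, not the cross-lingual semantic model Mack proposes; the function names, the 0.8 threshold, and the sample names are assumptions made for the example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents.lower()
    )
    return " ".join(cleaned.split())


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def same_person_candidate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two names as possible variants when similarity meets the threshold."""
    return name_similarity(a, b) >= threshold


# Two transliterations of the same (made-up) name, and an unrelated name.
print(same_person_candidate("Muhammad Al-Rashid", "Mohammed al Rashid"))
print(same_person_candidate("John Smith", "Mohammed al Rashid"))
```

An exact-match lookup would treat the two transliterations as different people. Character-level similarity catches this pair, but it would not help across scripts or languages, which is where a semantic approach comes in.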

“We created cross-lingual embeddings and domain-specific embeddings that allow us to say, ‘not only is this the same topic, but it’s the same topic in two different languages,’” Mack said.
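
The idea behind a shared cross-lingual space can be sketched with toy vectors. The three-dimensional numbers below are invented for illustration, and the helper names are hypothetical; real cross-lingual embeddings are learned from data and have hundreds of dimensions. The sketch only shows the lookup step: once words from two languages live in one space, cosine similarity can match a term to its counterpart.

```python
import math

# Toy vectors standing in for a shared cross-lingual embedding space.
# All values are made up for this example.
SHARED_SPACE = {
    ("en", "explosive"): (0.90, 0.10, 0.05),
    ("es", "explosivo"): (0.88, 0.12, 0.07),
    ("en", "wedding"): (0.05, 0.90, 0.20),
    ("es", "boda"): (0.07, 0.92, 0.18),
}


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_in_language(query, target_lang):
    """Return the (language, word) key in target_lang closest to the query."""
    query_vec = SHARED_SPACE[query]
    candidates = [
        (key, cosine(query_vec, vec))
        for key, vec in SHARED_SPACE.items()
        if key[0] == target_lang
    ]
    return max(candidates, key=lambda pair: pair[1])[0]


print(nearest_in_language(("en", "explosive"), "es"))
print(nearest_in_language(("en", "wedding"), "es"))
```

Because the comparison happens on vectors rather than strings, the same code works regardless of which languages the words come from, which is one way to read the quote about working "as though" in a single language.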

With natural language processing advancing this rapidly, it’ll take some time for international law enforcement agencies to place trust in the technology. However, if there’s anything we can learn from the liquid bomb plot, it’s that catching an attack early should be a priority, and AI may be the future of intelligence.

Kailen Santos

I’m a freelance data journalist based in Boston, MA. Formally trained in both data science and journalism at Boston University, I aspire to make working with data easy and fun. If you work in a newsroom or if you’re just data-curious, I hope to help you explore data clearly. https://www.kailenjsantos.wordpress.com/
