It’s difficult enough to teach people language, not to mention machines, and for this reason, language proficiency for computers is a moving target. People in the industry have struggled for years with the question of how to get more meaningful data into the hands of those who need it, such as businesses or organizations. Rob Key and Mark Garratt outline how advances are being made in predictive analysis that may better equip businesses to get ahead of changing tastes and opinions and to better predict trends to reduce innovation time. Let’s take a look at how developing technologies have led to and will continue to lead to exciting business applications of AI-powered NLP.
Problems With Machine Learning
Only about 21% of unstructured data is actually processed, so separating true signal from noise is really difficult. Bad data creates unreliable reporting, and traditional listening efforts fall far short of what more modern AI is capable of because of some significant challenges:
- disambiguation: words, meanings, and context are difficult to nail down through Boolean structures. For example, with poor noise filtering, chatter about Anne Hathaway gets counted toward Berkshire Hathaway, and the stock goes up.
- sarcasm and slang: understanding the embedded meaning of slang is notoriously difficult with rules-based approaches. These approaches typically achieve 60% accuracy versus the human gold standard (three different people in agreement).
- missing implicit meaning: rules-based approaches often miss implicit meaning when scraping for sentiment analysis.
- poor recall: document-level analysis misses key signals when searching for opinions. For example, scoring a tweet that compares carriers as a single unit could miss all the implicit opinions in favor of Verizon and critical of Sprint.
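The disambiguation pitfall in the list above can be sketched in a few lines. This is only an illustration, not how any particular listening platform works: the function names, the cue words, and the sample posts are all assumptions, and the hand-written cue list merely stands in for the context a trained classifier would learn.

```python
import re

def boolean_match(post: str) -> bool:
    """Hypothetical Boolean rule: any post containing 'hathaway' counts as a brand mention."""
    return "hathaway" in post.lower()

# Hypothetical finance cues; a real system would learn context rather than hard-code it.
FINANCE_CUES = {"berkshire", "stock", "shares", "buffett"}

def context_match(post: str) -> bool:
    """Count the post only when 'hathaway' appears alongside at least one finance cue."""
    tokens = set(re.findall(r"[a-z]+", post.lower()))
    return "hathaway" in tokens and bool(tokens & FINANCE_CUES)

# Made-up example posts for illustration only.
posts = [
    "Anne Hathaway was fantastic in her new movie",
    "Berkshire Hathaway shares closed higher today",
]

boolean_hits = [boolean_match(p) for p in posts]   # the keyword rule flags both posts
context_hits = [context_match(p) for p in posts]   # the context check flags only the second
print(boolean_hits, context_hits)
```

The keyword rule cannot tell the actress from the company, which is the kind of noise the article describes.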
Most businesses are stuck with basic annotation and can’t model much with the data. Recently, however, there has been movement beyond this sort of surface-level sentiment analysis, through data science platforms that allow experts to build on top of existing listening platforms. This pushes through information that’s actually meaningful.
Moving away from rules-based approaches to machine learning has the potential to change the game entirely. No more black-box performance or simply reactive models. Instead, we get predictive analysis that helps businesses begin to pivot well before a trend peaks in places like social media.
Active machine learning as a service aims to change the way businesses interact with unstructured data as difficult to pin down as language. Converseon, for example, offers a way to build models that achieve near-human levels of prediction in sentiment analysis, even with the pitfalls listed above. Part of the appeal is the speed at which businesses can build new classifiers and categories that pertain to specific products or areas (a small cell phone is good, for example, but a small hotel room is not).
This approach replaces Boolean queries and achieves nearly 92% relevancy, which feeds into other types of variables in the data. Brand advocates can make ample use of these developments: instead of searching only for explicit brand advocacy, you could measure more implicit signals of trust (“I use your product at night because it helps my baby sleep”).
Despite the inherent complexity of the process, precise language classification is rapidly improving. This allows data experts to monitor data more closely without losing the vast numbers of records that structured, rules-based approaches used to miss. Machine learning is beginning to outpace the traditional human gold standard.
But Can AI-Powered Social Listening Data Predict The Future?
This is the golden question. Does the use of machine learning offer better prediction results than the market itself? Companies need a lot of lead time to have a product that drops right at the peak of a trend. What we really want to know is if these predictions can be pushed early enough to allow companies to get to that point at the right time.
Back in 2014, Professor Wendy Moe at the University of Maryland showed a strong correlation, an astounding 0.8, between social media listening data cleaned up through these new accuracy tools and brand equity measurements derived from survey questions. And that was several years ago.
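To make a figure like 0.8 concrete, here is a minimal sketch of a Pearson correlation between two series. It assumes the reported number is a Pearson coefficient, which the article does not state, and the sample values are invented purely for illustration; they are not data from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Invented illustrative values: a monthly social sentiment score and a survey-based brand score.
social_sentiment = [0.20, 0.30, 0.50, 0.40, 0.60]
survey_brand_equity = [3.1, 3.0, 3.6, 3.5, 3.9]

print(round(pearson(social_sentiment, survey_brand_equity), 2))
```

A value near 1 means the two measures rise and fall together; it says nothing by itself about which one moves first.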
So is there a way to prove this more definitively? The answer is absolutely yes.
Using a four-stage approach that builds classifiers specific to motives, and drawing on four years of data around the “foodie” conversation, the team was able to predict two very big trends with high accuracy.
Along with sentiment, adding motives provides the type of refinement needed for true prediction. That makes the listening data actually useful for spotting trends well before they reach peak saturation. Getting a leading indicator out of this more attuned listening may help businesses speed up innovation.
Identifying the language around motives and applying sentiment to it lets you validate classifiers with F1 scores and improve them over time. Instead of relying on a limited set of expressions, expanding to related phrases opens up many signals you would have missed with simple rules-based approaches. Basically, it’s a core idea (the motive) plus all the peripheral data associated with that motive.
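As a rough sketch of how an F1 score validates a motive classifier, the example below compares a narrow phrase list with an expanded one on a handful of hand-labeled posts. Everything here is an assumption for illustration: the "health" motive, the phrase lists, the posts, and the simple substring matcher, which stands in for a trained classifier.

```python
def f1_score(gold, predicted):
    """F1 = harmonic mean of precision and recall for boolean labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def matches_motive(text, phrases):
    """Hypothetical matcher: the post expresses the motive if any phrase appears in it."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in phrases)

# Invented posts with human ("gold") labels for a hypothetical "health" motive.
labeled_posts = [
    ("Trying to eat healthy this year", True),
    ("Blueberries are packed with antioxidants", True),
    ("Switched to low sugar snacks for the kids", True),
    ("This burger is amazing", False),
    ("Healthy profits for the restaurant chain", False),
]
gold = [label for _, label in labeled_posts]

narrow_phrases = {"healthy"}
expanded_phrases = {"healthy", "antioxidants", "low sugar"}

narrow_f1 = f1_score(gold, [matches_motive(t, narrow_phrases) for t, _ in labeled_posts])
expanded_f1 = f1_score(gold, [matches_motive(t, expanded_phrases) for t, _ in labeled_posts])
print(round(narrow_f1, 3), round(expanded_f1, 3))
```

The expanded list recovers posts the narrow one misses, so recall and F1 rise, while the "healthy profits" post shows why precision still needs attention.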
So how do you combine the expert knowledge you need to develop proper training data sets with the work of turning classified posts into time series that drive success? Working internally to define what certain motivations mean to the organization allows businesses to build taxonomic classifiers specific to the organization’s understanding of the world.
There has to be a high level of precision because there’s so much noise in social media. Better classifiers can pick out true trends rather than vague or self-evident ones, so predictions can model coming trends within a framework that helps businesses get ahead.
One significant example is Chobani Yogurt. The company created its own coolers because it couldn’t get onto standard shelves, and by the time it got natural distribution, Greek yogurt was already underway as a trend. Dannon then took the initiative and produced quite a few versions of its own, but Chobani was already ahead of the market. Had Dannon had six months of data, it might have made up that distance a lot sooner.
Using the four-stage approach in their analysis, the group accurately predicted the trend of berries within the vegan market and the market at large. Social listening was just as predictive as the sales trend itself. In fact, the two series are correlated in both directions: social media listening predicts sales, and once the trend takes off, sales data predicts new social media talk, a feedback loop known as endogeneity.
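One simple way to check whether one series leads another is to correlate them at different time shifts. The sketch below is not the group's four-stage method, only an assumed illustration of the lead-lag idea: the function names are hypothetical and the two series are synthetic, with sales built as a copy of the social series delayed by two periods.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] for a non-negative lag."""
    if lag < 0 or lag >= len(leader) - 1:
        raise ValueError("lag must be non-negative and leave at least 2 points")
    return pearson(leader[: len(leader) - lag], follower[lag:])

# Synthetic series: social mentions per period, and sales that echo them two periods later.
social = [1, 2, 4, 7, 11, 12, 10, 8, 6, 5]
sales = [0, 0] + social[:-2]

correlations = {lag: lagged_correlation(social, sales, lag) for lag in range(4)}
best_lag = max(correlations, key=correlations.get)
print(best_lag)
```

Because of the feedback loop described above, a strong lagged correlation in real data would not by itself establish which series drives the other; running the same check with the roles swapped is a reasonable first sanity test.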
Where Does This Take Us?
The researchers believe that this predictability can give businesses a much better chance of jumping out ahead of a trend, although they were quick to note that this may not work as well with items brand-new to the world. Rather, mining existing sales data plus social conversations can help businesses find the next big thing within established categories.