AI and the Visual Revolution

We’ve focused a lot on what AI can do for language processing, but one area of AI that’s gaining speed is visual processing. So much of our data is visual that using AI models to parse and work with visual media is a crucial part of decoding our big data.

Businesses, research organizations, and a host of other institutions will need the capability to handle visual media in data science and research. We’ll also be able to use it to make workplace training safer. We may even be able to enhance fraud detection and build a system of checks and balances between surveillance and personal freedom. Let’s take a look at some ways that AI will rock visuals in the next few years.

[Related article: Six Big Companies That Use Visual Search]

Image Indexing

AI models are data-hungry, and a lot of that data is visual. Right now, labeling images and videos requires substantial human input upfront, which makes it harder to find open-source training sets for things like facial recognition.

AI could lighten the load for image indexing. Software working on creating visuals from text descriptions and vice versa could build better training sets and ensure that there is plenty of input for data-hungry deep learning models.

Another benefit of AI-powered indexing shows up in our deployment of things like data lakes. A lake is only as good as our ability to search it, and preventing it from turning into a swamp requires an enormous commitment from your human talent. Instead of sucking the life out of your data engineers, using AI to index visual media into a logical, searchable repository could make maintaining the data lake far more realistic.

Both of these capabilities could run in real time, reducing the backlog of unlabeled data. AI is likely the only option capable of indexing the amount of data we produce daily, so thank the tech gods, i.e., your friendly developers, that this capability is on the horizon.
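To make the data lake idea a little more concrete, here is a minimal sketch of what a searchable index over labeled images might look like. Everything in it is an assumption for illustration: `label_image` is a hypothetical stand-in for a real vision model, the file paths are made up, and a production system would use a proper search backend rather than an in-memory dictionary.

```python
from collections import defaultdict


def label_image(path):
    """Hypothetical stand-in for a vision model that labels one image.

    A real system would run a trained classifier or captioning model here;
    these hard-coded predictions exist only to keep the sketch runnable.
    """
    fake_predictions = {
        "lake/img_001.jpg": ["dog", "beach"],
        "lake/img_002.jpg": ["car", "street"],
        "lake/img_003.jpg": ["dog", "street"],
    }
    return fake_predictions.get(path, [])


def build_index(paths):
    """Build an inverted index mapping each label to the images that carry it."""
    index = defaultdict(set)
    for path in paths:
        for label in label_image(path):
            index[label].add(path)
    return index


def search(index, *labels):
    """Return the sorted paths tagged with every requested label."""
    if not labels:
        return []
    matches = set.intersection(*(index.get(label, set()) for label in labels))
    return sorted(matches)


paths = ["lake/img_001.jpg", "lake/img_002.jpg", "lake/img_003.jpg"]
index = build_index(paths)
print(search(index, "dog"))            # images labeled "dog"
print(search(index, "dog", "street"))  # images labeled both "dog" and "street"
```

The point of the sketch is the shape of the workflow: once a model supplies the labels, finding every image that matches a query becomes a cheap lookup instead of a manual hunt.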

Augmented and Virtual Reality

Training is dangerous, but inexperience is also dangerous. How do you get around this catch-22 in fields where the job and the training are both, well, dangerous? Augmented and virtual reality offer the kind of practice workers need to gain experience that protects them from hazards without exposing them to those hazards prematurely.

Virtual reality has come a long way since its cumbersome early days. Augmented reality may even boost that experience by providing real-time feedback in the actual environment while simulating the hazards that could be catastrophic in real life.

Other types of training, such as surgery or other medical procedures, could also benefit from augmented reality. Blending physical stimuli with computer-generated stimuli may create a far more realistic training option that doesn’t put anyone in danger.

Virtual reality could also offer the chance to run simulations through several different scenarios before launching the real initiative. In military operations especially, another use of augmented reality is building thermal sensing, night vision, and other environmental data into equipment that provides better, more integrated feedback. No more Step Brothers night vision goggles.


Visual Analytics

Now that we have doorbells that can recognize someone’s identity and greet them, we’re on the cusp of being able to read things like sentiment in real time. When customers come through the doors of a brick-and-mortar business, we may be able to measure their sentiment as they enter and again as they leave, giving businesses insight that was never possible before.
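As a rough illustration of the entrance-versus-exit comparison, the sketch below averages the change in sentiment across a handful of visits. The scoring scale (from -1.0 for very negative to 1.0 for very positive), the `sentiment_shift` helper, and the sample numbers are all assumptions made up for this example; a real deployment would get its scores from a vision model.

```python
from statistics import mean


def sentiment_shift(visits):
    """Average change in sentiment between entrance and exit.

    Each visit is an (entry_score, exit_score) pair on an assumed scale of
    -1.0 (very negative) to 1.0 (very positive).
    """
    if not visits:
        return 0.0
    return mean(exit_score - entry_score for entry_score, exit_score in visits)


# Made-up scores for three customers: (on entering, on leaving).
visits = [(0.0, 0.5), (-0.5, 0.0), (0.5, 0.25)]
shift = sentiment_shift(visits)
print(f"Average sentiment shift: {shift:+.2f}")  # positive means visits improved mood
```

A positive number suggests customers tend to leave happier than they arrived, while a negative one could flag a problem worth investigating.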

Other applications could include managing traffic flows with better analytics of both cars and pedestrians, assessing the efficacy and safety of our end-to-end production lines, or managing security and fraud in individualized contexts. Errors like the high-profile case of a machine designed to scour the web for human trafficking incorrectly identifying sand dunes as naked bodies could become a thing of the past.

Analytics allows us to unlock vast amounts of unstructured data, not only for indexing purposes but also for automating tasks humans can’t or don’t want to do. No one wants to watch hours of footage looking for snuff films (YouTube) or racist imagery (Facebook) or child pornography (law enforcement). AI never gets tired and doesn’t feel the psychological effects of long-term exposure to such stimuli. It’s a way to relieve humans of some of that burden while maintaining vigilance.
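One way to picture relieving humans of that burden while keeping them in the loop is a simple triage step, sketched below. It assumes a hypothetical model that assigns each clip a score between 0.0 and 1.0 for how likely it is to be harmful; the `triage` function, the clip IDs, and the two thresholds are illustrative choices, not settings from any real platform.

```python
def triage(clips, block_at=0.9, review_at=0.5):
    """Sort clips into queues based on a hypothetical model's harm score.

    Scores at or above block_at are blocked automatically, scores at or
    above review_at go to a human reviewer, and everything else is cleared.
    """
    queues = {"blocked": [], "human_review": [], "cleared": []}
    for clip_id, score in clips:
        if score >= block_at:
            queues["blocked"].append(clip_id)
        elif score >= review_at:
            queues["human_review"].append(clip_id)
        else:
            queues["cleared"].append(clip_id)
    return queues


# Made-up clip IDs and scores for illustration only.
clips = [("clip_a", 0.97), ("clip_b", 0.62), ("clip_c", 0.08), ("clip_d", 0.51)]
queues = triage(clips)
print(queues)
```

In this setup, people only look at the uncertain middle band, which is where human judgment matters most and where a mistake like the sand dune mix-up is most likely to be caught.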

[Related article: Integrating Textual and Visual Information Into a Powerful Visual Search Engine]

Next-Gen AI is Visual

Media is an untapped reservoir of unstructured information, but we’re chipping away at it. As our machines gain the capability to really “see” what’s happening, we’ll unlock better applications for indexing and analytics, allowing us to train our machines more efficiently and gain more valuable insights.

Augmented reality will also help ensure that the work humans still need to do is safer and that our training models are better. Looking toward the future, we’re going to see leaps and bounds not only in language but also toward human-level functioning in the realm of visual media. Thanks to advances in virtual “eyes,” we’ll be reaping the benefits and building better models for sure.

Elizabeth Wallace, ODSC

Elizabeth is a Nashville-based freelance writer with a soft spot for startups. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do. Connect with her on LinkedIn here: https://www.linkedin.com/in/elizabethawallace/