As privacy concerns escalate in the age of big data, developers are constantly evolving artificial intelligence and machine learning techniques to protect individuals’ identities.
Machine learning systems enable businesses to more effectively identify fraud and keep user information safe. These systems gather data that can yield far more powerful insights than manual review could. Security Magazine, for instance, reported machine learning findings that iPhone 5s users were seven times more likely to commit fraud than iPhone 8 users.
At the same time, researchers continually refine machine learning methods to better protect the privacy of individual training examples. Here’s an overview of three prevalent machine learning practices that keep your identity safe.
User Behavior Analytics
Logging onto a Gmail account from a new device prompts a variety of security measures. Google will email an alert about the login or text a special code to the account’s associated phone number.
This is part of its user behavior analytics system — an AI system that learns how an individual or group normally accesses its services and uses that information to confirm user identity in later authentications. Behavior analytics is a common security strategy.
Machine learning-empowered security systems collect and interpret information including location, time, and keystroke dynamics associated with the authentication. When any of these stray from baseline behavior, the security system launches into action.
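As a toy illustration of that baseline idea (not any vendor’s actual system), the hypothetical sketch below flags a login whose hour of day or keystroke timing deviates sharply from a user’s history. Real systems model many more signals, such as location and device fingerprints, with far richer statistics.

```python
from dataclasses import dataclass
from statistics import mean, stdev

# Hypothetical login record: hour of day and average keystroke interval (ms).
@dataclass
class Login:
    hour: int
    keystroke_ms: float

def is_anomalous(history: list[Login], candidate: Login,
                 z_threshold: float = 3.0) -> bool:
    """Flag a login whose features stray far from the user's baseline."""
    def z_score(x: float, xs: list[float]) -> float:
        s = stdev(xs)
        return abs(x - mean(xs)) / s if s > 0 else 0.0

    hours = [login.hour for login in history]
    keystrokes = [login.keystroke_ms for login in history]
    return (z_score(candidate.hour, hours) > z_threshold
            or z_score(candidate.keystroke_ms, keystrokes) > z_threshold)

history = [Login(9, 120), Login(10, 115), Login(9, 125),
           Login(11, 118), Login(10, 122)]
print(is_anomalous(history, Login(10, 119)))  # typical login -> False
print(is_anomalous(history, Login(3, 480)))   # 3 a.m., slow typing -> True
```

When the function returns `True`, a production system would launch a secondary check, such as the emailed alert or texted code described above.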
This strategy is employed widely, including at Amazon Web Services. Its security service, Macie, employs a user behavior analytics engine to automatically discover, classify, and protect sensitive data. The engine detects sudden increases in high-risk or unusual API activity and activity at infrequent hours, among other things.
RSA Security also utilizes user behavior analytics, as a blog post by the company’s former product management consultant highlights. The company also prioritizes “forgetting” old or obsolete behaviors in the algorithms, like when a user permanently relocates. This is an important optimization step in any user behavior analysis.
Biometric Authentication
Biometric authentication technology is also advancing to ensure users are the only ones who can access their devices and accounts. It’s difficult and costly for hackers to penetrate complex biometric walls, as software engineer and technology blogger Ben Dickson wrote for The Next Web.
Apple’s new iPhone X, for instance, contains AI-informed facial recognition software for authentication. Rather than comparing the user’s face to still images, it builds a facial model with its infrared camera and built-in neural network processor.
Unlike previous facial scan authentication software, the new technology works in varied lighting conditions and can detect whether the user is awake and aware. It even uses advanced machine learning for adaptive recognition, so it still recognizes a user when they wear a hat, put on glasses, or change their hair.
“Some of our most sophisticated technologies — the TrueDepth camera system, the Secure Enclave, and the Neural Engine — make it the most secure facial authentication ever in a smartphone,” Apple’s product page reads.
Differential Privacy Framework
In another vein, researchers are working to build and improve machine learning frameworks that protect the identities of individuals whose data is used to train models. Protective frameworks don’t just serve users’ privacy needs; they also prevent the overfitting that hurts a model’s ability to generalize. Thus, increased privacy in machine learning can lead to more useful models.
The differential privacy framework, for example, measures how much privacy an algorithm guarantees, as two machine learning experts at Google Brain described. An algorithm achieves differential privacy if its output can’t be used to tell whether any particular example was in the training data. In other words, the algorithm would have reached essentially the same conclusion with any single data point removed.
Anyone can utilize the differential privacy framework through the Private Aggregation of Teacher Ensembles (PATE) algorithm family, the experts said. The PATE framework operates similarly to any other supervised machine learning model, but the resulting model includes privacy guarantees.
To develop a PATE learning model, researchers must separate the training data into chunks, train a model on each chunk individually, and aggregate the models’ predictions. Before settling on a single prediction, they must add noise drawn from a Laplace or Gaussian distribution. The post by the Google Brain scholars goes into more detail on this process, but what’s important is that the approach surpasses other privacy techniques, such as k-anonymity, and overcomes their constraints.
Machine learning maintains a strong barrier between private information and hackers trying to access it. Frameworks of the future will better protect the data used in training, as well. This increased privacy will only serve to improve the field of machine learning and data science as a whole.