The History of Neural Networks and AI: Part III

This article is the third and final installment in a three-part series about the history of neural networks and artificial intelligence. The first article dives into the earliest developments of artificial intelligence, and the second gives a better picture of how neural networks and artificial intelligence technologies scaled through the 1960s and 1970s.

The history of neural networks and AI can reveal quite a lot about today’s technological environment. Although the earliest developments in neural networks and artificial intelligence occurred over 60 years ago, their long-term influence on today’s advances in the field should not be discounted. In this final section on the history of neural networks and AI, I’ll go through some of the most significant developments and technologies to emerge since the late 1980s and early 1990s.

IBM’s Deep Blue: A Test of Man versus Machine in Chess

Garry Kasparov in competition, Hamburg, DE – 1985 Source: By GFHund [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY 3.0 (https://creativecommons.org/licenses/by/3.0)], from Wikimedia Commons

Part II of this series discussed MENACE, one of the first intelligent models to play a human game. The next big breakthrough in game-playing machines came in 1997 in the form of IBM’s Deep Blue. Deep Blue descended from ChipTest, a chess-playing machine that two Carnegie Mellon graduate students began as a dissertation project in 1985. Both students went on to work at IBM Research, where they continued to enhance the machine’s capabilities, and IBM eventually named the effort “Deep Blue.” It must be noted that Deep Blue was designed specifically to play chess: its search algorithm, running on purpose-built hardware, could examine on the order of 200 million positions per second, weigh the candidate moves, and then play the one it judged best.
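The idea of searching ahead through possible moves and weighing the options can be illustrated with a minimal sketch of minimax search with alpha-beta pruning, the general family of technique that chess engines of this era relied on. This is not Deep Blue’s actual implementation: the toy game tree, the scores at its leaves, and the function names `alphabeta` and `best_move` are all assumptions made for illustration.

```python
# Toy minimax search with alpha-beta pruning (illustrative only).
# A game tree is a nested list; an int leaf is the position's score
# from the point of view of the player choosing at the root.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Return the score the root player can guarantee from this node."""
    if isinstance(node, int):          # leaf: an already-evaluated position
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:          # opponent would never allow this line
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:              # we already have a better option
            break
    return best


def best_move(tree):
    """Return the index of the root move with the highest guaranteed score."""
    scores = [alphabeta(child, False) for child in tree]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical position: three candidate moves, each with two replies.
TOY_TREE = [[3, 5], [2, 9], [0, 7]]

if __name__ == "__main__":
    print("guaranteed score:", alphabeta(TOY_TREE, True))
    print("chosen move index:", best_move(TOY_TREE))
```

In this toy tree the opponent is assumed to pick the reply that is worst for us, so the first move (worst case 3) is preferred over the second (worst case 2) and third (worst case 0). A real chess engine applies the same idea to a vastly larger tree, with a hand-tuned evaluation function at the leaves.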

Deep Blue won its first game against a reigning world chess champion, Garry Kasparov, on February 10th, 1996, in the opening game of a six-game match. Kasparov went on to win that match 4-2. In response to the 1996 result, IBM made dramatic improvements to Deep Blue and arranged a rematch against Kasparov in May 1997. Deep Blue won game six of the rematch, beating Kasparov overall with a score of 3½ to 2½. Machines had been capable of beating humans at chess before, but Deep Blue was the first system to beat a reigning world chess champion in a match played under standard competition conditions.

In the aftermath of the May 1997 match, Kasparov accused IBM of cheating, arguing that Deep Blue had made certain moves that were too creative for a machine relying on sheer brute force. Kasparov demanded a rematch, but IBM declined, and it did not publish the machine’s log files until much later. After the 1997 match, IBM retired Deep Blue.

MNIST: A Benchmark for Classification Algorithms

MNIST Handwritten Numbers used for image processing advancements Source: By Josef Steppan – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=64810040

Soon after IBM’s achievements with Deep Blue, in 1999, one of the most famous datasets in machine learning was published: the Modified National Institute of Standards and Technology dataset, or MNIST for short. The MNIST dataset is a collection of 70,000 images of handwritten digits, collected from high school students and American Census Bureau employees.

This dataset is not only one of the most commonly used for training image processing systems but also a famous educational sample within the machine learning community. Many online machine learning tutorials use MNIST to demonstrate what machine learning can do. It has also been the basis of many online competitions in which researchers try to achieve the best possible digit-recognition accuracy with the image processing systems they test.

Several groups have published machine learning papers on ways to achieve the lowest possible error rate on the MNIST dataset, which is split into 60,000 training samples and 10,000 testing samples. In 2013, Li Wan, Matthew Zeiler, Sixin Zhang, Yann LeCun, and Rob Fergus reported the lowest error rate recorded up to that point, using a method they called DropConnect. The method, which builds on G. E. Hinton’s work on Dropout, achieved an error rate of 0.21% on the 10,000 test samples.
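To put that figure in concrete terms, the short sketch below converts an error rate into a count of misclassified test images. The helper name `misclassified_count` is an assumption made for illustration; the 0.21% rate and the 10,000-sample test set come from the text above.

```python
# Convert a reported test error rate into a number of misclassified images.

def misclassified_count(error_rate_percent, n_test_samples):
    """Number of test samples classified incorrectly at the given error rate."""
    return round(error_rate_percent / 100 * n_test_samples)


if __name__ == "__main__":
    wrong = misclassified_count(0.21, 10_000)
    print(f"{wrong} of 10,000 test digits misclassified")
```

At this error rate, only about two test digits in every thousand are misread.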

MNIST continues to serve as a foundational dataset for training image processing systems, and newer variants continue to extend it. In 2017, an extended version of MNIST called EMNIST was released, containing 240,000 training digit images and 40,000 testing digit images.

Further Improvements to Computer Vision: Facebook’s DeepFace

Following in MNIST’s footsteps came Facebook’s efforts in image processing. Facebook’s face verification system, DeepFace, significantly advanced the capabilities of computer vision. DeepFace is 97.35% accurate at human face recognition, which is considered an incredibly high rate of accuracy in the field. That accuracy is partly due to Facebook’s massive access to images of people’s faces uploaded to the social networking platform. The Facebook team trained the nine-layer deep neural network on a dataset of 4 million facial images belonging to 4,000 individuals. The model contained over 120 million weights and used several locally connected layers rather than standard convolutional layers.

Before DeepFace, the FBI’s Next Generation Identification system was considered the most accurate in the field, at an 85% accuracy rate. DeepFace’s original algorithm was actually acquired through Facebook’s acquisition of Face.com in 2012. Given the high accuracy attained by the DeepFace model, some now consider the problem of face recognition essentially solved.

Today’s Notable Advancements in Machine Learning and AI

Starting in 2004, there has been a boom in products that implement machine learning and artificial intelligence in ways that are now familiar (e.g., voice assistants and image recognition).

A few key examples to note:

  • Google’s self-driving car project (now Waymo) launched in 2009, using neural network models to interpret 3D imaging of the car’s surroundings and driving conditions.
  • Apple’s voice-controlled software, Siri, was introduced in 2011. Siri changed how we interact with our electronics, phones and computers alike, because it offers technological solutions to the actual problems of its users. Unlike earlier voice-controlled software, Siri entered the marketplace with a deep understanding of its users’ device needs and deployed AI and speech solutions that helped users meet those needs. Siri was not technology deployed for technology’s sake; it was deployed for the user’s sake.
  • Google DeepMind made headlines in 2016 when its AlphaGo program became the first to beat a professional Go player. Built on reinforcement learning techniques, DeepMind’s various programs seek to “solve intelligence” and implement algorithms that foster understanding via experience.

Machine learning is still growing rapidly, as the history of neural networks and AI shows, but some think this growth may slow down. One of the leading voices in this school of thought is Gary Marcus, whose paper ‘Deep Learning: A Critical Appraisal’ outlines the walls that machine learning may hit in the future. Despite this possibility, Marcus joins the rest of the data science community in its excitement to see what’s next in the fields of machine learning and artificial intelligence.

Since their beginnings in the 1940s, neural networks and artificial intelligence have continued to grow in their influence on various aspects of society, from self-driving cars to “smart” voice assistants. Life as we know it today would not be the same without these dynamic technologies. History is made in the AI field on a daily basis, thanks to the research projects and developments flourishing throughout the world. When we review the history of AI again in the near future, the pace of development will only have accelerated further. Where to? That question has yet to be answered. The possibilities are endless.


Caspar Wylie, ODSC

My name is Caspar Wylie, and I have been passionate about computer programming for as long as I can remember. I am currently 17 and have taught myself to write code, with initial help from an employee at Google in Mountain View, California, who truly motivated me. I program every day and am always putting new ideas into perspective. I try to keep a good balance between jobs and personal projects in order to advance my research and understanding. My interest in computers started with very basic electronic engineering when I was only 6, before I moved on to software development at the age of about 8. Since then, I have experimented with many different areas of computing, from web security to computer vision.