An artificial intelligence program created by DeepMind has discovered a new method of multiplying matrices, one that could boost computation speeds by upwards of twenty percent. Twenty percent might not sound like a big leap to some, but in the world of computing efficiency, even a small improvement scales, bringing significant performance gains and even energy savings, since less power is spent on each computation.
The new method was discovered by DeepMind’s AI, AlphaTensor, which was presented with an interesting problem involving the Strassen algorithm. Created by Volker Strassen more than half a century ago, this algorithm multiplies two 2×2 matrices with only seven multiplications instead of the usual eight, and it has remained the most efficient approach for many matrix sizes. Improvements to the method have been found over the years, but they have not been easily adapted into computer code. AlphaTensor’s new method, by contrast, can run on current hardware.
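To make the trick concrete, here is a minimal sketch of Strassen's 2×2 scheme in Python. The function name `strassen_2x2` is my own; the seven product terms and the recombination formulas are the standard published ones.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications instead of 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    # The seven products of Strassen's scheme
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Recombine the products into the result matrix
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The saving looks small for a single 2×2 product, but because the scheme can be applied recursively to blocks of larger matrices, the one multiplication saved compounds at every level of the recursion.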
DeepMind introduced this problem to its AI program without providing any solutions. It was tasked with creating a working algorithm that could complete the same task as Strassen’s algorithm in the fewest possible steps. The program found an algorithm for multiplying two 4×4 matrices using only 47 multiplications, outperforming the 49 that Strassen’s method requires. But it didn’t stop there: the program went on to develop new techniques for multiplying matrices of other sizes.
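The multiplication counts in this comparison can be checked with a little arithmetic. A sketch, using helper names of my own (`naive_mults`, `strassen_mults`): the schoolbook method needs n³ scalar multiplications, while applying Strassen's 2×2 scheme recursively replaces 8 block products with 7 at each level of splitting.

```python
def naive_mults(n):
    """Scalar multiplications for the schoolbook n x n matrix product:
    n*n output entries, each a dot product of length n."""
    return n ** 3

def strassen_mults(n):
    """Scalar multiplications when Strassen's 2x2 scheme is applied
    recursively to an n x n product (n a power of two):
    7 block products per level instead of 8."""
    if n == 1:
        return 1
    return 7 * strassen_mults(n // 2)

print(naive_mults(4))     # 64
print(strassen_mults(4))  # 49, i.e. 7 * 7
# AlphaTensor's discovered 4x4 scheme, per the article, needs only 47.
```

Shaving 49 down to 47 may look marginal, but like Strassen's original saving, it compounds when the 4×4 scheme is used as a building block inside larger matrix products.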
DeepMind’s Hussein Fawzi believes that AlphaTensor’s math is sound, but admits that for now he cannot explain how the system arrived at its algorithm: “We don’t really know why the system came up with this, essentially… Why is it the best way of multiplying matrices? It’s unclear… Somehow, the neural networks get an intuition of what looks good and what looks bad. I honestly can’t tell you exactly how that works. I think there is some theoretical work to be done there on how exactly deep learning manages to do these kinds of things.”
But it’s clear that there is a lot of potential in this discovery, especially for the hardware and software behind the cutting-edge questions researchers put to supercomputers. James Knight of the University of Sussex made the case: “If this type of approach was actually implemented there, then it could be a sort of universal speed-up… If Nvidia implemented this in their CUDA library [a tool that allows GPUs to work together], it would knock some percentage off most deep-learning workloads, I’d say.”
It seems that AlphaTensor is unlocking new possibilities for supercomputers and their programs, not only making them more powerful but helping researchers answer in less time questions that have taken decades. Tools such as the Event Horizon Telescope ship data on hard drives by the ton, as happened when it was used to create the first image of a black hole, so combing through that data and cleaning it up requires a considerable amount of computing power. An extra twenty percent could shave off years, and it might even open the door to fundamental questions researchers have yet to stumble upon.