A New Approach to Computation Reimagines Artificial Intelligence
Source: https://www.quantamagazine.org/a-new-approach-to-computation-reimagines-artificial-intelligence-20230413/#comments 2023-04-17 21:58:04

The paper built upon work done in the mid-1990s by Kanerva and Tony Plate, at the time a doctoral student with Geoff Hinton at the University of Toronto. The two independently developed the algebra for manipulating hypervectors and hinted at its usefulness for high-dimensional computing.

Given our hypervectors for shapes and colors, the system developed by Kanerva and Plate shows us how to manipulate them using certain mathematical operations. Those actions correspond to ways of symbolically manipulating concepts.

The first operation is multiplication. This is a way of combining ideas. For example, multiplying the vector SHAPE with the vector CIRCLE binds the two into a representation of the idea “SHAPE is CIRCLE.” This new “bound” vector is nearly orthogonal to both SHAPE and CIRCLE. And the individual components are recoverable — an important feature if you want to extract information from bound vectors. Given a bound vector that represents your Volkswagen, you can unbind and retrieve the vector for its color: PURPLE.
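Binding can be sketched in a few lines of Python. This is a minimal illustration, not the specific formulation used by Kanerva or Plate: it assumes bipolar hypervectors (entries of +1 or −1), element-wise multiplication as the binding operation, and a dimensionality of 10,000. The helper names random_hv, bind and similarity are hypothetical.

```python
import random

DIM = 10_000  # assumed dimensionality; chosen only for illustration

def random_hv(rng, dim=DIM):
    """Return a random bipolar hypervector with entries +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def bind(a, b):
    """Bind two hypervectors by element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def similarity(a, b):
    """Normalized dot product: 1.0 for identical vectors, near 0 for unrelated ones."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
SHAPE, CIRCLE = random_hv(rng), random_hv(rng)

shape_is_circle = bind(SHAPE, CIRCLE)

# The bound vector is nearly orthogonal to both of its inputs,
# so both similarities come out close to zero.
print(similarity(shape_is_circle, SHAPE))
print(similarity(shape_is_circle, CIRCLE))

# Every bipolar entry squares to 1, so binding with SHAPE a second time
# cancels it out and recovers CIRCLE exactly.
recovered = bind(shape_is_circle, SHAPE)
print(similarity(recovered, CIRCLE))  # 1.0
```

With this choice of binding, each vector is its own inverse, which is why unbinding is the same operation as binding. Other binding operations need a separate inverse step.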

The second operation, addition, creates a new vector that represents what’s called a superposition of concepts. For example, you can take two bound vectors, “SHAPE is CIRCLE” and “COLOR is RED,” and add them together to create a vector that represents a circular shape that is red in color. Again, the superposed vector can be decomposed into its constituents.
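A rough sketch of superposition and decomposition follows, under the same illustrative assumptions as before: bipolar hypervectors, element-wise multiplication for binding and element-wise addition for superposition. The function names and the extra SQUARE vector are hypothetical, introduced only to give the lookup something to reject.

```python
import math
import random

DIM = 10_000  # assumed dimensionality

def random_hv(rng, dim=DIM):
    """Return a random bipolar hypervector with entries +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def bind(a, b):
    """Bind two hypervectors by element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def bundle(*vectors):
    """Superpose hypervectors by element-wise addition."""
    return [sum(column) for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

rng = random.Random(1)
SHAPE, COLOR, CIRCLE, RED, SQUARE = (random_hv(rng) for _ in range(5))

# "SHAPE is CIRCLE" plus "COLOR is RED": a red circle.
red_circle = bundle(bind(SHAPE, CIRCLE), bind(COLOR, RED))

# Ask "what is the COLOR?" by unbinding with COLOR. The result is RED
# plus a noise term that is nearly orthogonal to every known vector.
noisy = bind(red_circle, COLOR)

# Clean up the noisy answer by picking the closest known vector.
candidates = {"CIRCLE": CIRCLE, "RED": RED, "SQUARE": SQUARE}
best = max(candidates, key=lambda name: cosine(noisy, candidates[name]))
print(best)  # RED
```

The final lookup against a list of known vectors is what makes decomposition work in this sketch: the unbound result is only approximately RED, and the comparison step resolves it to the exact stored vector.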

The third operation is permutation; it involves rearranging the individual elements of the vectors. For example, if you have a three-dimensional vector with values labeled x, y and z, permutation might move the value of x to y, y to z, and z to x. “Permutation allows you to build structure,” Kanerva said. “It allows you to deal with sequences, things that happen one after another.” Consider two events, represented by the hypervectors A and B. We can superpose them into one vector, but that would destroy information about the order of events. Combining addition with permutation preserves the order; the events can be retrieved in order by reversing the operations.
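The role of permutation in preserving order can be illustrated with a small sketch. It assumes that the permutation is a cyclic rotation by one position (matching the x to y, y to z, z to x example) and that hypervectors are bipolar; the helper names are hypothetical.

```python
import random

DIM = 10_000  # assumed dimensionality

def random_hv(rng, dim=DIM):
    """Return a random bipolar hypervector with entries +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def permute(v, shift=1):
    """Rotate elements to the right: the value at index i moves to index i + shift."""
    return v[-shift:] + v[:-shift]

def unpermute(v, shift=1):
    """Undo permute() by rotating elements back to the left."""
    return v[shift:] + v[:shift]

def bundle(*vectors):
    """Superpose hypervectors by element-wise addition."""
    return [sum(column) for column in zip(*vectors)]

def similarity(a, b):
    """Dot product divided by the dimensionality."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(2)
A, B = random_hv(rng), random_hv(rng)

# Plain addition is symmetric, so it cannot record which event came first.
print(bundle(A, B) == bundle(B, A))  # True

# Encode "A, then B" by rotating the earlier event before adding.
seq_ab = bundle(permute(A), B)
seq_ba = bundle(permute(B), A)
print(seq_ab == seq_ba)  # False: the two orders now differ

# Reversing the rotation exposes the first event of the sequence.
first = unpermute(seq_ab)
print(similarity(first, A))  # close to 1
print(similarity(first, B))  # close to 0
```

The sketch works because a rotated random hypervector is nearly orthogonal to the original, so the rotation acts as a position tag that does not interfere with the other event in the sum.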

Together, these three operations proved enough to create a formal algebra of hypervectors that allowed for symbolic reasoning. But many researchers were slow to grasp the potential of hyperdimensional computing, including Olshausen. “It just didn’t sink in,” he said.

Harnessing the Power

In 2015, a student of Olshausen’s named Eric Weiss demonstrated one aspect of hyperdimensional computing’s unique abilities. Weiss figured out how to represent a complex image as a single hyperdimensional vector that contains information about all the objects in the image, including their properties, such as colors, positions and sizes.

“I practically fell out of my chair,” Olshausen said. “All of a sudden the lightbulb went on.”

Soon more teams began developing hyperdimensional algorithms to replicate simple tasks that deep neural networks had begun tackling about two decades before, such as classifying images.

Consider an annotated data set that consists of images of handwritten digits. An algorithm analyzes the features of each image using some predetermined scheme. It then creates a hypervector for each image. Next, the algorithm adds the hypervectors for all images of zero to create a hypervector for the idea of zero. It then does the same for all digits, creating 10 “class” hypervectors, one for each digit.

Now the algorithm is given an unlabeled image. It creates a hypervector for this new image, then compares the hypervector against the stored class hypervectors. This comparison determines the digit that the new image is most similar to.
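The training and classification steps described above can be sketched on toy data. Everything specific here is an assumption made for illustration: the 4-by-4 binary patterns standing in for handwritten digits, the encoding scheme (bind a random hypervector for each pixel position to one for its value, then add), the noise model and the use of cosine similarity for the comparison.

```python
import math
import random

DIM = 2_000       # assumed dimensionality, kept small so the toy runs quickly
N_PIXELS = 16     # toy 4-by-4 binary "images"
rng = random.Random(3)

def random_hv():
    """Return a random bipolar hypervector with entries +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

# Hypothetical encoding scheme: one hypervector per pixel position and per pixel value.
POSITION = [random_hv() for _ in range(N_PIXELS)]
VALUE = {0: random_hv(), 1: random_hv()}

def encode(image):
    """Bind each pixel's position to its value, then superpose all pixels."""
    hv = [0] * DIM
    for pos, pixel in zip(POSITION, image):
        val = VALUE[pixel]
        for i in range(DIM):
            hv[i] += pos[i] * val[i]
    return hv

def add_noise(image, flips=1):
    """Return a copy of the image with a few randomly chosen pixels inverted."""
    noisy = list(image)
    for i in rng.sample(range(N_PIXELS), flips):
        noisy[i] ^= 1
    return noisy

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy stand-ins for two digit classes.
PROTOTYPES = {
    "zero": [1, 1, 1, 1,  1, 0, 0, 1,  1, 0, 0, 1,  1, 1, 1, 1],
    "one":  [0, 0, 1, 0,  0, 1, 1, 0,  0, 0, 1, 0,  0, 1, 1, 1],
}

# Training: add up the hypervectors of all labeled examples of each class.
class_hvs = {}
for label, proto in PROTOTYPES.items():
    total = [0] * DIM
    for _ in range(20):
        sample = encode(add_noise(proto))
        total = [t + s for t, s in zip(total, sample)]
    class_hvs[label] = total

def classify(image):
    """Encode an unlabeled image and return the most similar class."""
    query = encode(image)
    return max(class_hvs, key=lambda label: cosine(query, class_hvs[label]))

print(classify(add_noise(PROTOTYPES["zero"])))  # zero
print(classify(add_noise(PROTOTYPES["one"])))   # one
```

Training in this sketch is a single pass of additions with no iterative weight updates, which is the property the description above emphasizes. A real system would replace the toy pixel encoding with a scheme suited to the data.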

Yet this is just the beginning. The strengths of hyperdimensional computing lie in the ability to compose and decompose hypervectors for reasoning. The latest demonstration of this came in March, when Abbas Rahimi and colleagues at IBM Research in Zurich used hyperdimensional computing with neural networks to solve a classic problem in abstract visual reasoning — a significant challenge for typical ANNs, and even some humans. Known as Raven’s progressive matrices, the problem presents images of geometric objects in, say, a 3-by-3 grid. One position in the grid is blank. The subject must choose, from a set of candidate images, the image that best fits the blank.

“We said, ‘This is really … the killer example for visual abstract reasoning, let’s jump in,’” Rahimi said.

To solve the problem using hyperdimensional computing, the team first created a dictionary of hypervectors to represent the objects in each image; each hypervector in the dictionary represents an object and some combination of its attributes. The team then trained a neural network to examine an image and generate a bipolar hypervector — an element can be +1 or −1 — that’s as close as possible to some superposition of hypervectors in the dictionary; the generated hypervector thus contains information about all the objects and their attributes in the image. “You guide the neural network to a meaningful conceptual space,” Rahimi said.

Once the network has generated hypervectors for each of the context images and for each candidate for the blank slot, another algorithm analyzes the hypervectors to create probability distributions for the number of objects in each image, their size, and other characteristics. These probability distributions, which speak to the likely characteristics of both the context and candidate images, can be transformed into hypervectors, allowing the use of algebra to predict the most likely candidate image to fill the vacant slot.

Their approach was nearly 88% accurate on one set of problems, whereas neural network–only solutions were less than 61% accurate. The team also showed that, for 3-by-3 grids, their system was almost 250 times faster than a traditional method that uses rules of symbolic logic to reason, since that method must search through an enormous rulebook to determine the correct next step.

A Promising Start

Not only does hyperdimensional computing give us the power to solve problems symbolically, it also addresses some niggling issues of traditional computing. The performance of today’s computers degrades rapidly if errors caused by, say, a random bit flip (a 0 becomes 1 or vice versa) cannot be corrected by built-in error-correcting mechanisms. Moreover, these error-correcting mechanisms can impose a penalty on performance of up to 25%, said Xun Jiao, a computer scientist at Villanova University.

Hyperdimensional computing tolerates errors better, because even if a hypervector suffers significant numbers of random bit flips, it is still close to the original vector. This implies that any reasoning using these vectors is not meaningfully impacted in the face of errors. Jiao’s team has shown that these systems are at least 10 times more tolerant of hardware faults than traditional ANNs, which themselves are orders of magnitude more resilient than traditional computing architectures. “We can leverage all [that] resilience to design some efficient hardware,” Jiao said.
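The geometric argument behind this tolerance can be checked with a toy experiment. This sketch illustrates only the general principle and is not a reproduction of the measurements by Jiao's team: it assumes bipolar hypervectors of dimension 10,000, models a bit flip as a sign change, and corrupts 30% of the elements of one stored vector.

```python
import random

DIM = 10_000  # assumed dimensionality
rng = random.Random(4)

def random_hv():
    """Return a random bipolar hypervector with entries +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def flip(v, n_flips):
    """Return a copy of v with n_flips randomly chosen elements sign-flipped."""
    corrupted = list(v)
    for i in rng.sample(range(len(v)), n_flips):
        corrupted[i] = -corrupted[i]
    return corrupted

def similarity(a, b):
    """Dot product divided by the dimensionality."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

stored = {name: random_hv() for name in ("A", "B", "C")}

# Corrupt 30% of the elements of A.
damaged = flip(stored["A"], 3_000)

# 7,000 elements still agree and 3,000 disagree, so the similarity to A
# is (7000 - 3000) / 10000 = 0.4, while unrelated vectors stay near 0.
for name, hv in stored.items():
    print(name, similarity(damaged, hv))

best = max(stored, key=lambda name: similarity(damaged, stored[name]))
print(best)  # A
```

Even with nearly a third of its elements corrupted, the damaged vector remains far closer to its original than to any unrelated vector, so a nearest-match lookup still returns the right answer.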

Another advantage of hyperdimensional computing is transparency: The algebra clearly tells you why the system chose the answer it did. The same is not true for traditional neural networks. Olshausen, Rahimi and others are developing hybrid systems in which neural networks map things in the physical world to hypervectors, and then hyperdimensional algebra takes over. “Things like analogical reasoning just fall in your lap,” Olshausen said. “This is what we should expect of any AI system. We should be able to understand it just like we understand an airplane or a television set.”

All of these benefits over traditional computing suggest that hyperdimensional computing is well suited for a new generation of extremely sturdy, low-power hardware. It’s also compatible with “in-memory computing systems,” which perform the computing on the same hardware that stores data (unlike existing von Neumann computers that inefficiently shuttle data between memory and the central processing unit). Some of these new devices can be analog, operating at very low voltages, making them energy-efficient but also prone to random noise. For von Neumann computing, this randomness is “the wall that you can’t go beyond,” Olshausen said. But with hyperdimensional computing, “you can just punch through it.”

Despite such advantages, hyperdimensional computing is still in its infancy. “There’s real potential here,” Fermüller said. But she points out that it still needs to be tested against real-world problems and at bigger scales, closer to the size of modern neural networks.

“For problems at scale, this needs very efficient hardware,” Rahimi said. “For example, how [do you] efficiently search over 1 billion items?”

All of this should come with time, Kanerva said. “There are other secrets [that] high-dimensional spaces hold,” he said. “I see this as the very beginning of time for computing with vectors.”
