Artificial intelligence is the foundation of self-driving cars, drones, robotics, and many other frontiers in the 21st century. Hardware-based acceleration is essential for these and other AI-powered solutions to do their jobs effectively.
Specialized hardware platforms are the future of AI, machine learning (ML), and deep learning at every tier and for every task in the cloud-to-edge world in which we live.
Without AI-optimized chipsets, applications such as multifactor authentication, computer vision, facial recognition, speech recognition, natural language processing, digital assistants, and so on would be painfully slow, perhaps useless. The AI market requires hardware accelerators both for in-production AI applications and for the R&D community that’s still working out the underlying simulators, algorithms, and circuitry optimization tasks needed to drive advances in the cognitive computing substrate upon which all higher-level applications depend.
Different chip architectures for different AI challenges
The dominant AI chip architectures include graphics processing units, tensor processing units, central processing units, field programmable gate arrays, and application-specific integrated circuits.
However, there’s no “one size fits all” chip that can do justice to the wide range of use cases and phenomenal advances in the field of AI. Likewise, no single hardware substrate can suffice both for production use cases of AI and for the varied research requirements in the development of newer AI approaches and computing substrates. For example, see my recent article on how researchers are using quantum computing platforms both for practical ML applications and for the development of new quantum architectures to process a wide range of sophisticated AI workloads.
Trying to do justice to this wide range of emerging requirements, vendors of AI-accelerator chipsets face significant challenges when building out comprehensive product portfolios. To drive the AI revolution forward, their solution portfolios must be able to do the following:
- Execute AI models in multitier architectures that span edge devices, hub/gateway nodes, and cloud tiers.
- Process real-time local AI inferencing, adaptive local learning, and federated training workloads when deployed on edge devices.
- Combine various AI-accelerator chipset architectures into integrated systems that play together seamlessly from cloud to edge and within each node.
Neuromorphic chip architectures have started to come to the AI market
As the hardware-accelerator market grows, we’re seeing neuromorphic chip architectures trickle onto the scene.
Neuromorphic designs mimic the central nervous system’s information processing architecture. Neuromorphic hardware doesn’t replace GPUs, CPUs, ASICs, and other AI-accelerator chip architectures. Instead, neuromorphic architectures supplement other hardware platforms so that each can process the specialized AI workloads for which it was designed.
Within the universe of AI-optimized chip architectures, what sets neuromorphic approaches apart is their ability to use intricately connected hardware circuits to excel at sophisticated cognitive-computing and operations research tasks such as the following:
- Constraint satisfaction: the process of finding the values associated with a given set of variables that must satisfy a set of constraints or conditions.
- Shortest-path search: the process of finding a path between two nodes in a graph such that the sum of the weights of its constituent edges is minimized.
- Dynamic mathematical optimization: the process of maximizing or minimizing a function by systematically choosing input values from within an allowed set and computing the value of the function.
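To make one of these task types concrete, here is a minimal sketch of shortest-path search in ordinary Python, using Dijkstra's algorithm. It is a conventional software baseline, not a description of how neuromorphic hardware solves the problem; the function name, the dictionary-based graph format, and the example graph are all illustrative assumptions.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (total_weight, path), or (float("inf"), []) if target is unreachable.
    """
    best = {source: 0.0}          # lowest known cost to reach each node
    previous = {}                 # back-pointers used to rebuild the path
    frontier = [(0.0, source)]    # min-heap of (cost so far, node)
    while frontier:
        dist, node = heapq.heappop(frontier)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(frontier, (candidate, neighbor))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical example graph: node -> {neighbor: edge weight}
example_graph = {
    "A": {"B": 4, "C": 1},
    "C": {"B": 2, "D": 7},
    "B": {"D": 3},
    "D": {},
}
cost, path = shortest_path(example_graph, "A", "D")
print(cost, path)
```

The route through C and B is cheaper than either direct alternative, which is the kind of weighted trade-off the definition above describes.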
At the circuitry level, the hallmark of many neuromorphic architectures -- including IBM’s -- is asynchronous spiking neural networks. Unlike traditional artificial neural networks, spiking neural networks don’t require neurons to fire in each propagation cycle of the algorithm, but, rather, only when what’s known as a neuron’s “membrane potential” crosses a specific threshold. Crossing that threshold, a mechanism inspired by a well-established biological law governing electrical interactions among cells, causes the neuron to fire, thereby triggering transmission of a signal to connected neurons. This, in turn, causes a cascading sequence of changes to the connected neurons’ various membrane potentials.
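The threshold behavior can be illustrated with a minimal leaky integrate-and-fire sketch in plain Python. This is a simplified, clocked simulation under assumed parameters, not the circuitry of any vendor's chip (real neuromorphic hardware is asynchronous); the class name, the decay factor, and the weight are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LIFNeuron:
    """Toy leaky integrate-and-fire neuron (illustrative parameters only)."""
    threshold: float = 1.0  # membrane potential at which the neuron fires
    decay: float = 0.5      # fraction of potential retained each time step
    potential: float = 0.0  # current membrane potential

    def step(self, input_current: float) -> bool:
        # Leak part of the stored potential, then integrate the new input.
        self.potential = self.potential * self.decay + input_current
        if self.potential >= self.threshold:
            self.potential = 0.0  # reset after firing
            return True
        return False


def run_chain(inputs, weight=0.9):
    """Drive neuron A with `inputs`; each spike from A injects `weight` into B."""
    a, b = LIFNeuron(), LIFNeuron()
    spikes_a, spikes_b = [], []
    for t, current in enumerate(inputs):
        fired_a = a.step(current)
        fired_b = b.step(weight if fired_a else 0.0)
        if fired_a:
            spikes_a.append(t)
        if fired_b:
            spikes_b.append(t)
    return spikes_a, spikes_b


spikes_a, spikes_b = run_chain([0.7] * 8)
print(spikes_a, spikes_b)  # time steps at which A and B fired
```

With a constant input of 0.7, neuron A needs two steps of accumulation to cross the threshold, and neuron B needs two spikes from A before it fires in turn, a small-scale version of the cascade described above. With zero input, neither neuron fires at all, which is the event-driven property that distinguishes spiking networks.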
Intel’s neuromorphic chip is the foundation of its AI acceleration portfolio
Intel has also been a pioneering vendor in the still embryonic neuromorphic hardware segment.
Announced in September 2017, Loihi is Intel’s self-learning neuromorphic chip for training and inferencing workloads at the edge and also in the cloud. Intel designed Loihi to speed parallel computations that are self-optimizing, event-driven, and fine-grained. Each Loihi chip is highly power-efficient and scalable. Each contains over 2 billion transistors, 130,000 artificial neurons, and 130 million synapses, as well as three cores that specialize in orchestrating firings across neurons.
The core of Loihi’s smarts is a programmable microcode engine for on-chip training of models that incorporate asynchronous spiking neural networks. When embedded in edge devices, each deployed Loihi chip can adapt in real time to data-driven algorithmic insights that are automatically gleaned from environmental data, rather than rely on updates in the form of trained models being sent down from the cloud.
Loihi sits at the heart of Intel’s growing ecosystem
Loihi is far more than a chip architecture. It is the foundation for a growing toolchain and ecosystem of Intel-developed hardware and software for building an AI-optimized platform that can be deployed anywhere from cloud to edge, including in labs doing basic AI R&D.
Bear in mind that the Loihi toolchain primarily serves those developers who are finely optimizing edge devices to perform high-performance AI functions. The toolchain comprises a Python API, a compiler, and a set of runtime libraries for building and executing spiking neural networks on Loihi-based hardware. These tools enable edge-device developers to create and embed graphs of neurons and synapses with custom spiking neural network configurations. These configurations can optimize such spiking neural network metrics as decay time, synaptic weight, and spiking thresholds on the target devices. They can also support creation of custom learning rules to drive spiking neural network simulations during the development stage.
But Intel isn’t content simply to provide the underlying Loihi chip and development tools that are primarily geared to the needs of device developers seeking to embed high-performance AI. The vendor has continued to expand its broader Loihi-based hardware product portfolio to provide complete systems optimized for higher-level AI workloads.
In March 2018, the company established the Intel Neuromorphic Research Community (INRC) to develop neuromorphic algorithms, software and applications. A key milestone in this group’s work was Intel’s December 2018 announcement of Kapoho Bay, which is Intel’s smallest neuromorphic system. Kapoho Bay provides a USB interface so that Loihi can access peripherals. Using tens of milliwatts of power, it incorporates two Loihi chips with 262,000 neurons. It has been optimized to recognize gestures in real time, read braille using novel artificial skin, orient direction using learned visual landmarks, and learn new odor patterns.
Then in July 2019, Intel launched Pohoiki Beach, an 8 million-neuron neuromorphic system comprising 64 Loihi chips. Intel designed Pohoiki Beach to facilitate research by its own researchers, by those at partners such as IBM and HP, and by academic researchers at MIT, Purdue, Stanford, and elsewhere. The system supports research into techniques for scaling up AI algorithms such as sparse coding, simultaneous localization and mapping, and path planning. It is also an enabler for development of AI-optimized supercomputers an order of magnitude more powerful than those available today.
But the most significant milestone in Intel’s neuromorphic computing strategy came last month, when it announced general readiness of its new Pohoiki Springs, which was announced around the same time that Pohoiki Beach was launched. This new Loihi-based system builds on the Pohoiki Beach architecture to deliver greater scale, performance, and efficiency on neuromorphic workloads. It is about the size of five standard servers. It incorporates 768 Loihi chips and 100 million neurons spread across 24 Arria10 FPGA Nahuku expansion boards.
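As a quick sanity check on how these systems scale, the short sketch below multiplies each system's chip count by the approximate per-chip figure of 130,000 neurons cited earlier in this article. The variable and function names are illustrative; the only inputs are numbers quoted in the text.

```python
# Approximate per-chip neuron count quoted in this article for Loihi.
NEURONS_PER_CHIP = 130_000

# Chip counts quoted in this article for each Loihi-based system.
systems = {
    "Kapoho Bay": 2,
    "Pohoiki Beach": 64,
    "Pohoiki Springs": 768,
}


def estimated_neurons(chip_count, neurons_per_chip=NEURONS_PER_CHIP):
    """Estimate total neurons as chip count times neurons per chip."""
    return chip_count * neurons_per_chip


estimates = {name: estimated_neurons(chips) for name, chips in systems.items()}
for name, total in estimates.items():
    print(f"{name}: {systems[name]} chips -> ~{total:,} neurons")
```

The estimates land within a few percent of the quoted totals of 262,000, 8 million, and 100 million neurons, which suggests that capacity in these systems grows roughly linearly with the number of chips and that the quoted figures are rounded.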
The new system is, like its predecessor, designed to scale up neuromorphic R&D. To that end, Pohoiki Springs is focused on neuromorphic research and is not intended to be deployed directly into AI applications. It is now available to members of the Intel Neuromorphic Research Community via the cloud using Intel’s Nx SDK. Intel also provides a tool for researchers using the system to develop and characterize new neuro-inspired algorithms for real-time processing, problem-solving, adaptation, and learning.
The hardware manufacturer that has made the furthest strides in developing neuromorphic architectures is Intel. The vendor introduced its flagship neuromorphic chip, Loihi, almost 3 years ago and is already well into building out a substantial hardware solution portfolio around this core component. By contrast, other neuromorphic vendors -- most notably IBM, HP, and BrainChip -- have barely emerged from the lab with their respective offerings.
Indeed, a fair amount of neuromorphic R&D is still being conducted at research universities and institutes worldwide, rather than by tech vendors. And none of the vendors mentioned, including Intel, has really begun to commercialize its neuromorphic offerings to any great degree. That’s why I believe neuromorphic hardware architectures, such as Intel Loihi, will not truly compete with GPUs, TPUs, CPUs, FPGAs, and ASICs for the volume opportunities in the cloud-to-edge AI market.
If neuromorphic hardware platforms are to gain any significant share in the AI hardware accelerator market, it will probably be for specialized event-driven workloads in which asynchronous spiking neural networks have an advantage. Intel hasn’t indicated whether it plans to follow the new research-focused Pohoiki Springs with a production-grade Loihi-based unit for enterprise deployment.
But, if it does, this AI-acceleration hardware would be suitable for edge environments where event-based sensors require event-driven, real-time, fast inferencing with low power consumption and adaptive local on-chip learning. That’s where the research shows that spiking neural networks shine.