Redirected from: neural net

Definition: neural network


A major AI architecture employed in many pattern recognition applications. One of the neural network's most popular uses is the creation of language models for ChatGPT, Gemini and other chatbots. Loosely modeled on the human nervous system, a computer-based neural network is technically an "artificial" neural network (ANN).

The neural network is used in image, language and speech recognition, text-to-speech conversion, robotics, diagnostics, forecasting and generative AI. Unlike regular applications, which are programmed with explicit logic (if-then-else) to produce precise results, neural networks are "trained" on examples; large language models are trained on millions or billions of words and images. See AI secret sauce, AI programming, AI training, AI model and generative AI.
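To make the contrast with if-then-else programming concrete, here is a minimal sketch in Python (chosen only for illustration). A single artificial neuron learns the logical OR function from four labeled examples using the classic perceptron update rule, rather than having the rule coded by hand. The names `predict` and `train_or_gate`, the learning rate and the number of passes are assumptions made for this example.

```python
# Illustrative sketch: the OR rule is learned from examples, not hand-coded.
# Function names, learning rate and epoch count are assumptions.

def predict(weights, bias, inputs):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_or_gate(epochs=20, learning_rate=0.1):
    """Learn logical OR from four labeled examples (perceptron rule)."""
    examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            # Nudge each weight and the bias in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train_or_gate()
results = [predict(weights, bias, pair)
           for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)  # [0, 1, 1, 1]
```

Nowhere does the code state "if either input is 1, output 1"; the behavior emerges from the adjusted weights and bias. Real networks use far more neurons and a different training procedure, but the principle of learning from examples is the same.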

Layers and Nodes
A neural network is a "math machine" that learns from examples. It comprises multiple layers of computational units called "nodes" or "neurons" that are mathematically connected to each other (see image below). There is an input layer, an output layer and from a handful to thousands of "hidden" layers in between.
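The layered structure can be sketched in a few lines of Python. This is an illustrative toy, not how production frameworks are written: the layer sizes, the random starting weights, the sigmoid activation and the helper names `make_layer` and `forward` are all assumptions for the example.

```python
import math
import random

def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_inputs, n_nodes, rng):
    """One layer is a list of nodes; each node has a weight per input plus a bias."""
    return [
        {"weights": [rng.uniform(-1, 1) for _ in range(n_inputs)],
         "bias": rng.uniform(-1, 1)}
        for _ in range(n_nodes)
    ]

def forward(layers, inputs):
    """Pass the input values through each layer in turn."""
    values = inputs
    for layer in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(node["weights"], values)) + node["bias"])
            for node in layer
        ]
    return values

rng = random.Random(0)          # fixed seed so the run is repeatable
sizes = [3, 4, 4, 2]            # input layer, two hidden layers, output layer
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
outputs = forward(layers, [0.5, -0.2, 0.1])
print(len(outputs), outputs)
```

The input layer holds only the incoming values, so three sets of weights connect the four layers. Because the weights are random and untrained, the two output values are meaningless until training adjusts them.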

Servers and GPUs
Datacenters can have thousands of servers. The more servers, the larger the neural network that can be trained and the more comprehensive the training. For smaller AI applications, a single desktop machine can run a neural network; for example, see DGX Spark.

AI servers contain four to 16 GPUs, each with its own processor and memory; for example, NVIDIA's DGX H100 server contains eight H100 GPUs (see H100). In a large datacenter, there can be tens of thousands of servers, and it can still take weeks or months to train huge language models. To support the neural networks of ever-larger models as well as reduce training time from months to days, it is estimated that a million or more GPUs are required. See GPU and AI training vs. inference.







A Single Node/Neuron
A neuron is a computational unit that is nothing like traditional "if-then-else" business logic. Each layer contains a number of nodes/neurons, and the value each one computes is passed to all the neurons in the next layer. The activation function determines the value that leaves the node and gets passed on. The weights and bias are adjusted during the training passes. See AI weights and biases.
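A single node's arithmetic can be sketched as follows. Each input is multiplied by its weight, the products are summed, the bias is added and the activation function shapes the result. The example inputs, weights and the two activation functions shown (sigmoid and ReLU) are assumptions chosen for illustration; real networks pick activations to suit the task.

```python
import math

def sigmoid(x):
    """Smoothly map any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Pass positive values through unchanged; turn negatives into 0."""
    return max(0.0, x)

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus bias, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

inputs = [1.0, 2.0]
weights = [0.5, -0.25]          # 1.0*0.5 + 2.0*(-0.25) = 0.0

print(neuron(inputs, weights, bias=0.0))                   # 0.5
print(neuron(inputs, weights, bias=1.0, activation=relu))  # 1.0
```

Training changes only the numbers in `weights` and `bias`; the arithmetic inside the node stays the same.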




Tracing a Neuron
The neural network is a pattern detection system. The words in sentences are turned into tokens and then into vectors, which are one-dimensional arrays of numbers, and the vectors are the data that move through the network. The neurons process each token, as well as the prior context, to predict the next one. Training language models means repeatedly predicting what comes next and evaluating that prediction for accuracy (see AI training passes).
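The path from words to tokens to vectors to a next-token prediction can be sketched in Python. This toy makes several loud simplifications that are assumptions of the example: whole words serve as tokens (real tokenizers usually split text into subword pieces), the vectors are random rather than learned, and the context is just an average of the token vectors rather than the output of many network layers.

```python
import math
import random

text = "the cat sat on the mat"
words = text.split()
vocab = sorted(set(words))                    # ['cat', 'mat', 'on', 'sat', 'the']
token_ids = [vocab.index(w) for w in words]   # words -> integer tokens

rng = random.Random(0)
dim = 4                                       # vector length (assumed, tiny)
# One vector per token going in, and one scoring vector per token coming out.
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in vocab]
output_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in vocab]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probabilities(context_ids):
    """Average the context vectors, then score every token in the vocabulary."""
    vectors = [embeddings[i] for i in context_ids]
    context = [sum(column) / len(vectors) for column in zip(*vectors)]
    scores = [sum(c * w for c, w in zip(context, row)) for row in output_weights]
    return softmax(scores)

probs = next_token_probabilities(token_ids[:-1])   # context: "the cat sat on the"
predicted = vocab[probs.index(max(probs))]
print(token_ids)   # [4, 0, 3, 2, 4, 1]
print(predicted, probs)
```

With untrained random vectors the predicted word is arbitrary. Training would compare the prediction with the actual next word ("mat") and adjust the numbers so that the correct token becomes more probable.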

The World's Information = One Giant Matrix
In effect, AI companies have taken a vast share of the information published online and turned it into the world's largest mathematical matrix. See AI secret sauce.




A Small Network
This neural network has only 17 neurons/nodes in three hidden layers and might be used in a child's toy or edge device. (Image generated by ChatGPT.)






A Larger Network
This neural network has 8,192 neurons/nodes in each of its eight hidden layers, for a total of about 65 thousand hidden nodes. If every node connects to every node in the next layer, the hidden layers alone account for roughly 470 million connections. Now imagine a neural network with 500 billion connections and you can begin to visualize the enormity and complexity of modern AI models. (Image generated by ChatGPT.)
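Node and connection counts depend on how the layers are wired and which layers are counted. The following sketch shows the arithmetic under two assumptions that the caption does not state: every node connects to every node in the next layer only, and just the eight hidden layers are counted. The helper name `count_nodes_and_connections` is hypothetical.

```python
def count_nodes_and_connections(layer_sizes):
    """Count nodes and adjacent-layer connections in a fully connected network."""
    nodes = sum(layer_sizes)
    # Each pair of neighboring layers contributes (size of one) x (size of the next).
    connections = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return nodes, connections

hidden_layers = [8192] * 8      # eight hidden layers of 8,192 nodes each
nodes, connections = count_nodes_and_connections(hidden_layers)
print(nodes)        # 65536
print(connections)  # 469762048
```

Adding input and output layers, or using a different wiring pattern, would change both totals.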




There Are Many Network Designs
The following diagrams from the Asimov Institute in the Netherlands show the variety of neural network architectures. For a neural network that recognizes the alphabet and is easier to understand than a chatbot model, see convolutional neural network.

















Neural Network Architectures
Neural networks are one of the most researched areas of computing in the 21st century. These examples show the many designs. (Images courtesy of Fjodor van Veen and Stefan Leijnen (2019), The Neural Network Zoo.)