A brief history of AI (2/n): The neuron

A. The Biological neuron

An oversimplified biological neuron

A neuron is the basic unit of the human brain. It receives electrical signals from other neurons and transmits signals to them. It has the following parts:

  • Cell body: This is where incoming signals are received and processed.
  • Axon: This carries the electrical signal from the cell body to the axon terminals.
  • Axon terminals: This is where the signal is passed on to other neurons.

The biological neuron has many more details than this. However, it is the receiving, processing, and transmitting of signals that has always interested the AI community most. In 1943, Warren Sturgis McCulloch and Walter Pitts came up with the earliest model of an artificial neuron.

McCulloch-Pitts neuron model

The McCulloch-Pitts neuron model (also known as the linear threshold gate model) is the first mathematical formulation of an artificial neuron, and it has served as a foundation for later models of the artificial neuron. In this model, the neuron has three parts:

  • Inputs: I1, I2, …, In are the n inputs to the neuron. Each value is either 0 or 1.
  • Aggregation function (g): A function that combines the inputs to the neuron into a single value.
  • Output function (f): A function that takes the value returned by the aggregation function and outputs either 0 or 1.

A noteworthy point here is that the inputs and the output of the neuron are either 0 or 1. There are no intermediate states. The aggregation function is the weighted sum of the inputs to the neuron, where W1, W2, …, Wn are the weights attached to the inputs:

g(I1, I2, …, In) = W1*I1 + W2*I2 + … + Wn*In

The output function checks whether this aggregated value X is greater than or equal to a particular threshold T. If so, it returns 1; otherwise it returns 0.

f(X) = 1 if X ≥ T,

f(X) = 0 otherwise.

The threshold T can also be called the bias, and the entire model can be written as the following algorithm:

  1. Calculate the weighted sum of the inputs and subtract the bias from this value.
  2. If this value is greater than or equal to 0, output 1. Else output 0.
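The two steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `mp_neuron` and the example weights and threshold are my own choices, not something from the original 1943 paper.

```python
def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    # Step 1: weighted sum of the inputs, minus the bias (threshold).
    value = sum(w * i for w, i in zip(weights, inputs)) - threshold
    # Step 2: output 1 if the value is greater than or equal to 0, else 0.
    return 1 if value >= 0 else 0


# Hypothetical example: two inputs, both weights 1, threshold 2.
print(mp_neuron([1, 1], [1, 1], 2))  # 1
print(mp_neuron([1, 0], [1, 1], 2))  # 0
```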

The M-P neuron is a powerful model and can be used to calculate logical functions like AND, OR, NAND, and NOR, among many others. However, it is a rather simple model, as it does not allow inputs and outputs other than 0 and 1. A single M-P neuron is also incapable of calculating functions that are not linearly separable (like XOR). In 1958, Frank Rosenblatt introduced a new model of the neuron, the perceptron, which overcomes some of the shortcomings of the M-P neuron.
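To make the logic-gate claim concrete, here is a small sketch with hand-picked weights and thresholds for each gate. The specific values in `GATES` are my own assumptions (many other choices work just as well). The last part does a brute-force search over a small grid of integer weights and thresholds for XOR. That search is a demonstration rather than a proof, but it shows that no setting in the grid reproduces the XOR truth table.

```python
from itertools import product


def mp_neuron(inputs, weights, threshold):
    """Linear threshold gate: 1 if the weighted sum reaches the threshold."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= threshold else 0


def truth_table(weights, threshold):
    """Outputs for the inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [mp_neuron(pair, weights, threshold)
            for pair in product([0, 1], repeat=2)]


# Hand-picked (weights, threshold) pairs; these values are illustrative.
GATES = {
    "AND": ([1, 1], 2),
    "OR": ([1, 1], 1),
    "NAND": ([-1, -1], -1),
    "NOR": ([-1, -1], 0),
}

for name, (weights, threshold) in GATES.items():
    print(name, truth_table(weights, threshold))

# Try every integer weight/threshold combination in [-3, 3] for XOR.
XOR = [0, 1, 1, 0]
grid = range(-3, 4)
solutions = [(w1, w2, t) for w1, w2, t in product(grid, repeat=3)
             if truth_table([w1, w2], t) == XOR]
print("XOR solutions found:", len(solutions))  # 0
```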

Rosenblatt’s perceptron looks an awful lot like the M-P neuron model. This is because the perceptron is more or less identical to the M-P neuron, with some flexibility added:

  • Inputs can take any real values and need not be restricted to 0 and 1.
  • The weights of the inputs can be learned automatically.
  • The threshold value can be learned automatically.

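The weights and threshold are learned automatically using the perceptron learning rule, an error-correction procedure that adjusts the weights whenever the neuron misclassifies a training example. It is related to, but not the same as, the gradient descent used to train modern neural networks. However, a few problems remain with Rosenblatt’s perceptron.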

  • The output of the function is still 0 or 1. This means that the perceptron is just a single-layer binary classifier and cannot be used for more complex problems.
  • The activation function proposed in the original paper is the same threshold comparator used in the M-P neuron. This means that the perceptron remains a linear classifier and cannot classify satisfactorily if the two classes are not separable by a hyperplane.
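Both the learning and the linear-classifier limitation can be seen in a short experiment. The sketch below uses the classic perceptron error-correction update rule (named plainly here, since it is the rule I am assuming for illustration). The function names, the learning rate of 1, and the 50 epochs are my own choices. It learns AND without trouble, but can never get all four XOR cases right.

```python
def predict(weights, bias, x):
    """Threshold unit: weighted sum minus the bias, compared against 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) - bias
    return 1 if total >= 0 else 0


def train_perceptron(samples, epochs=50, lr=1.0):
    """Perceptron error-correction rule on (inputs, target) pairs."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or 1
            # Move the weights toward the target only when there is an error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            # The bias is subtracted in predict(), so it moves the other way.
            bias -= lr * error
    return weights, bias


def accuracy(samples, weights, bias):
    hits = sum(predict(weights, bias, x) == target for x, target in samples)
    return hits / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)  # 1.0
print("XOR accuracy:", xor_acc)  # always below 1.0
```

AND is linearly separable, so the rule settles on a working set of weights. For XOR no separating line exists, so the weights keep changing and at least one case is always misclassified.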

It is noteworthy that the limitations of the perceptron model can be addressed by using other activation functions such as ReLU, sigmoid, and hyperbolic tangent, which introduce non-linearities into the output function of the neuron and make the model capable of outputting values other than 0 and 1. When such neurons are combined, they are powerful enough to learn non-linear classifiers. In fact, this enhanced form of the perceptron is what the artificial neural networks we know today are built from.
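Here is a rough sketch of what that looks like in code. The three activation functions are standard. The tiny network `xor_network` and its weights are hand-picked by me purely for illustration (nothing is learned here). It combines two ReLU neurons with a linear output neuron to compute XOR, which a single threshold neuron cannot do.

```python
import math


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Passes positive values through and clips negative values to 0."""
    return float(max(0, x))


def tanh(x):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(x)


def xor_network(x1, x2):
    """Two hidden ReLU neurons plus a linear output (hand-picked weights)."""
    h1 = relu(x1 + x2)      # hidden neuron 1: weights (1, 1), no offset
    h2 = relu(x1 + x2 - 1)  # hidden neuron 2: weights (1, 1), offset -1
    return h1 - 2 * h2      # output neuron: weights (1, -2)


print(sigmoid(0), relu(-2), tanh(0))  # 0.5 0.0 0.0
print([xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0.0, 1.0, 1.0, 0.0]
```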




“Stories about the creation of machines having human qualities have long been a fascinating province in the realm of science fiction. Yet we are about to witness the birth of such a machine — a machine capable of perceiving, recognizing, and identifying its surroundings without any human training or control.”

-Frank Rosenblatt




I hope this article helped you in learning something new. Please leave any feedback or suggestions in the comments below. I’ll be back with more articles for you guys and till then, keep rocking!
