Artificial neurons

An artificial neuron (also called a "node") is the basic unit of an artificial neural network. Artificial neurons are simplified models of biological neurons, and are typically functions from many dimensions to one dimension. Each neuron receives one or more inputs and sums them to produce an output. Usually each input is weighted before summation, and the sum is passed through a non-linear function known as an activation or transfer function. The canonical transfer function is the sigmoid, but other non-linear functions, piecewise linear functions, and step functions are also used. Generally, transfer functions are monotonically increasing.

Basic structure
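For a given artificial neuron k, let there be m + 1 inputs with signals x0 through xm and weights wk0 through wkm. Conventionally, the input x0 is fixed at +1 so that its weight wk0 acts as a bias, leaving m actual inputs to the neuron.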

The output of neuron k is:


 * $$y_k = \varphi( \sum_{j=0}^m w_{kj} x_j)$$

where $$\varphi$$ (phi) is the transfer function.



The output propagates to the next layer (through a weighted synapse) or finally exits the system as part or all of the output.
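The weighted-sum formula above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `neuron_output` and `logistic` are hypothetical, the logistic function is just one possible choice of transfer function, and treating x0 = 1 as a bias input is an assumption of the example.

```python
import math

def neuron_output(weights, inputs, transfer):
    """Return transfer(sum_j w_j * x_j) for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Weighted sum of the inputs (the argument of the transfer function).
    u = sum(w * x for w, x in zip(weights, inputs))
    return transfer(u)

def logistic(u):
    """One common sigmoid-shaped transfer function (assumed here)."""
    return 1.0 / (1.0 + math.exp(-u))

# Illustrative values only; x[0] = 1.0 plays the role of the bias input.
x = [1.0, 0.5, -1.0]
w = [0.1, 0.4, 0.2]
y = neuron_output(w, x, logistic)
```

Passing the transfer function as an argument keeps the summation separate from the choice of non-linearity, so the same sketch works with a step function or any other transfer function.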

History

 * The original artificial neuron is the Threshold Logic Unit first proposed by Warren McCulloch and Walter Pitts in 1943. As a transfer function, it employs a threshold or step function taking on the values '1' or '0' only.


 * Perceptron

Types of transfer functions
The transfer function of a neuron is chosen to have a number of properties which either enhance or simplify the network containing the neuron. Crucially, for instance, any multi-layer perceptron using a linear transfer function has an equivalent single-layer network; a non-linear function is therefore necessary to gain the advantages of a multi-layer network.
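The collapse of a linear multi-layer network into a single layer can be checked numerically. The sketch below assumes an identity transfer function and no bias terms; the helper names `matvec` and `matmul` and the weight values are made up for the illustration. It compares two stacked linear layers against one layer whose weight matrix is the product of the two.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A (n x k) by matrix B (k x m)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

# Illustrative weights: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
W2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]
x = [2.0, -3.0]

# Two linear layers applied one after the other.
two_layer = matvec(W2, matvec(W1, x))

# A single layer whose weights are the product W2 * W1.
single_layer = matvec(matmul(W2, W1), x)
```

Both computations give the same output vector for any input, which is why a non-linear transfer function is needed before extra layers add expressive power.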

Step function
The output y of this transfer function is binary, depending on whether the input u (the weighted sum of the neuron's inputs) meets a specified threshold, θ. The "signal" is sent, i.e. the output is set to one, if the input meets or exceeds the threshold.


 * $$y = \left\{ \begin{matrix} 1 & \mbox{if }u \ge \theta \\ 0 & \mbox{if }u < \theta \end{matrix} \right.$$
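As a rough sketch, the step function translates directly into code. The function names `step` and `threshold_and` are hypothetical, and the unit weights and threshold of 2 used for the logical AND example are an illustrative choice, not something taken from the text.

```python
def step(u, theta=0.0):
    """Return 1 if the input u meets the threshold theta, else 0."""
    return 1 if u >= theta else 0

def threshold_and(a, b):
    """A threshold unit with weights (1, 1) and theta = 2 (assumed values)."""
    u = 1 * a + 1 * b  # weighted sum of the two binary inputs
    return step(u, theta=2)

# Evaluate the unit on all four binary input pairs.
truth_table = [threshold_and(a, b) for a in (0, 1) for b in (0, 1)]
```

With these weights the unit fires only when both inputs are 1, so a single threshold unit is enough to compute logical AND.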

See: Step function

Sigmoid
A fairly simple non-linear function, the sigmoid also has an easily calculated derivative, which is used when calculating the weight updates in the network. It thus makes the network easier to manipulate mathematically, and was attractive to early computer scientists who needed to minimise the computational load of their simulations.
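To illustrate why the derivative is cheap, the sketch below assumes the sigmoid in question is the logistic function, 1 / (1 + e^(-u)), whose derivative can be written in terms of its own output as s(u) * (1 - s(u)). The function names are hypothetical.

```python
import math

def sigmoid(u):
    """Logistic sigmoid, written to avoid overflow for large |u|."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)  # safe: u is negative, so exp(u) is at most 1
    return e / (1.0 + e)

def sigmoid_derivative(u):
    """Derivative of the logistic sigmoid, reusing the function value."""
    s = sigmoid(u)
    return s * (1.0 - s)

# Compare the closed form with a central finite difference at u = 0.3.
h = 1e-6
numeric = (sigmoid(0.3 + h) - sigmoid(0.3 - h)) / (2 * h)
analytic = sigmoid_derivative(0.3)
```

Because the derivative reuses the value already computed for the neuron's output, it costs only one extra multiplication and subtraction per neuron.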

See: Sigmoid function