The Matrix

If your brain hurts from the previous post, make sure to get some caffeine for this one. We’re about to dive into the mechanics of how neurons are interconnected and how a set of criteria can chisel an image down to a recognizable symbol.

I hope that the Neural Structures post provided a general idea of what a neural network looks like. To prepare for this one, think back to when you first learned about mathematical functions.

y = f(x) = ax + b (some arbitrary equation)

This is essentially how each pair of neurons is connected. Now imagine a whole layer of neurons connected to a second layer via these arbitrary functions, and you can begin to understand why neural networks start looking the way they do.
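To make that concrete, here is a tiny Python sketch (my own toy example with made-up numbers) where every connection between an input neuron and a second-layer neuron behaves like that y = ax + b function:

```python
def connection(x, a, b):
    # One "arbitrary function" linking two neurons: y = a*x + b
    return a * x + b

# Three input neurons feeding a single second-layer neuron,
# each through its own little function (a, b values are illustrative).
inputs = [0.2, 0.7, 0.5]
params = [(0.9, 0.1), (-0.4, 0.0), (1.2, -0.2)]

second_layer_neuron = sum(connection(x, a, b) for x, (a, b) in zip(inputs, params))
print(second_layer_neuron)  # the second-layer neuron's raw value
```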



Seems simple enough, right? Well, to properly break an image down, these functions need to do something very specific. For instance, when identifying a symbol, edges are extremely important: neurons in the second and third layers need to fire if a certain edge or curve is detected. The functions are what shape the criteria for a neuron to fire.

To build one of these functions, weights are required to establish a rough blueprint of how an edge may look. The neuron values from an input image are extracted and put into a summing function like so:

f(x) = a1 + a2 + a3 + a4 + … + an
(where n is the number of pixels and each a is the value of one pixel)
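As a quick sketch (toy numbers, not real image data), here is what that plain summing function looks like for a tiny four-pixel image:

```python
# Brightness of each pixel in a tiny 2x2 image, flattened to a list: a1..an (n = 4)
pixels = [0.9, 0.8, 0.1, 0.0]

# The plain summing function: every pixel counts equally, no weights yet
f = sum(pixels)
print(f)  # 1.8
```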

Introducing the previously mentioned weights makes the function look like this:

f(x) = w1a1 + w2a2 + w3a3 + w4a4 + … + wnan

The sum is largest when the bright pixels line up with the large weights, which means the associated edge has been located. Now, to simplify the output and keep the sum under control, a sigmoid function is used to squash the result into a range between 0 and 1. Thus the function looks like this:

f(x) = σ(w1a1 + w2a2 + w3a3 + … + wnan)
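Here is a small Python sketch of that sigmoid-squashed weighted sum (the weights and pixel values are illustrative, not taken from a real trained network):

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1 / (1 + math.exp(-z))

def neuron_activation(pixels, weights):
    # Weighted sum of the pixel values, then squashed by the sigmoid
    weighted_sum = sum(w * a for w, a in zip(weights, pixels))
    return sigmoid(weighted_sum)

# Toy "edge detector": positive weights where the edge should be bright,
# negative weights where it should be dark.
pixels  = [0.9, 0.8, 0.1, 0.0]
weights = [1.0, 1.0, -1.0, -1.0]
print(neuron_activation(pixels, weights))  # ~0.83, i.e. the edge is probably there
```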

A bias can be added inside the function as well, raising or lowering how large the weighted sum must be before the neuron meaningfully fires: f(x) = σ(w1a1 + w2a2 + … + wnan + b). But for now, the function with the greatest sum will yield the brightest grayscale value for the neurons in the next layer.
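Extending the same sketch, the bias simply slides the whole weighted sum up or down before the sigmoid, changing how easily the neuron fires (again, the numbers are made up for illustration):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def neuron_activation(pixels, weights, bias):
    # Weighted sum plus a bias, squashed into (0, 1)
    return sigmoid(sum(w * a for w, a in zip(weights, pixels)) + bias)

pixels  = [0.9, 0.8, 0.1, 0.0]
weights = [1.0, 1.0, -1.0, -1.0]
print(neuron_activation(pixels, weights, bias=0.0))   # ~0.83
print(neuron_activation(pixels, weights, bias=-3.0))  # ~0.20, the bar to fire is now higher
```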

Let’s say that an image has 784 pixels (neurons) and the second layer contains 16 neurons. Each second-layer neuron must have its own function with its own weights, taking all 784 first-layer neurons as parameters. If a third layer of 16 neurons were also in use, that alone would give 784 × 16 + 16 × 16 = 12,800 different weights that could be manipulated to better identify images (add a final layer of, say, 10 output neurons and the count reaches 12,960).
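A quick back-of-the-envelope check of those counts (the layer sizes come from the paragraph above; the 10-neuron output layer is a hypothetical addition for something like digit recognition):

```python
# Layer sizes: 784 input pixels, two layers of 16, and (hypothetically)
# 10 output neurons if we were classifying digits 0-9.
layers = [784, 16, 16, 10]

# Each neuron gets one weight per neuron in the previous layer.
weights_per_step = [prev * nxt for prev, nxt in zip(layers, layers[1:])]
print(weights_per_step)       # [12544, 256, 160]
print(sum(weights_per_step))  # 12960 weights in total
```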

The act of “machine learning” is getting the machine to dial in these weights and biases until image recognition is as accurate as possible. Pretty cool stuff! Now, before I conclude this post, I wanted to explain the notation used to represent neural networks mathematically. Below is a matrix of weights and input-layer neurons. Looks pretty wild!
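In the standard form, each next-layer neuron’s weighted sum becomes one row of a weight matrix multiplied by the vector of input-layer neurons, with a bias vector added and the sigmoid applied to every entry. A sketch of that notation (the index ranges are illustrative):

```latex
a^{(1)} = \sigma\!\left(
\begin{bmatrix}
w_{0,0} & w_{0,1} & \cdots & w_{0,n} \\
w_{1,0} & w_{1,1} & \cdots & w_{1,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
w_{k,0} & w_{k,1} & \cdots & w_{k,n}
\end{bmatrix}
\begin{bmatrix}
a_0^{(0)} \\ a_1^{(0)} \\ \vdots \\ a_n^{(0)}
\end{bmatrix}
+
\begin{bmatrix}
b_0 \\ b_1 \\ \vdots \\ b_k
\end{bmatrix}
\right)
```

Each row of the weight matrix holds the weights for one second-layer neuron, so the whole layer’s activations come out of a single matrix-vector multiplication.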

I hope that you now understand the structure of neural networks and have a general sense of how neuron layers interact. The next post will go over how machines actually learn, which is super exciting stuff!