Another Dive Into Quantum

All this AI talk is cool and all, but how will it change in the future? What makes quantum computers so valuable in the world of artificial intelligence? You may or may not have asked these questions before, or maybe it’s just me. Regardless, I’m here to answer them!

So let’s dive back into a major moving piece in machine learning. My previous blog talked about a function that calculated cost depending on how incorrect a computer was when identifying an image. Finding the maximum reduction in cost (the function’s global minimum) is a truly fundamental part of making a computer learn, so it’s absolutely vital that the global minimum is located.

Unfortunately, there are many complications when it comes to finding a function’s absolute minimum. The illustration below visually explains them.

[Image: a neural network cost function with local minima]

The function, J(W), represents an arbitrary cost function. Calculus is a great tool for locating local minima through gradient descent; however, wherever the function flattens out, such as at the bottom of a local valley, the slope evaluates to zero and that point is falsely taken as the “optimal” cost value. Because of this clear weakness in the gradient descent strategy, another strategy had to be created.
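To make the problem concrete, here is a minimal sketch of gradient descent getting stuck. The cost function below is a made-up one-dimensional stand-in for J(W) (not a real neural network cost), chosen only because it has one shallow valley and one deeper valley. The learning rate and step count are also assumptions.

```python
def cost(w):
    # Hypothetical 1-D cost: shallow valley near w = 1.13, deeper valley near w = -1.30
    return w**4 - 3 * w**2 + w

def slope(w):
    # Derivative of the hypothetical cost above
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w, learning_rate=0.01, steps=2000):
    # Repeatedly step downhill; progress stalls wherever the slope reaches zero
    for _ in range(steps):
        w -= learning_rate * slope(w)
    return w

w_from_left = gradient_descent(-2.0)   # rolls into the deeper valley
w_from_right = gradient_descent(2.0)   # gets trapped in the shallow valley

print(w_from_left, cost(w_from_left))
print(w_from_right, cost(w_from_right))
```

Both runs stop where the slope is zero, but only the one that started on the left ends up at the lower cost. Which valley you land in depends entirely on where you started.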

The second method used to search for absolute minima in neural network cost functions is simulated annealing. This process samples random points across the function landscape, which reveals a rough outline of how it looks. A second, narrower round of sampling then focuses on the portion that appears to include the minimum value. These stages can be thought of as temperatures, where the initial broad search is ‘hot’ and the later focused search is ‘cold’. This approach only suits very specific settings and can take an impractically long time to find the absolute minimum.
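Here is a rough sketch of one common textbook form of simulated annealing, which may differ in detail from the description above. It assumes a made-up one-dimensional cost function with a shallow valley and a deeper valley, and the temperatures, step size, and seed are all arbitrary choices for illustration. While the temperature is hot, the search will sometimes accept a worse point, which lets it climb out of a shallow valley; as it cools, it settles down.

```python
import math
import random

def cost(w):
    # Hypothetical 1-D cost: shallow valley near w = 1.13, deeper valley near w = -1.30
    return w**4 - 3 * w**2 + w

def simulated_annealing(start, t_hot=5.0, t_cold=0.01, steps=5000, seed=0):
    rng = random.Random(seed)                   # fixed seed so runs are repeatable
    w = best_w = start
    cooling = (t_cold / t_hot) ** (1.0 / steps) # geometric cooling from hot to cold
    temperature = t_hot
    for _ in range(steps):
        candidate = w + rng.gauss(0.0, 0.5)     # random nearby point
        delta = cost(candidate) - cost(w)
        # Always accept improvements; accept worse points with a
        # probability that shrinks as the temperature drops
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            w = candidate
            if cost(w) < cost(best_w):
                best_w = w
        temperature *= cooling
    return best_w

best_w = simulated_annealing(start=2.0)
print(best_w, cost(best_w))
```

Starting from w = 2.0, plain downhill stepping would slide into the shallow valley, while the random hot phase gives this search a chance to wander over to the deeper one. The price is thousands of cost evaluations for a single variable.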

Then comes quantum computing. As mentioned before, quantum computers contain Q-Bits (Quantum Bits) with very peculiar attributes. A Q-Bit can be spin up, spin down, or in a superposition of both. When a Q-Bit is in an equal superposition, there is a 50% chance of measuring spin up and a 50% chance of measuring spin down (1 or 0), unless a magnetic field is used to shift these probabilities. For quantum annealing, the 1 and 0 states represent two separate valleys in an energy diagram.
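The 50% figure comes from the standard rule that the chance of each measurement result is the squared size of its amplitude in the superposition. A small sketch, where the function name and the example amplitudes are just illustrative choices:

```python
import math

def measurement_probabilities(amp_up, amp_down):
    # Probability of each outcome is the squared magnitude of its amplitude,
    # normalized so the two probabilities add up to 1
    norm = abs(amp_up) ** 2 + abs(amp_down) ** 2
    return abs(amp_up) ** 2 / norm, abs(amp_down) ** 2 / norm

# Equal superposition: both amplitudes are 1/sqrt(2)
equal = measurement_probabilities(1 / math.sqrt(2), 1 / math.sqrt(2))

# A hypothetical biased superposition that favors spin up
biased = measurement_probabilities(math.sqrt(0.8), math.sqrt(0.2))

print(equal)
print(biased)
```

The second call shows what shifting the probabilities looks like in numbers: the Q-Bit is still in a superposition, but one result is now much more likely than the other.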

There is a metric crap ton of information involved in this explanation, but to sum it up, a series of entangled Q-Bits can be mapped together with biases and couplings (a bias tilts a single Q-Bit toward 1 or 0, while a coupling describes the type of connection two Q-Bits have) to build an energy landscape similar to the neural network cost function. The process of quantum annealing then settles toward the function’s absolute minimum in a VERY small amount of time.
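To give a feel for what an energy landscape built from biases and couplings means, here is a tiny classical sketch. It is not quantum annealing: it simply checks every possible combination of three spins by brute force. The bias and coupling numbers are invented for illustration, and the energy formula is the usual Ising-style sum (each bias times its spin, plus each coupling times the product of its two spins).

```python
from itertools import product

# Hypothetical values: one bias per Q-Bit, one coupling per connected pair
biases = {0: 0.5, 1: -1.0, 2: 0.2}
couplings = {(0, 1): 1.0, (1, 2): -0.8}

def energy(spins):
    # Each spin is -1 (down) or +1 (up)
    total = sum(h * spins[i] for i, h in biases.items())
    total += sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())
    return total

# Brute force: try all 2**3 = 8 spin combinations and keep the lowest energy
lowest = min(product((-1, +1), repeat=len(biases)), key=energy)

print(lowest, energy(lowest))
```

With three spins there are only 8 combinations to check, but the count doubles with every spin added, so brute force becomes hopeless quickly. That growth is the reason a faster way of finding the lowest valley is so attractive.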

A larger number of Q-Bits makes quantum annealing exponentially quicker, which helps neural networks learn at an incredible rate. This is a great example of how quantum computers and AI work in unison. I learned a great deal from the video below, so please take a look if you wish to gain a better understanding.