Knowing a little about everything is often better than having one expert skill. This is particularly true for people entering the debate around emerging markets, most notably tech.
Most folks think they know a little about AI. But the field is so new and growing so fast that the current experts are breaking new ground daily. There is so much science still to uncover that technologists and policymakers from other areas can quickly begin contributing to the field of AI.
That’s where this article comes in. My aim was to create a short reference that brings technically minded people up to speed quickly with AI terms, language and techniques. Hopefully, this text can be understood by most non-practitioners whilst serving as a reference for everybody.
Introduction
Artificial intelligence (AI), deep learning, and neural networks are terms used to describe powerful machine learning-based techniques which can solve many real-world problems.
While deductive reasoning, inference, and decision-making comparable to the human brain are still a little way off, there have been many recent advances in AI techniques and associated algorithms, particularly as the large data sets from which AI can learn have become increasingly available.
The field of AI draws on many fields including mathematics, statistics, probability theory, physics, signal processing, machine learning, computer science, psychology, linguistics, and neuroscience. Issues surrounding the social responsibility and ethics of AI draw parallels with many branches of philosophy.
The motivation for advancing AI techniques further is that the solutions required to solve problems with many variables are incredibly complicated, difficult to understand and not easy to put together manually.
Increasingly, corporations, researchers and individuals rely on machine learning to solve problems without writing comprehensive programming instructions. This black-box approach to problem-solving is becoming essential: human programmers find it increasingly complex and time-consuming to write the algorithms needed to model and solve data-heavy problems. Even when we do construct a useful routine to process big data sets, it tends to be extremely complex, difficult to maintain and impossible to test adequately.
Modern machine learning and AI algorithms, along with properly considered and prepared training data, are able to do the programming for us.
Overview
Intelligence: the ability to perceive information, and retain it as knowledge to be applied towards adaptive behaviors within an environment or context.
This Wikipedia definition of intelligence can apply to both organic brains and machines. Intelligence does not imply consciousness, a common misconception perpetuated by science fiction writers.
Search for AI examples on the internet and you’ll see references to IBM’s Watson, a machine learning system made famous by winning the TV quiz show Jeopardy! in 2011. It has since been repurposed and used as a template for a diverse range of commercial applications. Apple, Amazon and Google are working hard to get a similar system into our homes and pockets.
Natural language processing and speech recognition were among the first commercial applications of machine learning, followed closely by other automated recognition tasks (pattern, text, audio, image, video, facial, …). The range of applications is exploding and includes autonomous vehicles, medical diagnosis, gaming, search engines, spam filtering, crime fighting, marketing, robotics, remote sensing, computer vision, transportation, music recognition, classification…
AI has become so embedded in the technology we use that many no longer see it as ‘AI’, just an extension of computing. Ask somebody on the street if they have AI on their phone and they will probably say no, yet AI algorithms are embedded everywhere, from predictive text to the autofocus system in the camera. The general view is that AI has yet to arrive, but it is here now and has been for some time.
AI is a fairly generalised term. The focus of most research is the slightly narrower field of artificial neural networks and deep learning.
How your brain works
The human brain is an exquisite carbon computer, estimated to perform a billion billion calculations per second (1,000 petaflops) while consuming around 20 watts of power. The Chinese supercomputer Tianhe-2 (at the time of writing, the fastest in the world) manages only 33,860 trillion calculations per second (33.86 petaflops) and consumes 17,600,000 watts (17.6 megawatts). We have some way to go before our silicon creations catch up with evolution’s carbon ones.
The precise mechanism that the brain uses to perform its thinking is up for debate and further study (I like the theory that the brain harnesses quantum effects, but that’s another article). However, the inner workings are often modelled around the concept of neurons and their networks. The brain is thought to contain around 100 billion neurons.
Neurons interact and communicate along pathways allowing messages to be passed around. The signals from individual neurons are weighted and combined before activating other neurons. This process of messages being passed around, combining and activating other neurons is repeated across layers. Across the 100 billion neurons in the human brain, the summation of this weighted combination of signals is complex. And that is a considerable understatement.
But it’s not that simple. Each neuron applies a function, or transformation, to its weighted inputs before testing whether an activation threshold has been reached. This transformation can be linear or non-linear.
The initial input signals originate from a variety of sources… our senses, internal monitoring of bodily functions (blood oxygen level, stomach contents…). A single neuron may receive hundreds of thousands of input signals before deciding how to react.
Thinking or processing and the resultant instructions given to our muscles are the summations of input signals and feedback loops across many layers and cycles of the neural network. But the brain’s neural networks also change and update, including modifications to the amount of weighting applied between neurons. This is caused by learning and experience.
This model of the human brain has been used as a template to help replicate the brain’s capabilities inside a computer simulation… an artificial neural network.
Artificial Neural Networks (ANNs)
Artificial Neural Networks are mathematical models inspired by and modelled on biological neural networks. ANNs are able to model and process non-linear relationships between inputs and outputs. Adaptive weights between the artificial neurons are tuned by a learning algorithm that reads observed data with the goal of improving the output.
Optimisation techniques are used to bring the ANN’s solution as close as possible to the optimal one. If the optimisation is successful, the ANN is able to solve the particular problem with high performance.
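To make the idea concrete, here is a minimal sketch of one common optimisation technique, gradient descent, tuning a single weight to fit some toy data (the data, learning rate and step count are made up for illustration):

```python
import numpy as np

# Toy data: outputs are roughly 3x the inputs (plus a little noise).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

w = 0.0              # initial weight guess
learning_rate = 0.01

for step in range(200):
    predictions = w * x
    error = predictions - y
    # Gradient of the mean squared error with respect to w.
    gradient = 2 * np.mean(error * x)
    w -= learning_rate * gradient  # step downhill along the gradient

print(f"learned weight: {w:.2f}")  # approaches ~3.0
```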
An ANN is modelled using layers of neurons. The structure of these layers is known as the model’s architecture. Neurons are individual computational units able to receive inputs and apply a mathematical function to determine if messages are passed along.
In a simple three-layer model, the first layer is the input layer, followed by one hidden layer and an output layer. Each layer can contain one or more neurons.
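Here is a minimal sketch of what a forward pass through such a three-layer network looks like, with made-up weights and a sigmoid activation (activation functions are introduced properly below):

```python
import numpy as np

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative random weights for a 3-input, 4-hidden-neuron, 1-output network.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # input layer -> hidden layer
W_output = rng.normal(size=(4, 1))   # hidden layer -> output layer

x = np.array([0.5, -1.2, 3.0])       # one example input

hidden = sigmoid(x @ W_hidden)       # weighted sums, then activation
output = sigmoid(hidden @ W_output)

print(output)  # the network's prediction for this input
```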
As models become increasingly complex, with more layers and more neurons, their problem-solving capabilities increase. If the model is too large for the given problem, however, it can end up memorising the training data rather than learning the underlying patterns, and will then perform poorly on new data. This is known as overfitting.
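A minimal sketch of the effect, using polynomial curve fitting as a simple stand-in for an oversized model (the data and polynomial degrees are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples drawn from a simple underlying line, y = 2x.
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(scale=0.1, size=8)
x_test = np.linspace(0.05, 0.95, 8)
y_test = 2 * x_test + rng.normal(scale=0.1, size=8)

for degree in (1, 7):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train error {train_err:.4f}, test error {test_err:.4f}")

# The degree-7 model fits the training points almost perfectly,
# but it has memorised the noise: its test error is typically higher.
```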
The fundamental model architecture and tuning are the major elements of ANN techniques, along with the learning algorithms used to read in the data. All of these components affect the performance of the model.
Models tend to be characterised by an activation function, which is used to convert a neuron’s weighted input to its output activation. There is a range of transformations that can be used as the activation function.
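For illustration, here is a minimal sketch of three commonly used activation functions, applied to the same sample inputs:

```python
import numpy as np

def sigmoid(z):
    """Smoothly squashes any input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Like sigmoid, but squashes into (-1, 1), centred on zero."""
    return np.tanh(z)

def relu(z):
    """Passes positive inputs through unchanged; clips negatives to zero."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, fn in (("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)):
    print(name, fn(z))
```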
ANNs can be extremely powerful. However, even though the mathematics of a few neurons is simple, the entire network scales up to become complex. Because of this, ANNs are considered ‘black box’ algorithms. Choosing an ANN as a tool to solve a problem should be done with care, as it is not generally possible to unpick the system’s decision-making process later.
Deep Learning
Deep learning is a term used to describe neural networks and related algorithms that consume raw data. The data is processed through the layers of the model to calculate a target output.
Automatic feature learning is where deep learning techniques excel: a properly configured ANN is able to identify, by itself, the features in the input data that are important for achieving the desired output. Traditionally, the burden of making sense of the input data falls to the programmer building the system; in a deep learning setup, however, the model itself works out how to interpret the data to achieve meaningful results. Once an optimised system has been trained, the computational, memory and power requirements of running the model are much reduced.
Put simply, feature learning algorithms allow a machine to learn the features best suited to a specific task from the data itself… the algorithms learn how to learn.
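As a minimal sketch of this idea, the toy network below learns the classic XOR function using backpropagation. Nobody tells it which features matter; the hidden layer discovers a useful internal representation by itself (the network size, learning rate and step count are illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR: output 1 only when the two inputs differ. A single neuron
# cannot represent this; a hidden layer lets the network learn it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output

learning_rate = 1.0
for step in range(5000):
    # Forward pass: weighted sums, then activations, layer by layer.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: propagate the output error back through the layers.
    output_delta = (output - y) * output * (1 - output)
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)

    # Nudge every weight downhill along its error gradient.
    W2 -= learning_rate * (hidden.T @ output_delta)
    b2 -= learning_rate * output_delta.sum(axis=0)
    W1 -= learning_rate * (X.T @ hidden_delta)
    b1 -= learning_rate * hidden_delta.sum(axis=0)

print(output.round(2))  # approaches [[0], [1], [1], [0]]
```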
Deep learning has been applied to a wide variety of tasks and is considered one of the most innovative AI techniques. There are well-designed algorithms suitable for supervised, unsupervised and semi-supervised learning problems.
Shadow learning is a term sometimes used to describe a simpler form of deep learning, in which feature selection requires upfront processing of the data and more in-depth domain knowledge from the programmer. The resulting models can be more transparent and perform better, at the expense of increased time at the design stage.
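To contrast with the learned features above, here is a minimal sketch of the hand-engineered alternative, in which the programmer decides up front which summary statistics of a raw signal a model will see (the signal and features are made up for illustration):

```python
import numpy as np

def hand_engineered_features(signal):
    """Features chosen by the programmer, not learned from data."""
    return np.array([
        signal.mean(),                   # average level
        signal.std(),                    # variability
        np.abs(np.diff(signal)).mean(),  # average change between samples
    ])

# A made-up raw signal standing in for sensor data.
rng = np.random.default_rng(7)
raw_signal = np.sin(np.linspace(0, 10, 100)) + rng.normal(scale=0.1, size=100)

features = hand_engineered_features(raw_signal)
print(features)  # this small, interpretable vector is what the model sees
```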
Summary
AI is a powerful field of data processing and can yield complex results more quickly than traditional algorithm development by programmers. ANNs and deep learning techniques can solve a diverse set of difficult problems. The downside is that the optimised models they produce are black boxes, effectively impossible for their human creators to unpick. This can lead to ethical problems in areas where data transparency is important.