Machine learning and neural networks are crucial to our lives and the technology we use every day. But artificial intelligence and deep learning go beyond traditional neural networks, and if you’re interested in a career involving machine learning, it’s vital to understand all the facets that can be involved.
That’s where convolutional neural networks come in. Sometimes called ConvNets or CNNs, convolutional neural networks are a class of deep neural networks used in deep learning and machine learning. Convolutional neural networks are usually used for visual imagery, helping the computer identify and learn from images. They give the computer vision to help it see an input image, classify it, see patterns, and overall learn from it.
These networks are inspired by biological processes. As humans, we begin using our eyes to identify objects from the time we are born. Computers don’t have this ability: when they see an image, they see numbers. CNNs help a computer have “human” eyes, giving it computer vision and allowing it to take in all of the pixels and numbers it sees, which aids in image recognition and image classification. They use filters and activation functions to create a feature map, helping the computer understand what it is seeing. The feature map is passed from layer to layer, helping the computer gain more information each time until it is able to see the whole picture.
For example, as humans we can easily distinguish the difference between a cat and a dog. But a computer doesn't see things; it learns about an image from its characteristics. Trying to tell a computer to look for four paws or fur can be confusing, because both cats and dogs have these features. So neural networks are used to detect as many characteristics as possible in order to help the computer classify an image. The network looks for edges and lines, all represented by pixel numbers. The computer is then able to combine all of the things it finds, look for patterns, and classify the image as either a cat or a dog. While this may seem like a lot of work, and it is, computers can do this incredibly quickly, making them more efficient than humans for image identification. These neural networks are also getting more sophisticated, and their error rates have fallen as the techniques have improved. The error rate is extremely important: for a network to be effective, its error rate must be low enough to justify not having humans do the classification manually. Because these neural networks learn from examples, they can also be trained on their mistakes, which helps lower the error rate over time.
Convolutional neural networks get their name from a mathematical operation called convolution. This is a specialized kind of linear operation, and CNNs use this mathematical operation instead of matrix multiplication in at least one of the layers. This is what separates ConvNets from other neural networks in deep learning.
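The operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name conv2d, the 4x4 image, and the 2x2 filter are all made up for the example. Like most deep learning libraries, it slides the filter without flipping it, which mathematicians would strictly call cross-correlation.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the result."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the filter by the filter weights and sum.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = np.array([
    [-1, 1],
    [-1, 1],
], dtype=float)

feature_map = conv2d(image, kernel)
print(feature_map)  # each row is [0, 18, 0]: the edge sits in the middle column
```

The large value in the middle column marks where the dark-to-bright edge is, which is the kind of pattern a trained filter learns to pick out.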
A convolutional neural network layer has to have these elements:
Convolutional kernels (filters) that are defined by width and height. These kernels slide over the image, one unit or pixel at a time, multiplying the pixel values they cover by the kernel’s weights and adding the results to produce one entry of the output matrix.
Input channels and output channels, which give the input and output volumes their depth. Feature maps enter a layer through its input channels and leave through its output channels. This lets the information that the neural network has extracted be passed on to the next layer, where more matrix operations and analysis can be done.
The depth of the convolution filter needs to be equal to the number of channels of the input feature map. The depth of the output, in turn, is set by the number of filters in the layer: however many filters are applied to an image, that is how deep the resulting feature map will be. This is because each filter that passes over the image or feature map adds one new channel to the output.
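The relationship between channels, filter depth, and output depth can be checked with a small shape experiment. This is a rough sketch under assumed sizes: the function name conv_layer, the 8x8 RGB input, and the five 3x3 filters are hypothetical, and the values are random rather than learned.

```python
import numpy as np

def conv_layer(x, filters):
    """x: (H, W, C_in). filters: (K, kh, kw, C_in). Returns (H-kh+1, W-kw+1, K)."""
    h, w, c_in = x.shape
    k, kh, kw, f_depth = filters.shape
    # The depth of every filter must match the number of input channels.
    assert f_depth == c_in, "filter depth must equal the number of input channels"
    out = np.zeros((h - kh + 1, w - kw + 1, k))
    for n in range(k):                      # one output channel per filter
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j, n] = np.sum(x[i:i + kh, j:j + kw, :] * filters[n])
    return out

rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))            # hypothetical 8x8 input with 3 channels (RGB)
filters = rng.random((5, 3, 3, 3))   # five 3x3 filters, each 3 channels deep
out = conv_layer(x, filters)
print(out.shape)  # (6, 6, 5): five filters give five output channels
```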
While it may seem a bit complex, convolutional neural networks are fascinating and are extremely valuable in different technology fields. There is quite a bit of complicated math involved in these neural networks, so better understanding of them can be achieved by learning more about linear algebra and how it functions. If you find this information interesting, a degree in IT and eventual career in machine learning may be an ideal fit for you.
In this guide you’ll learn the basics of CNNs, but if you are interested in learning more, a degree program may be the best way to continue your learning.
The basics of convolutional neural networks.
A computer doesn’t have eyes, so convolutional neural networks are the best way to give a computer the ability to “look” at an image. A CNN uses a matrix of pixel values to represent the input image, and it then performs image recognition and image classification based on the numbers it sees. CNNs “see” an input image as a matrix of numbers because that is the only language they understand.
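To make the idea of an image as a matrix of numbers concrete, here is a small sketch. The 5x5 plus-sign image is invented for the example, and scaling pixel values to the range 0 to 1 is a common preprocessing habit, not something the text above requires.

```python
import numpy as np

# Hypothetical 5x5 grayscale image of a plus sign: 0 is black, 255 is white.
image = np.array([
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
    [255, 255, 255, 255, 255],
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
], dtype=np.uint8)

print(image.shape)   # (5, 5): height x width

# Networks usually work with floating-point values, often scaled to 0..1.
scaled = image.astype(np.float32) / 255.0

# A color image adds a third axis with one channel each for red, green, and blue.
# Here the grayscale image is simply copied into all three channels.
rgb = np.stack([image, image, image], axis=-1)
print(rgb.shape)     # (5, 5, 3): height x width x channels
```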
MLPs and RNNs can also be used to help identify and classify images, but ConvNets use a shortcut: small filters that point out the elements that make an image different, or the areas that don't follow the pattern. They utilize feature maps to make the steps of the entire process simpler. CNNs are ideal for deep learning that involves lots of data or images, because convolutions cut down the amount of computation. The same small filter is reused at every position in the image, so a CNN has far fewer values to learn than a network that connects every pixel to every neuron.
Due to the way convolutional neural networks map data, they are often used in image and video recognition. CNNs are used for predictive analysis and can condense information without losing important details in large datasets. This is extremely valuable for professionals using these networks, as large and complex images and videos don’t have to lose detail when going through a CNN. MLPs and RNNs can also be used for image mapping, but aren’t as well equipped to deal with large datasets and small details. The math that CNNs use makes them an ideal choice for large datasets, because it replaces the large, time-consuming matrix multiplications of a fully connected network with small, reusable convolutions. This helps the entire process go faster without losing elements of these larger datasets. This is why convolutional layers and networks are ideal for deep learning tasks.
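One way to see why convolutions scale well is to count the values each kind of layer has to learn. The sketch below is a back-of-the-envelope comparison under assumed sizes (a 32x32 color image, 64 units or 64 filters, 3x3 kernels). The helper names dense_params and conv_params are hypothetical, and the two layers produce differently shaped outputs, so this compares parameter counts only.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for a fully connected layer."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases for a convolutional layer (shared across positions)."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

# Assumed input: a 32x32 image with 3 color channels.
height, width, channels = 32, 32, 3

dense = dense_params(height * width * channels, 64)  # every pixel to every unit
conv = conv_params(3, 3, channels, 64)               # 64 small 3x3 filters

print(dense)  # 196672
print(conv)   # 1792
```

The convolutional layer's count does not depend on the image size at all, because the same filters are reused at every position.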
Applications of CNNs.
There are many uses for CNNs in the technology we have all around us. Some common examples include:
Facial recognition. Computers are able to recognize and identify people using CNNs. They identify faces in the picture, learn how to focus on the face despite lighting or poses, identify unique features, and compare the data they collect with a name. CNNs help your phone or Facebook be able to identify who they are looking at every time.
Document analysis. ConvNets are used to identify handwriting, compare it to a database of handwriting, understand what the written words are, and more. They can read handwritten documents, which is important for banking and finance, or classify documents for museums.
Genetics. CNNs can look at images of cells and use mapping and predictive analytics to help medical experts learn about new treatments and possibilities.
Satellite images. CONV layers and networks can be used to help classify satellite images, separating them into categories and identifying what they are so humans don't have to spend the time doing it. Computers can do this much faster than humans can.
Convolutional neural networks are used in healthcare, museums, technology, social networking, e-commerce, and many other areas to help classify and sort images.
Convolutional neural network architecture.
Convolutional neural networks are fairly complex in their build and process, but there is a basic system to how they are structured and how they run. There are many layers involved in CNNs, including hidden layers. All of these are important in helping the computer create a feature map of the image, and understand how to classify what it is looking at.
Convolutional layer. The convolutional layer is the first layer after the input layer, and is the layer focused on math. This is where the computer will work to understand the number pattern that it sees. This layer uses a filter, also known as a kernel, that reads part of the image and assigns it a number by multiplying the pixel values by the filter's weights and adding them up. The filter will do this for every unit or pixel of the image, storing the data as it goes. Convolutional layers are hidden because their inputs and outputs connect only to other layers, not giving any information to us in an actual output. Each of these layers uses a filter and an activation function to create a feature map. Basically, each layer draws what information it can, applies an activation function to the result, and passes the resulting feature map on to the next layer.
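The filter-then-activation step can be sketched as follows. The example assumes ReLU as the activation function (a common choice, though the text does not name one), and the 3x3 image, the filter, and the function names are made up for illustration.

```python
import numpy as np

def convolve(image, kernel):
    """Slide the filter over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: keep positive responses, set negative ones to zero."""
    return np.maximum(x, 0)

# Hypothetical 3x3 image and a 2x2 filter that compares horizontal neighbors.
image = np.array([
    [1, 5, 2],
    [4, 0, 3],
    [2, 6, 1],
], dtype=float)
kernel = np.array([
    [1, -1],
    [0,  0],
], dtype=float)

pre_activation = convolve(image, kernel)  # [[-4, 3], [4, -3]]
feature_map = relu(pre_activation)        # [[0, 3], [4, 0]]
print(feature_map)
```

The feature map, not the raw filter response, is what the next layer receives.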
Pooling layer. This layer is used to streamline the underlying computation of the process. Pooling layers reduce the dimensions of the data by combining the outputs of a cluster of neurons in one layer into a single neuron in the next layer. Local pooling will combine small clusters of neurons, usually 2x2. Global pooling will work on all the neurons in a convolutional layer. This is how the process is able to be simplified so the computer can work with it. This layer still produces feature maps as its outputs and passes the information on to the next layer.
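Local and global pooling can be sketched on a small feature map. The example assumes max pooling (taking the largest value in each cluster), which is one common choice. The function name max_pool_2x2 and the 4x4 values are hypothetical.

```python
import numpy as np

def max_pool_2x2(fmap):
    """Local pooling: keep the largest value in each non-overlapping 2x2 block."""
    h, w = fmap.shape
    assert h % 2 == 0 and w % 2 == 0, "this sketch assumes even height and width"
    # Split rows and columns into pairs, then take the max over each pair of axes.
    blocks = fmap.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Hypothetical 4x4 feature map.
fmap = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
], dtype=float)

pooled = max_pool_2x2(fmap)
print(pooled)       # [[4, 2], [2, 8]]: 16 values reduced to 4

# Global pooling collapses the whole feature map to a single value.
global_max = fmap.max()
print(global_max)   # 8.0
```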
Fully connected layer. The fully connected layer is the final layer of convolutional networks. In the fully connected layer, every neuron in one layer is connected to every neuron in the next layer. It is the same principle as multilayer perceptron neural networks (MLPs). This is where the classification of the image will actually happen.
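The final classification step can be sketched as flattening the pooled feature maps, multiplying by a weight matrix, and turning the scores into probabilities. This is an illustration only: the weights here are random rather than trained, softmax is assumed as the output function, and the sizes and class names are hypothetical, so the prediction itself is meaningless.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - scores.max()   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

rng = np.random.default_rng(0)
classes = ["cat", "dog"]

# Hypothetical output of the last pooling layer: three 2x2 feature maps.
pooled = rng.random((2, 2, 3))

# Flatten to one vector so every value can connect to every output neuron.
x = pooled.reshape(-1)                               # 12 values

# Untrained, random weights: one row of weights per class.
weights = rng.normal(size=(len(classes), x.size))
bias = np.zeros(len(classes))

probs = softmax(weights @ x + bias)
prediction = classes[int(np.argmax(probs))]
print(probs, prediction)
```

In a trained network, the weights would be learned from labeled images so that the highest probability lands on the correct class.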
Getting started with convolutional neural networks.
Convolutional neural networks are a complex and very important element of AI and technological advances. There are some valuable ways you can get started in learning about neural networks and their many complexities. Some of the best ways to get started are:
Learn about the math. There are extremely complicated mathematical principles of linear algebra involved in neural networks. If you find this topic interesting, it can be hugely beneficial to start learning about linear algebra, convolutions, and other math principles.
Observe. Spend time observing convolutional neural networks at work around you. Go on Facebook and see how it can help you tag friends in a photo through facial recognition. Try a historical record software that will help you read handwritten documents. Use Google image search to find similar images. All of these processes are using CNNs to help find and classify images for you. It's fascinating once you understand that these neural networks are working hard behind the scenes all around you.
Get a degree. A bachelor’s degree in computer science or something similar can help you start to understand how computers work and speak, preparing you to better understand advanced AI. A degree program will help you learn about scripting, programming languages, upcoming technology, and all of the elements of AI and machine learning that can help you get started on the path to an exciting career.
Research. There are many articles and explanations out there to help you learn about convolutional neural networks. If you find this topic interesting, a degree and additional research can help you learn more and be prepared for an exciting future, and even a career, involving neural networks. Podcasts, articles, YouTube videos, and even entire conferences and workshops exist to help you learn more about deep learning and AI.
If you're interested in convolutional neural networks and deep learning, get started expanding your knowledge by getting a degree. WGU is an ideal place to pursue a computer science degree because you can work entirely on your schedule. You don't have to log into classes, and you don't have assignment due dates. You are really in charge of your learning. This means you can go quickly through material that you already know or pick up faster, and slow down in areas where you need additional help.