How to create image recognition with Python?
Image recognition is one of the most widespread machine learning classes of problems. It aims at training machines to recognize images similarly as people do.
Image recognition belongs to the group of supervised learning problems, i.e., classification problems, to be more precise.
This article presents a relatively simple approach of training a neural network to recognize digits. This approach uses an ordinary feedforward neural network. The accuracy of the model can be further improved using other techniques.
Creating the Basic Model
When creating the basic model, you should do at least the following five things:
1. Import modules, classes, and functions. In this article, we’re going to use the Keras library to handle the neural network and scikit-learn to get and prepare data.
2. Load data. This article shows how to recognize the digits written by hand. The function load_digits() from sklearn.datasets provide 1797 observations. Each observation has 64 features representing the pixels of 1797 pictures 8 px high and 8 px wide. Each feature can be in the range 0–16 depending on the shade of grey it has. The outputs represent correct digits and can have integer values in the range 0–9.
3. Transform and split data. We first need to binarize the outputs, i.e., make each of them a vector with the values 0 and 1. Then, we have to split the entire dataset to training and test sets. Finally, we standardize the inputs.
4. Create the classification model and train (fit). The simplest models have one input layer that is not explicitly added, one hidden layer, and one output layer. We use a training set to train our neural network.
5. Test the classification model. Finally, we test the performance of the network using the test set.
This is how the code looks like:
As you can see, the accuracy of the model is about 97.8 %. The results might vary!
You can play with the hyper-parameters and change the number of units in the hidden layer, the optimizer, number of epochs of training, the size of batches and so on, trying to further improve the accuracy of the network.
Making the Network Deeper
Deep neural networks have more than one hidden layer. Adding hidden layers might improve the accuracy. The code is almost the same in the previous case, just with one additional statement to add another hidden layer:
The accuracy is slightly increased to 98.3 %.
Convolutional Neural Networks and Other Improvements
Image recognition problems are often solved with even higher accuracy than we’ve obtained here. One way to improve the networks for image recognition is by adding convolutional and pooling layer, making a convolutional neural network.
Additionally, some sort of regularization can be used, like a dropout.
For more information on how to do this with Keras, you can take a look at the official Keras documentation.
This article is an introduction in implementing image recognition with Python and its machine learning libraries Keras and scikit-learn. Image recognition is supervised learning, i.e., classification task.
This is just the beginning, and there are many techniques to improve the accuracy of the presented classification model.