The ‘hello world!’ of Deep Learning and Keras in under 10 minutes
Solving MNIST is a benchmark task for machine learning models. It is also one of the first problems you’ll encounter as you enter the realm of deep learning.
In this article, we will use the Python library Keras to design a neural network that learns to classify handwritten digits from the MNIST dataset.
The crazy thing about solving MNIST is that it isn’t resource-intensive at all. You do not need a GPU. The model we build here can be trained quickly on any modern CPU in under a minute.
The MNIST dataset is a set of 60,000 training images plus 10,000 test images, assembled by the National Institute of Standards and Technology (NIST) in the 1980s. These images are encoded as NumPy arrays, and the labels are an array of digits, ranging from 0 to 9. The images and labels have a one-to-one correspondence.
Assuming that you have already installed Anaconda, we shall briefly discuss installing Keras and other dependencies.
While Anaconda bundles a lot of good stuff by default, it does not ship with Keras pre-installed. The good folks over at Anaconda Cloud, however, have a customized Keras package available here.
You can install it from the conda-forge channel using the command line:
$ conda install --channel conda-forge keras
The general deep-learning workflow is:
- Load the data, i.e., the mapping of training samples to their corresponding targets.
- Create the network architecture.
- Select a loss function, an optimizer, and metrics to monitor during training and testing.
- Pre-process the data as required.
- Run the network on training samples, i.e., forward pass to get predictions.
- Compute the loss of the network on the batch.
- Update all weights of the network, reducing loss on the batch.
The MNIST image classification problem is a multiclass classification problem. Let us implement the pedagogy discussed above.
The workflow adopted is simple. We’ll feed the training data to the neural network. The network will then learn to associate images and labels. Finally, the network will produce predictions for testing data and the degree of accuracy achieved.
Step 1: Loading the Data
The MNIST dataset comes preloaded in Keras as a set of four NumPy arrays. We’ll load the dataset using the load_data() function.
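Assuming a TensorFlow-backed Keras installation, loading the four arrays might look like this (the variable names are my own):

```python
# Load the four MNIST arrays bundled with Keras.
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(train_labels.shape)  # (60000,)
```

Each image is a 28 × 28 grid of pixel intensities, and each label is the digit the image depicts.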
Step 2: Network Architecture
We’ll build the network using densely connected (also called fully connected) sequential neural layers*. Layers are the core building block of a neural network. They are basic tensor operations that implement a progressive ‘data distillation.’
A layer takes the output of the layer before it as input.
- **Therefore, the shape of the input to a layer must match the shape of the output of the previous layer.
- *Sequential layers dynamically adjust the expected input shape of a layer based on the output of the layer before it, reducing effort on the part of the programmer.
To create our network, we stack two Dense layers in a Keras Sequential model.
Our network consists of two fully connected layers. The second (and last) layer is a 10-way softmax layer, which returns an array of 10 probability scores. Each score is the probability that the current digit image belongs to one of our 10 digit classes.
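A minimal sketch of that architecture follows; the 512-unit width of the hidden layer is my own assumption (any reasonable width works), and the first layer expects the flattened 28 × 28 = 784-pixel image:

```python
from tensorflow.keras import models, layers

network = models.Sequential([
    layers.Input(shape=(28 * 28,)),          # flattened 28x28 image
    layers.Dense(512, activation="relu"),    # hidden layer (512 is an assumed width)
    layers.Dense(10, activation="softmax"),  # one probability score per digit class
])
```

The softmax output sums to 1, so each of the 10 values can be read directly as a class probability.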
Step 3: Compilation
Now let’s select an optimizer, a loss function, and metrics for evaluating model performance.
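One standard pairing for multiclass classification with one-hot labels is the rmsprop optimizer with categorical cross-entropy loss (a sketch; the network definition repeats the assumed architecture from above):

```python
from tensorflow.keras import models, layers

network = models.Sequential([
    layers.Input(shape=(28 * 28,)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# rmsprop + categorical cross-entropy is a common choice for
# multiclass classification; accuracy is tracked during training.
network.compile(optimizer="rmsprop",
                loss="categorical_crossentropy",
                metrics=["accuracy"])
```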
Step 4: Preprocessing the image data
We’ll pre-process the image data by reshaping it into the shape that the network expects and by scaling it so that all values are in the [0,1] interval.
Previously, our training images were stored in an array of shape (60000, 28, 28) of type uint8, with values in the [0, 255] range. We will transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.
We will then categorically encode the labels, using the to_categorical function from keras.utils.
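Under these assumptions, the preprocessing might look like:

```python
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Reshape (60000, 28, 28) uint8 -> (60000, 784) float32, scaled to [0, 1].
train_images = train_images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255

# One-hot encode the digit labels, e.g. 5 -> [0,0,0,0,0,1,0,0,0,0].
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
```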
Now, we are ready to train our network.
Step 5: Let’s Train
Keras trains the network via a call to the network’s fit() method: we fit the model to its training data.
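Putting the pieces together, the training call might look like this. A batch size of 128 is an assumption on my part, chosen because 60,000 samples / 128 gives the 469 steps per epoch visible in the log:

```python
from tensorflow.keras import models, layers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load and preprocess the training data.
(train_images, train_labels), _ = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28)).astype("float32") / 255
train_labels = to_categorical(train_labels)

# Build and compile the network (assumed architecture, as above).
network = models.Sequential([
    layers.Input(shape=(28 * 28,)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
network.compile(optimizer="rmsprop",
                loss="categorical_crossentropy",
                metrics=["accuracy"])

# 60000 samples / batch size 128 -> 469 steps per epoch.
history = network.fit(train_images, train_labels, epochs=5, batch_size=128)
```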
As the network trains, you will see its accuracy increase and loss decrease. The times you see here are training times corresponding to an i7 CPU.
Epoch 1/5
469/469 [==============================] - 5s 10ms/step -
loss: 0.2596 - accuracy: 0.9255
Epoch 2/5
469/469 [==============================] - 5s 10ms/step -
loss: 0.1047 - accuracy: 0.9693
Epoch 3/5
469/469 [==============================] - 5s 11ms/step -
loss: 0.0684 - accuracy: 0.9791
Epoch 4/5
469/469 [==============================] - 5s 10ms/step -
loss: 0.0496 - accuracy: 0.9851
Epoch 5/5
469/469 [==============================] - 5s 11ms/step -
loss: 0.0379 - accuracy: 0.9887
We have attained a training accuracy of 98.87%. This measure tells us that our network correctly classified 98.87% of the images in the training data.
Step 6: Evaluating the network’s performance
The MNIST dataset reserves 10,000 images as test_images. We will evaluate the performance of our network by testing it on this previously unseen data.
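A sketch of the full pipeline through evaluation follows (exact numbers vary run to run; the architecture and batch size are the same assumptions as before):

```python
from tensorflow.keras import models, layers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load and preprocess both splits.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Build, compile, and train.
network = models.Sequential([
    layers.Input(shape=(28 * 28,)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
network.compile(optimizer="rmsprop",
                loss="categorical_crossentropy",
                metrics=["accuracy"])
network.fit(train_images, train_labels, epochs=5, batch_size=128, verbose=0)

# Score the trained network on the held-out test set.
test_loss, test_acc = network.evaluate(test_images, test_labels)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy : {test_acc * 100} %")
```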
313/313 [==============================] - 1s 3ms/step - loss: 0.0774 - accuracy: 0.9775
Test Loss: 0.07742435485124588
Test Accuracy : 97.75000214576721 %
We attain a testing accuracy of 97.75%. It is understandably lower than the training accuracy; the gap is a sign of overfitting.
To visualize our Neural network, we will need a couple of extra dependencies. You can install them using
$ $HOME/anaconda/bin/pip install graphviz
$ pip3 install ann-visualizer
We’ll use ann-visualizer to visualize our neural network as a graph. This step is optional; you may choose not to visualize your network. However, looking at the graph is undoubtedly more fun. Note that graphviz is a dependency for ann-visualizer.
Here is your neural network:
For more information on visualizing neural networks, check out the GitHub repository of ann-visualizer, which you just used.
Congratulations! You have successfully crossed the threshold of Deep Learning. If this was your first foray into Neural Networks, I hope you enjoyed it.
I recommend that you work along with the article; you can solve the MNIST image classification problem in under ten minutes. Make sure to internalize the pedagogy we discussed earlier. It is the basis of solving and implementing neural networks across the board.
I assume that the reader has a working understanding of technicalities like optimizer, categorical encoding, loss function, and metrics. You can find my practice notes on these concepts here.
For more, please check out the book Deep Learning with Python by Francois Chollet.
Feel free to check out this article’s implementation and more of my work on GitHub.
Thanks for reading!