Python For Machine Learning?
Formally, machine learning is a field of computer science that involves training a mathematical model on a dataset of numerical values. The plain-English version: using a model to quickly and accurately learn trends in data so you can generalize to similar data in the future. In this machine learning tutorial, we’ll cover what machine learning encompasses and walk through several machine-learning algorithms (or machine-learning models) to classify images from the MNIST dataset.
Often, people use the terms ML and AI interchangeably. The difference between deep learning and machine learning comes down to the architecture of your model. Deep learning models tend to have multiple hidden layers (making your model deep, or otherwise thicc as the kids say), which ends up increasing the training time. For simpler patterns, classical machine learning is often the more elegant solution, leading to faster training times and sometimes even better accuracy than full-scale neural network systems.
As for my experience, I have three years of academic research experience in multimedia data with deep learning. It wasn’t until the first dedicated ML course I took that I began to really appreciate and implement these algorithms. That, of course, inevitably led me to my first encounter with the MNIST dataset.
ML Basics
How Do I Get Started In Machine Learning?
Right here! There are several great walkthroughs online (including this one😉) that can take you through different techniques for improving the accuracy of your model. Let’s go over some brief concepts before we get too far into the code:
In General
ML usually starts off with a dataset: a compilation of records or entries of a particular phenomenon. In the case of MNIST, the dataset contains numerous handwritten digits ranging from 0-9. Since they are handwritten, they have some distortion or noise that prevents a traditional program from simply hardcoding a general shape for each digit we would like to recognize. With that in mind, we need a machine learning solution (a model, or automaton) to find a relationship within the dataset.
In our machine learning projects, we have to see how effectively our model trained. The loss measures the difference between our model’s outputs and the expected outputs. On each training iteration, the algorithm adjusts the model to reduce this loss, then tries to score even better on the next pass. If it does, it keeps going. When the loss starts to flatline, the version of the model with the lowest loss is kept and used for testing.
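To make the loss idea concrete, here is a minimal sketch of a training loop: a single-parameter linear model fit with gradient descent on hypothetical noisy data. The data, learning rate, and epoch count are all illustrative choices, not values from any particular library; the point is just that the loss shrinks each iteration and eventually flattens out.

```python
import numpy as np

# Toy data with a roughly linear trend: y ≈ 2x + noise (hypothetical values)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + rng.normal(0, 0.5, size=50)

w = 0.0          # the single model parameter we are training
lr = 0.01        # learning rate
losses = []
for epoch in range(100):
    pred = w * x
    loss = np.mean((pred - y) ** 2)       # mean squared error vs. expected outputs
    losses.append(loss)
    grad = np.mean(2 * (pred - y) * x)    # gradient of the loss w.r.t. w
    w -= lr * grad                        # step toward a lower loss

print(f"first loss: {losses[0]:.2f}, final loss: {losses[-1]:.2f}, w ≈ {w:.2f}")
```

Plotting `losses` over the epochs would show the flatline described above: rapid improvement early on, then diminishing returns.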
Your Environment
You should have Python installed and a good text editor to start programming. If not, these links will take you to some of our previous posts to get you covered. In short, we recommend Visual Studio Code because it is lightweight and can be customized to work very well with Python. Of course, choose the text editor that fits your needs best.
We can also use a module called matplotlib to generate a graph from our Python program with ease. This will be helpful to visualize our training and testing errors. Let’s also go over a few machine learning techniques we’ll be using:
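As a quick sketch of the kind of plot we have in mind, the snippet below draws hypothetical training and testing error curves with matplotlib. The error values are made up for illustration; the `Agg` backend and the `errors.png` filename are just choices that let the script run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script works without a display
import matplotlib.pyplot as plt

# Hypothetical error values recorded during training, purely illustrative
epochs = list(range(1, 11))
train_err = [0.90, 0.60, 0.45, 0.35, 0.30, 0.27, 0.25, 0.24, 0.23, 0.22]
test_err = [0.92, 0.65, 0.50, 0.42, 0.40, 0.39, 0.38, 0.38, 0.39, 0.40]

plt.plot(epochs, train_err, label="training error")
plt.plot(epochs, test_err, label="testing error")
plt.xlabel("epoch")
plt.ylabel("error")
plt.legend()
plt.savefig("errors.png")  # or plt.show() in an interactive session
```

A gap that widens between the two curves, as in this made-up data, is the classic visual cue for overfitting.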
kNN
Simply put, k Nearest Neighbors classifies a data point by looking at the k labeled points closest to it (its nearest neighbors) and taking a majority vote among their classes. The data must be labeled, as kNN classification is a supervised learning algorithm.
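Here is a minimal sketch of that idea using sklearn's `KNeighborsClassifier` on two hypothetical clusters of 2-D points (the coordinates are made up for illustration):

```python
from sklearn.neighbors import KNeighborsClassifier

# Tiny labeled dataset: two clusters in 2-D (hypothetical points)
X = [[1, 1], [1, 2], [2, 1],    # class 0 cluster
     [8, 8], [8, 9], [9, 8]]    # class 1 cluster
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3)  # vote among the 3 closest neighbors
knn.fit(X, y)

# A new point near the first cluster inherits its neighbors' label
print(knn.predict([[2, 2]]))  # → [0]
```

The only real knob here is `n_neighbors` (the k); larger values smooth out noise but can blur the boundary between classes.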
Regression
Linear regression is a simple technique that tries to find the best-fit line through a sample of data points. Regression works well when data exhibits a linear trend; for example, all of the data points are increasing or decreasing at a roughly constant rate. There are a few other spinoffs of regression that modify the function used to fit the data points.
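A quick sketch with sklearn's `LinearRegression`, fit on hypothetical points that follow y = 3x + 1 exactly, shows the recovered line:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical points increasing at a constant rate: y = 3x + 1
x = np.arange(10).reshape(-1, 1)  # sklearn expects a 2-D feature array
y = 3 * x.ravel() + 1

model = LinearRegression().fit(x, y)  # ordinary least-squares fit
print(model.coef_[0], model.intercept_)  # → 3.0 and 1.0 (up to float precision)
```

With noisy real data the recovered slope and intercept would only approximate the underlying trend, but the call pattern is identical.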
Support Vector Machine (SVM)
SVMs employ a similar technique; however, they use a vector to find a distinct cut between two groups in a dataset. This leaves us with a binary classification, so if we want to classify more than two categories with an SVM, we combine several binary classifiers. You can think of it like this: each category can be split into valid members and invalid members (everything else), giving one binary classifier per category in the one-vs-rest strategy, or one classifier per pair of categories in the one-vs-one strategy; the category with the strongest votes wins.
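In practice, sklearn's `SVC` handles this combination of binary classifiers internally, so a multi-class fit looks no different from a binary one. A minimal sketch on three hypothetical, well-separated clusters:

```python
from sklearn.svm import SVC

# Three classes of hypothetical 2-D points; SVC is binary at its core,
# and sklearn combines several binary SVMs under the hood to cover all three
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = [0, 0, 1, 1, 2, 2]

clf = SVC(kernel="linear")
clf.fit(X, y)

# Points near each cluster pick up that cluster's label
print(clf.predict([[0, 0.5], [5, 5.5], [10, 0.5]]))  # → [0 1 2]
```

Swapping `kernel="linear"` for the default RBF kernel lets the cut curve around data that a straight line cannot separate.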
Decision Trees
As the name implies, a series of rules is generated and used to enforce a decision tree schema for selecting a label for each data point. These schemas work efficiently; however, they do run into issues with input bias: some features may carry heavier weight in the system, influencing the decision more than they should.
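A minimal sketch with sklearn's `DecisionTreeClassifier` on hypothetical ball measurements (the feature values and labels are invented for illustration):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical rows: [diameter_cm, weight_g]; 0 = golf ball, 1 = tennis ball
X = [[4.3, 46], [4.2, 45],    # golf balls
     [6.7, 58], [6.6, 59]]    # tennis balls
y = [0, 0, 1, 1]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# The learned rules split on whichever feature best separates the classes
print(tree.predict([[4.25, 45]]))  # → [0]
```

On data this clean a single rule suffices; on messier data the tree keeps splitting, which is exactly where the bias and overfitting concerns above come in.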
Random Forest
Random forests take the logic from the decision tree schema, but extend it to many decision trees. To avoid bias, each tree is trained on a random subset of the features and samples, and the trees vote to decide which class to assign the data.
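The sklearn call is nearly identical to the single-tree case; here is a sketch on the same kind of hypothetical ball data, with `n_estimators` controlling how many trees vote:

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical rows: [diameter_cm, weight_g]; 0 = golf ball, 1 = tennis ball
X = [[4.3, 46], [4.2, 45], [4.4, 46], [4.1, 44],
     [6.7, 58], [6.6, 59], [6.8, 57], [6.5, 58]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Each tree sees a random bootstrap sample of the data, and the
# forest's prediction is the majority vote across all 10 trees
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X, y)
print(forest.predict([[6.6, 58]]))  # → [1]
```

More trees generally mean more stable votes at the cost of training time.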
Naive Bayes
Simply put, Naive Bayes approaches assume that each feature is independent of the others. For instance, a golf ball has features like a small diameter, dimples and a solid core. A Naive Bayes classifier estimates the probability that something is a golf ball by combining the evidence from each feature separately, as if the features didn’t influence one another. This simplifying assumption keeps the approach fast and flexible with the dataset.
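A minimal sketch with sklearn's `GaussianNB`, which models each (numeric) feature with its own independent per-class distribution; the measurements below are hypothetical:

```python
from sklearn.naive_bayes import GaussianNB

# Hypothetical rows: [diameter_cm, weight_g]; 0 = golf ball, 1 = tennis ball.
# GaussianNB treats each feature's contribution to the class as independent.
X = [[4.3, 46], [4.2, 45], [4.4, 46],
     [6.7, 58], [6.6, 59], [6.8, 57]]
y = [0, 0, 0, 1, 1, 1]

nb = GaussianNB().fit(X, y)
print(nb.predict([[4.25, 45.5]]))  # → [0]
```

sklearn also ships `MultinomialNB` and `BernoulliNB` variants for count-like and binary features respectively.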
With that being said, let’s get coding!
Collecting the MNIST Dataset
There are a few places you can download the MNIST dataset. Kaggle is a machine-learning platform that also hosts a dedicated challenge with a reduced MNIST dataset, which can be easier to work with. For this post, we’ll go over a few solutions to this challenge with sklearn.
Implementation of Algorithms
SKLearn will provide the backbone for our project. SKLearn offers quite a few built-in and fully optimized models we can utilize in our projects. We can implement our models using the following code:
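As a starting sketch, the snippet below loads sklearn's built-in 8x8 digits dataset as a lightweight stand-in for the full 28x28 MNIST images, splits it into training and testing sets, and scores each of the models discussed above. The `test_size` and `random_state` values are arbitrary choices for reproducibility.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB

# sklearn ships a small 8x8 digits dataset; the full-size MNIST can be
# pulled with fetch_openml("mnist_784") but is much slower to download
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

models = {
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")
```

Running this gives a quick side-by-side accuracy comparison; the kNN and SVM models tend to come out on top on this dataset.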
kNN
The k-Nearest Neighbor algorithm works quite well for classification tasks. It is a supervised algorithm, meaning that it requires labels for the data. The algorithm assigns each new point to a class or category by consulting the labeled points nearest to it, so the category is representative of the data most similar to that point.
As we tune parameters like the number of neighbors, we can obtain better classification accuracy, as in this solution.
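A sketch of that tuning on the built-in digits dataset (again used here as a stand-in for MNIST): try a few values of k and compare held-out accuracy.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# Compare a few values of k on the held-out test set
for k in (1, 3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(k, round(knn.score(X_test, y_test), 3))
```

For a more rigorous search, sklearn's `GridSearchCV` automates the same loop with cross-validation.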
SVM
The Support Vector Machine is a classic algorithm that works to divide a given dataset into two partitions. The margin between the dividing line and the nearest points (the support vectors) is maximized, indicating a larger divide between the groups of data.
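A short sketch of an SVM on the same built-in digits dataset; the default RBF-kernel `SVC` already performs well here without tuning.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# Default SVC: RBF kernel, with multiclass handling built in
svm = SVC().fit(X_train, y_train)
print(round(svm.score(X_test, y_test), 3))
```

The `C` and `gamma` parameters trade margin width against training-set fit and are the usual first things to tune.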
See this solution to the MNIST challenge.
Experimentation
There are several machine learning and deep learning solutions to the MNIST challenge. While almost any learning algorithm can solve this problem, the best solutions are often the simplest, since they require less computation time to produce an answer. However, you must ensure your system reaches an acceptable accuracy.
Learning new algorithms and neural network architectures will add tools to your toolbox for solving fun challenges like the MNIST dataset, so be sure to try out new algorithms. The SKLearn module offers several models that are readily available and rapidly deployable, which makes experimenting with models a lot easier.
Conclusion
Python offers a great foundation for programming ML models using SKLearn. There are many different techniques you can learn to optimize this workflow, such as gathering more input data, parsing out noisy inputs and even engineering better features for your model. All in all, ML tasks require a solid foundation in the theory as well as the application.
Thank you for checking out our post! If you liked this article, be sure to share it. Let us know your thoughts on the post and we hope to see you again soon.

Founder and CEO of R U Coding Me LLC. Jacob obtained his Bachelor’s of Computer Science at the University of Central Florida. He likes to go to the gym and teach people about technology.