Machine Learning 101: All the Basics You Need to Know

Introduction

In this blog post, we will discuss machine learning. First, we will answer a simple question: what is machine learning? Then we will describe the elements of machine learning, along with some common terminology. After that, we will walk through the machine learning process. Last, we will look at some machine learning applications. This is a truly exciting field, and everyone in business should have at least an elementary understanding of it. Let’s get started!

What is Machine Learning?

Simply put, machine learning is a field within artificial intelligence. Some may be surprised to hear this, but artificial intelligence is not just one thing. Rather, it is a general term for a combination of different approaches and tools that are meant to mimic human intelligence. Machine learning can be thought of as the field that represents how humans learn from experience. Arguably, machine learning can be thought of as one part of the brain of an artificial intelligence system.


Humans learn by acquiring knowledge from their environment or by being told things by other people. The brain retains this knowledge and uses it in future decision making and interaction with the world. Machine learning works in a similar way. It takes data (experience) and relates it to an outcome. It then learns from this data and is able to make predictions (decisions) on new data that is presented. This is different from traditional programming, because no explicit instructions are provided to the model for its decision-making process. Rather, the algorithm learns the underlying pattern within the data and stores it as a model for future use. Think of it as reading a book and then taking a test on the topics within the book: you are applying what you learned from the book to answer questions you have not seen before.


The theoretical idea behind machine learning is that there is a single mathematical function that best represents the pattern within the data. The goal of the data scientist is to approximate this underlying function by searching a large space of candidate functions. The linear regression formula from our blog post De-mystifying Simple Linear Regression is an example of such a function. This search is necessary because you will never know the true underlying function of the problem being solved, but it can be approximated. There is a lot more theory behind machine learning, but that is for another post.
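To make the idea of searching a space of functions concrete, here is a minimal sketch in Python. The data points and the three candidate functions are made up for illustration, and a real search covers far more candidates than three. The sketch scores each candidate by its average squared error against the observed data and keeps the one with the lowest score.

```python
# Hypothetical observations: y is roughly 2*x + 1 with a little noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

# A tiny, hand-picked "space" of candidate functions (illustrative only).
candidates = {
    "y = x": lambda x: x,
    "y = 2x + 1": lambda x: 2 * x + 1,
    "y = x^2": lambda x: x ** 2,
}


def mean_squared_error(f, xs, ys):
    """Average squared gap between the function's output and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Score every candidate and keep the one that fits the data best.
errors = {name: mean_squared_error(f, xs, ys) for name, f in candidates.items()}
best_name = min(errors, key=errors.get)
print(best_name, errors[best_name])
```

In practice the candidates are not listed by hand. A learning algorithm adjusts the parameters of a function family (for example, the slope and intercept of a line) until the error stops improving.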


To sum up, machine learning is the process of teaching a computer to learn a pattern from a given dataset and to use that experience to make predictions (decisions) when given new data.

Elements of Machine Learning

Machine learning is often referred to by other terms that mean very nearly, if not exactly, the same thing. Predictive analytics, data mining, statistical learning and deep learning are either near-synonyms for machine learning or types of it. Now, let’s break down machine learning into a few pieces.


There are two main types of machine learning, supervised and unsupervised learning, plus a hybrid of the two called semi-supervised learning.

Supervised Learning

First, supervised learning means we have a data set with input data and a known output that we are trying to model. The output, or dependent variable, is typically referred to as Y. We only know the output for our observed data set; if we knew Y for every possible input, we would not need machine learning. The model learns which patterns in the data produce the Y output and builds a function to approximate them. Unseen data can then be passed in and a prediction returned.


With supervised learning, we are trying to do one of two things: predict a continuous output (a number) or predict a class (a category). These are known as regression and classification, respectively. The terms refer to the type of problem we are trying to solve, which depends on the type of the output Y.

Regression

This term is often confusing because it encompasses more models than just linear regression, which most of us are familiar with. Basically, regression means we are trying to predict a number as the output or Y variable. This could be a price, temperature, weight or anything else measured on a numerical scale. Linear regression is not the only model available for a regression problem, but many regression models build on linear regression, or rather on the assumption of linearity. Again, this is outside the scope of this post. The ultimate takeaway is that when trying to predict a number, a regression-based model is used.
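As a small illustration of a regression problem, the sketch below fits a simple linear regression with the standard least-squares formulas and then predicts a number for an unseen input. The house sizes and prices are invented for the example, and the function name is our own.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size (square meters) and price (thousands).
sizes = [50, 70, 90, 110]
prices = [170, 230, 290, 350]

slope, intercept = fit_simple_linear_regression(sizes, prices)

# Predict the price (Y) for a size the model has not seen.
predicted_price = slope * 100 + intercept
print(slope, intercept, predicted_price)
```

The output here is a number on a continuous scale, which is what makes this a regression problem rather than a classification problem.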

Classification

This is often referred to as pattern recognition. It means that we have at least two classes (often more) as our output, and we are trying to find the pattern that lets us probabilistically assign the label (Y) to the data. When we say class or label, think of groups or types of things. For example, a type of snake (venomous or non-venomous), the outcome of a marketing campaign (purchased our product or did not purchase our product), or groups of temperatures (frozen, cold, neutral, hot, boiling). These categories can often be arbitrarily assigned or can even represent human emotion or feeling. When classification is the problem type, both the type of model used and the form of its output differ from those in regression. The ultimate takeaway is that when trying to predict groups or classes of data, including words, a classification-based model is used.

To recap, supervised learning means we use the outcome or Y variable to train the model. We can have either a regression or a classification type of problem to solve, and each of those two problem types has its own set of models that can be used.
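To show what predicting a class looks like, here is a minimal sketch of one simple classification method, k-nearest neighbors. It is only one of many possible classifiers, and the customer data (site visits and minutes spent) is hypothetical. The model assigns a probability to each label based on the labels of the closest training examples, then picks the most likely one.

```python
from collections import Counter

# Hypothetical labeled data: (site visits, minutes spent) -> campaign outcome.
training = [
    ((1, 2), "did not purchase"),
    ((2, 1), "did not purchase"),
    ((2, 3), "did not purchase"),
    ((8, 9), "purchased"),
    ((9, 7), "purchased"),
    ((7, 8), "purchased"),
]


def class_probabilities(point, training, k=3):
    """Share of each label among the k training examples closest to point."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(training, key=lambda ex: squared_distance(ex[0], point))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: count / k for label, count in counts.items()}


def classify(point, training, k=3):
    """Return the label with the highest estimated probability."""
    probabilities = class_probabilities(point, training, k)
    return max(probabilities, key=probabilities.get)


print(classify((8, 8), training))
print(classify((1, 1), training))
```

Unlike the regression case, the output is a label rather than a number, and the intermediate probabilities indicate how confident the model is in that label.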

Unsupervised Learning

This is the opposite of supervised learning: in this case, we do not have a Y variable or output to use for training the model. This means that a whole different set of models must be used on the dataset. Since the output is not known, the model finds the best way to group the data and builds the labels itself. Unsupervised learning is often used to generate classes or labels for a dataset that is then used for supervised learning. When it is not feeding a supervised model, it can be used to understand what types of categories exist in a dataset. It is also often used in computer vision to build simpler versions of images by grouping colors. The ultimate takeaway is that unsupervised learning lets the computer determine the best way to group observations in a dataset.
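The grouping idea can be sketched with k-means, one common unsupervised method (the post does not single out a specific algorithm, so this choice is ours). The example groups hypothetical grayscale pixel intensities into three shades, in the spirit of simplifying an image by grouping colors. The starting centers are an assumption we picked by hand.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the given starting centers (plain k-means in 1-D)."""
    groups = []
    for _ in range(iterations):
        # Assignment step: attach each value to its closest center.
        groups = [[] for _ in centers]
        for value in values:
            closest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[closest].append(value)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical grayscale intensities (0 = black, 255 = white). No labels given.
pixels = [10, 12, 15, 200, 210, 205, 100, 98, 105]

centers, groups = k_means_1d(pixels, centers=[0, 128, 255])
print(centers)
print(groups)
```

Replacing every pixel with the center of its group would yield an image with only three shades. The groups themselves could also serve as labels for a later supervised model.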

Semi-Supervised Learning

Semi-supervised learning falls somewhere between supervised and unsupervised learning. This means you will have labels for some observations but not for all. Building an entire labeled training set for a supervised learning algorithm can be difficult or costly, so people will often hand label a small portion of the data, train a supervised model on it, and use that model to label the rest of the data before the final training. Often, only the most confident predictions are kept and the uncertain examples are discarded. This provides a larger training set without hand classifying every observation. It should be noted that this builds more uncertainty into the model than hand labeling the whole dataset would.
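Here is a rough sketch of that labeling step, sometimes called pseudo-labeling. Everything specific in it is an assumption for illustration: the temperature readings, the nearest-centroid model used to label them, and the margin used as a stand-in for confidence.

```python
def centroids(labeled):
    """Mean feature value for each label in the hand-labeled data."""
    sums = {}
    for x, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def pseudo_label(labeled, unlabeled, min_margin):
    """Label each value by its nearest centroid and keep only confident ones.

    Confidence is measured as how much closer the value is to the nearest
    centroid than to the second nearest (an illustrative choice).
    Requires at least two distinct labels in the labeled data.
    """
    cents = centroids(labeled)
    confident = []
    for x in unlabeled:
        ranked = sorted(cents, key=lambda label: abs(x - cents[label]))
        margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
        if margin >= min_margin:
            confident.append((x, ranked[0]))
    return confident


# A small hand-labeled set and a larger unlabeled one (hypothetical readings).
labeled = [(1.0, "cold"), (2.0, "cold"), (9.0, "hot"), (10.0, "hot")]
unlabeled = [0.5, 1.5, 5.4, 9.5, 11.0]

new_examples = pseudo_label(labeled, unlabeled, min_margin=3.0)
training_set = labeled + new_examples
print(new_examples)
```

The reading of 5.4 sits almost exactly between the two groups, so it is discarded rather than given a label the model is unsure about. The enlarged training set is then used to train the final model.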

The Machine Learning Process

The machine learning process can be visualized as follows:

[Figure: Machine Learning Process]

First, the problem is defined. What are we trying to accomplish with machine learning? For example, can we predict which customers will churn from our subscription service based on prior behavior? The problem must be specific enough that we can gather data for it and solve it.

Then we gather training data. Which variables are, or could be, important, and what data exists? From there, do we know the output or not? If we do not know the output, do we want to hand classify the data or use unsupervised learning? If we use unsupervised learning, we run a variety of unsupervised models and review the results. We then decide whether to use the resulting groups to train a supervised model or whether we are done!

For supervised models, we select models to train based on our problem and try to find the one that best approximates our output. We test to see which model achieves the best tradeoff between bias and variance (a model that is too simple to capture the pattern versus one so complex that it fits the noise). The best model is the one that generalizes best to unseen data, not the one that fits the training data best. Once we have our model, we predict on new data! This is an oversimplification, but it provides a high-level overview of the machine learning process, which is key to understand.
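The point about generalization can be sketched by holding some data back and comparing models on it. In the example below, all numbers are invented, and the two models are our own picks: a straight line fitted by least squares, and a deliberately over-flexible "memorizer" that repeats the output of the closest training example.

```python
def fit_line(xs, ys):
    """Least-squares straight line, returned as a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Over-flexible model: repeat the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model's predictions."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data: roughly y = 2x with noise of +1 or -1.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [3, 3, 7, 7, 11, 11]
# Hypothetical held-out data the models never saw, close to the same trend.
test_x = [1.4, 2.6, 3.4, 4.6, 5.4]
test_y = [2.8, 5.2, 6.8, 9.2, 10.8]

models = {
    "line": fit_line(train_x, train_y),
    "memorizer": fit_memorizer(train_x, train_y),
}
train_error = {name: mse(m, train_x, train_y) for name, m in models.items()}
test_error = {name: mse(m, test_x, test_y) for name, m in models.items()}

# Select on held-out error, not training error.
best = min(test_error, key=test_error.get)
print(train_error, test_error, best)
```

The memorizer reproduces the training data perfectly, noise included, yet does worse than the line on the held-out data. That is why the model is chosen by its error on unseen data.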

What Can You do With Machine Learning?

There are so many applications of machine learning that they cannot all be listed. Here are a few:

Marketing Applications

Machine learning fits nicely into the marketing research process and can add sophistication and accuracy to a marketing research problem. Applications include predicting customer churn, estimating the lifetime revenue of a customer and predictive, targeted marketing. Machine learning makes it possible to treat consumers as individuals and market to each one accordingly. Recommendation systems are a popular example of machine learning in marketing. Think about the recommendations Netflix provides based on your viewing history.

Finance Fraud / IT Security

Banks and the finance industry use a lot of machine learning to catch fraudulent transactions on banking and credit card accounts. This is why your credit card is sometimes frozen and you get a call from your bank to validate charges. Information technology departments are in a similar situation: they use machine learning to identify a hacker’s presence in their network or, again, to identify fraudulent transactions. Banks also use machine learning to vet loan applications based on customer demographics. This can often be controversial.

Supply Chain / Manufacturing Applications

A lot of data flows through supply chains and manufacturing plants. In the supply chain, freight arrival dates can be predicted using weather patterns, carrier patterns and port capacity. Forecasting can also be enhanced using more advanced predictive models. In the manufacturing plant, quality defects can be predicted based on machine performance data. Machine breakdowns can also be predicted, so that machines are scheduled for maintenance when they need it and unplanned downtime is avoided.

Process Automation

This is a generic catch-all, but many processes that are repetitive or contain routine human decision points can be automated with machine learning. Machine learning is taking over a lot of manual work in marketing, accounting and project management functions. This is because a trained model can replicate a human’s decisions on such tasks and, because it applies the same learned pattern every time, often makes them more consistently than a human can.

Conclusion

Machine learning is an extremely powerful tool and is transforming the way business is done. When people refer to artificial intelligence, they usually mean machine learning. There is a lot of advanced math behind machine learning, and it takes time to understand. However, it is important that everyone has a basic understanding of machine learning, especially as it grows in popularity. That way the technology is not feared but embraced, and used to create more accurate and efficient operations. We will be expanding our posts on machine learning and related concepts as we continue on our mission to change the way companies operate and help them become data-driven. Stay tuned!