Questions about Machine Learning
Before diving into Python details, let's consider some important questions: "What is Machine Learning?" and "Why is it important?"
If you haven't read the first in this series of articles on building enterprise Machine Learning applications with WebFOCUS and Python, you can find it here: Part I.
Traditional Software Programming
Throughout most of our history of data processing, application developers have given computing machines explicit instructions on what to do and how to do it. These computers followed step-by-step commands without questioning the developers, a blind obedience that often resulted in "bugs" due to human error. For decades, we have "programmed" classic software applications by typing out commands on a keyboard.
Today is a turning point. With more powerful computers, a developer can now provide a machine with pre-defined algorithms and training data (where the answers to the questions are already given). The machine can test the different algorithms to see which one gives the most correct answers. Using the winner as its model, the machine can then move on to new data that doesn't come with answers and make predictions.
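To make that idea concrete, here is a minimal Python sketch of "try several algorithms, keep the one with the most correct answers, then predict." Everything in it is an assumption made for illustration: the tiny data set is made up, and the three hand-written rules stand in for real learning algorithms. In practice you would also score the candidates on data held back from training rather than on the training data itself.

```python
# Labeled training data: (hours_studied, passed_exam).
# These values are invented for illustration only.
training_data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1)]


def always_pass(hours):
    """Candidate 1: predict a pass no matter what."""
    return 1


def threshold_rule(hours):
    """Candidate 2: predict a pass at four or more hours."""
    return 1 if hours >= 4 else 0


def odd_even_rule(hours):
    """Candidate 3: a deliberately poor rule, pass on odd hours."""
    return hours % 2


candidates = {
    "always_pass": always_pass,
    "threshold_rule": threshold_rule,
    "odd_even_rule": odd_even_rule,
}


def count_correct(model, data):
    """Count how many labeled examples the model answers correctly."""
    return sum(1 for value, label in data if model(value) == label)


# Score every candidate against the known answers and keep the best one.
scores = {name: count_correct(model, training_data) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]

# Apply the winning model to new data that has no answers attached.
new_hours = [2, 9]
predictions = [best_model(hours) for hours in new_hours]
print(best_name, scores, predictions)
```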
Supervised and Unsupervised Machine Learning
A classic example comes from a 1936 paper describing fifty observations of each of three species of Iris flowers. A gentleman named Edgar Anderson spent some time measuring and documenting the lengths and widths of both the petals and sepals of 150 different Iris flowers evenly grouped into three different species. He labeled each observation as an Iris setosa, Iris virginica, or Iris versicolor.
Frank Mayfield photo: Iris virginica shrevei (blue flag)
After analyzing Anderson's Iris details, a statistician named Ronald Fisher wrote a paper on how we could distinguish these three Iris species based on the relationships between petal and sepal measurements. In other words, you could classify the flower without being an expert who could make a visual identification. In 1936, Fisher used a pencil and paper to work out this classification, but today we can use Python or R for what would now be considered "supervised machine learning."
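As a rough illustration of classifying a flower from its measurements, the sketch below uses a nearest-centroid classifier. This is a simpler technique swapped in for readability, not the method from Fisher's paper. The numbers are illustrative values made up for this example, not Anderson's measurements, and the function names are hypothetical.

```python
import math

# (petal_length_cm, petal_width_cm) paired with a species label.
# Illustrative values only, NOT Anderson's actual measurements.
labeled_flowers = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.4), "versicolor"),
    ((4.1, 1.3), "versicolor"),
    ((5.8, 2.2), "virginica"),
    ((6.0, 2.0), "virginica"),
]


def train_centroids(samples):
    """Average the measurements of each species into one 'typical' point."""
    totals = {}
    for features, species in samples:
        running, count = totals.get(species, ([0.0] * len(features), 0))
        totals[species] = ([r + f for r, f in zip(running, features)], count + 1)
    return {
        species: tuple(r / count for r in running)
        for species, (running, count) in totals.items()
    }


def classify(features, centroids):
    """Pick the species whose typical point is closest to the new flower."""
    return min(centroids, key=lambda species: math.dist(features, centroids[species]))


centroids = train_centroids(labeled_flowers)
print(classify((1.6, 0.25), centroids))
print(classify((4.4, 1.4), centroids))
print(classify((5.9, 2.1), centroids))
```

The labels supplied with the training rows are what make this "supervised": the machine is told the right answer for each example before it is asked to classify a new flower.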
We have our choice of a variety of techniques for supervised machine learning, including classification (predict a category), regression (predict a real value), decision trees (which can be used for classification and regression), and dimension reduction (figuring out the important parameters and ignoring the others, which is often done without labels but has supervised variants as well). You see these types of machine learning in a range of industry uses such as credit ratings, spam detection, predicted values, sales prospecting, employee churn, and so forth.
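To show what "predict a real value" means for regression, here is a small sketch that fits a straight line by ordinary least squares and uses it to predict a new value. The advertising and sales figures are made up for illustration, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training data: ad spend and resulting sales, both in thousands.
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(ad_spend, sales)

# Predict sales for a spend level the model has not seen.
predicted_sales = slope * 5 + intercept
print(slope, intercept, predicted_sales)
```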
A well-known example of supervised machine learning comes from Target Stores leveraging their baby registries to build a model of the buying habits of pregnant women. After building a predictive model from that labeled training data, Target software could search through credit card purchases to predict which Redcard holders might be pregnant and proactively reach out to them with recommendations for diapers, baby clothing, formula, and so forth. The shocker here is that Target could send coupons to a woman before she had announced the pregnancy to her family.
Our next step in the evolution of machine learning would naturally go from humans "supervising" machines and providing them with training material to letting them go unsupervised with unlabeled data. Let's see what machines can do without our assistance. The machine is given freedom to look for hidden structures, patterns, and relationships within the data. We humans don't really know what the outcome will be but have our fingers crossed it will be useful (perhaps profitable, beneficial to society, or at least insightful).
In unsupervised machine learning, there are no labeled answers to check against, so the computer cannot gauge the accuracy of its results on its own. It still takes a human to evaluate them. Some solutions do not fall cleanly into either camp and belong to a category of "semi-supervised" machine learning, where some but not all data labels are provided.
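One common way for a machine to look for hidden structure in unlabeled data is clustering. The sketch below is a bare-bones, one-dimensional version of k-means, chosen here as an example technique. The purchase amounts and the starting centers are made up, and `k_means_1d` is a hypothetical name.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            # Assign the value to whichever center is currently closest.
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Recompute each center; keep the old one if its cluster is empty.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Unlabeled data: invented purchase amounts with no categories attached.
purchases = [12, 15, 14, 95, 102, 98]

centers, clusters = k_means_1d(purchases, centers=[0.0, 50.0])
print(centers)
print(clusters)
```

The machine finds two groups, but it has no idea whether they mean anything. Deciding that they represent, say, small and large shoppers is still the human's job.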
Reinforcement and Deep Learning
A further advancement in machine learning would be "reinforcement" learning. This borrows some theories from psychology and gets machines to work toward a "reward," providing incentives to maximize it. To do this, the machine may need to break down the problem into many sub-problems, store each solution, and later reuse those answers as part of larger problems (considered "dynamic programming" techniques or "dynamic optimization"). Reinforcement learning is usually treated as its own category rather than as supervised or unsupervised learning: the machine is not handed labeled right and wrong answers but instead learns from the rewards its actions earn, balancing exploring new information against exploiting what it already knows.
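The "store each solution and reuse it" idea can be sketched with value iteration, a classic dynamic programming technique. The example assumes a made-up world: a corridor of five cells where stepping into the last cell earns a reward of 1. It also assumes the machine already knows the rules of the corridor. A real reinforcement learner would usually have to discover them by exploring, which this sketch leaves out.

```python
N_STATES = 5        # cells 0..4 in a hypothetical corridor
GOAL = 4            # entering this cell pays the reward and ends the episode
GAMMA = 0.9         # discount: a reward one step later is worth 90% as much
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Return (next_state, reward) for taking an action in a cell."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly update each cell's value from the stored values of its neighbors."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        new_values = values[:]
        for state in range(N_STATES):
            if state == GOAL:
                continue  # nothing more to earn once the goal is reached
            outcomes = (step(state, action) for action in ACTIONS)
            new_values[state] = max(
                reward + GAMMA * values[next_state] for next_state, reward in outcomes
            )
        values = new_values
    return values


def greedy_policy(values):
    """In each cell, choose the action with the best reward plus stored future value."""
    policy = {}
    for state in range(N_STATES):
        if state == GOAL:
            continue

        def worth(action, state=state):
            next_state, reward = step(state, action)
            return reward + GAMMA * values[next_state]

        policy[state] = max(ACTIONS, key=worth)
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

Each cell's value is a stored answer to the sub-problem "how much reward can I still collect from here?", and the larger problem of choosing a route is solved by reusing those stored answers.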
Since we are trying to make the machine behave like a human motivated by rewards, we might as well just go all-in and try to simulate the human brain. Here we are in the realm of "deep" or "hierarchical" learning, where machines work less with task-specific algorithms and rely more on layers of artificial "neurons" that interpret and pass along information, loosely like the neurons in a brain. Use cases of deep learning are in the news: machine vision (e.g., face recognition, classification of animals, and self-driving cars not running into people or things), speech recognition (think of your favorite voice-activated personal agent: Siri, Cortana, Alexa, Google, etc.), and natural language processing (e.g., chatbots).
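To give a feel for information passing through layers of artificial neurons, here is a toy network with two hidden neurons feeding one output neuron. It computes "exclusive or," something a single neuron of this kind cannot do alone. The weights are set by hand purely for illustration. Actual deep learning discovers its weights from data and uses far larger networks, so treat this as a sketch of the forward pass only.

```python
def step_activation(total):
    """Fire (1) when the weighted input is positive, otherwise stay quiet (0)."""
    return 1 if total > 0 else 0


def neuron(inputs, weights, bias):
    """Weigh each input, add a bias, and pass the total through the activation."""
    return step_activation(sum(i * w for i, w in zip(inputs, weights)) + bias)


def tiny_network(x1, x2):
    """Two-layer network with hand-picked (not learned) weights."""
    # Hidden layer: one neuron behaves like OR, the other like AND.
    hidden_or = neuron((x1, x2), (1.0, 1.0), -0.5)
    hidden_and = neuron((x1, x2), (1.0, 1.0), -1.5)
    # Output layer: fires when OR is on but AND is off.
    return neuron((hidden_or, hidden_and), (1.0, -1.0), -0.5)


outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```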
Many experts believe that those individuals or companies who tackle machine learning first will go on to become extremely rich, famous, and perhaps powerful.
You can also find many people giving dire warnings about the progress being made toward machines that learn like humans. Learning is one thing; for the most part, machines are not yet acting on their own decisions, but this could change in the near future. As we become comfortable with machines making decisions for us and acting on our behalf (e.g., paying your bills, choosing the best flight and hotel for your trip, ordering food for your fridge, making stock trades in your name, or modifying your retirement portfolio), we are sure to give them more freedom. With some imagination, we have to wonder: what if machines determine that the best solution to a problem is to eliminate us humans, and then follow up with an action plan?