Graphics from Clay Banks – https://unsplash.com/@claybanks

Introduction to Supervised Learning

Supervised Learning is one of the four subfields of Machine Learning. This post provides an introduction to the topic.

Henrik Bartsch

Machine Learning has generated quite a bit of buzz in recent years. It is seen as having the potential to help individuals and companies improve procedures or predictions, even beyond the level that humans can achieve. The subfield most commonly used for this purpose is Supervised Learning, a basic tool of Machine Learning. This post describes the main features of Supervised Learning and gives a rough overview of its issues and application areas.

Introduction to Supervised Learning

Supervised Learning (also known as Supervised Machine Learning) is about building models that learn a precise mapping between known input information and known output information. Given this input information, the model is adjusted iteratively so that it performs better on the data. This process is repeated until certain conditions or metrics are met. 1 2
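The iterative adjustment described above can be sketched as a short training loop. This is only a minimal illustration under stated assumptions, not the method of any particular library: it assumes a one-parameter model `y = w * x`, the mean squared error as the metric, plain gradient descent as the update rule, and made-up data. The names `train`, `learning_rate`, `tolerance`, and `max_steps` are hypothetical.

```python
def train(inputs, targets, learning_rate=0.01, tolerance=1e-6, max_steps=10_000):
    """Fit y = w * x by gradient descent until the error is small enough."""
    w = 0.0  # initial guess for the single model parameter (assumption)
    n = len(inputs)
    for step in range(max_steps):
        # Mean squared error between predictions and known outputs.
        error = sum((w * x - y) ** 2 for x, y in zip(inputs, targets)) / n
        if error < tolerance:  # stopping condition: the metric is met
            break
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(inputs, targets)) / n
        w -= learning_rate * gradient  # small adjustment toward lower error
    return w, error, step


# Made-up example data that follows y = 3 * x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
w, error, steps = train(xs, ys)
print(f"learned w = {w:.3f} after {steps} steps (error {error:.2e})")
```

The loop stops either when the error falls below `tolerance` or when `max_steps` is reached, which mirrors the "conditions or metrics" mentioned above.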

How Supervised Learning works

In supervised learning, we use a labeled data set. This is generally divided into two parts: a training dataset and a test dataset. The training dataset is used to update the weights or parameters of the model, and the test dataset is used to check how well the model generalizes and to assess its real-world applicability. The training process involves minimizing a loss function that measures the deviation between the actual and predicted output information. 3
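As a rough sketch of this workflow, the following example splits a synthetic data set into a training and a test part, fits a model on the training part only, and then evaluates a loss function on both parts. Everything here is an assumption made for illustration: the 80/20 split, the model `y = w * x`, the mean squared error as loss function, and the helper names `split_dataset`, `mean_squared_error`, and `fit_slope`.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(w, samples):
    """Loss: average squared deviation between actual and predicted output."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def fit_slope(samples):
    """Least-squares slope for y = w * x (minimizes the loss in closed form)."""
    return sum(x * y for x, y in samples) / sum(x * x for x, _ in samples)


# Synthetic data: y is roughly 2 * x plus a little noise.
rng = random.Random(42)
data = [(float(x), 2.0 * x + rng.gauss(0.0, 0.5)) for x in range(1, 51)]

train_set, test_set = split_dataset(data)
w = fit_slope(train_set)  # parameters come from the training data only
train_loss = mean_squared_error(w, train_set)
test_loss = mean_squared_error(w, test_set)  # checks generalization

print(f"w = {w:.3f}, train loss = {train_loss:.3f}, test loss = {test_loss:.3f}")
```

A test loss that is much larger than the training loss would suggest that the model does not generalize well to data it has not seen.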

Supervised learning handles essentially two different types of problems:

  1. Classification: Input information is used to assign a sample to a certain class. The model tries to find correlations or decision rules that best characterize each class. Examples of such algorithms are Linear Classifiers, Support Vector Machines, Decision Trees, k-Nearest Neighbor, and Random Forest.

  2. Regression: Regression problems attempt to explain the relationship between dependent and independent variables. A regression model is designed to predict how certain values will evolve in specific situations or at specific points in time. Examples of such algorithms are Linear Regression and Polynomial Regression. Logistic Regression, despite its name, is typically used for classification.

In principle, however, both problem types can also be handled by artificial neural networks. This is particularly useful for complex regression problems, where the algorithms mentioned above can be limited in their accuracy.

Advantages of Supervised Learning

There are a number of arguments which speak for the use of supervised learning: 4

  1. prior experience, in the form of the data set, feeds into the learning process,
  2. it is excellent for predictions,
  3. it can generate recommendations,
  4. the learning process is easy to implement.

Disadvantages of Supervised Learning

Training supervised learning algorithms comes with a number of prerequisites and drawbacks: 3 5

  1. models require a certain level of expertise to produce meaningful results,
  2. training can be time intensive,
  3. datasets can have a higher likelihood of human error (for example, mislabeled samples), which can result in erroneous learning behavior,
  4. supervised learning methods cannot classify or cluster data on their own, that is, without labeled training data,
  5. data in the dataset should be as heterogeneous as possible and have as much variance as possible to provide good results,
  6. data preparation for appropriate algorithms can be complex.

Applications of Supervised Learning

In supervised learning, there are basically two different application areas in which the algorithms are frequently applied.

Classification

The purpose of classification is to predict, from input data, which group a sample belongs to. This group is usually referred to as a class. 6 7

An example is sorting images into pictures of dogs and pictures of cats.
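To make the idea concrete, here is a minimal sketch of a k-Nearest Neighbor classifier, one of the algorithms named earlier. It is heavily simplified: instead of images it assumes two made-up numeric features per animal (weight and ear length), whereas a real image classifier would work on pixel data or features derived from it. The function name `knn_predict` and all data values are hypothetical.

```python
from collections import Counter
import math


def knn_predict(training_data, query, k=3):
    """Return the majority class among the k training points nearest to query."""
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up features per animal: (weight in kg, ear length in cm).
labeled_animals = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((12.0, 10.0), "dog"),
]

print(knn_predict(labeled_animals, (4.5, 6.8)))    # small animal, short ears
print(knn_predict(labeled_animals, (25.0, 12.0)))  # large animal, long ears
```

The known class labels in `labeled_animals` are what makes this supervised: the prediction for a new sample is derived entirely from examples whose class is already known.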

Regression

Regression is about using input data to predict one or more continuous output variables. 6

Examples are price predictions for stocks or forecasts of how various quantities develop over time, based on their historical development.
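A minimal sketch of such a prediction, assuming Linear Regression fitted by ordinary least squares and a handful of made-up daily prices, could look as follows. The function name `fit_line` and the data are hypothetical, and real stock prices are generally far harder to predict than a straight line suggests.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_mean, y_mean = mean(xs), mean(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up closing prices for five consecutive days.
days = [1, 2, 3, 4, 5]
prices = [10.0, 10.5, 11.1, 11.4, 12.0]

slope, intercept = fit_line(days, prices)
predicted_day_6 = slope * 6 + intercept  # extrapolate one day ahead
print(f"predicted price on day 6: {predicted_day_6:.2f}")
```

In contrast to classification, the output here is a number on a continuous scale rather than one of a fixed set of classes.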

Application examples

Supervised Learning can be applied to a variety of problems: 3 5 8

  1. classification of emails as spam or non-spam,
  2. prediction of sales figures (predictive analytics),
  3. image and object recognition,
  4. prediction of emotional states (sentiment analysis),
  5. speech recognition.

In general, supervised learning is used to reduce the workload of repetitive tasks. Typically, this involves the classification of large amounts of data or assistance with complex tasks for employees. Human expertise is required to generate appropriate training data sets and to check trained models for correctness.

Sources

Footnotes

  1. aracom.de

  2. edureka.co

  3. ibm.com

  4. datasolut.com

  5. wikipedia.org

  6. builtin.com

  7. towardsdatascience.com

  8. openai.com