Lesson 5 of 13By Simplilearn

Last updated on Jul 23, 202067368#### Machine Learning Tutorial

Overview#### What is Machine Learning and How Does It Work?

Lesson - 1#### Real-World Machine Learning Applications That Will Blow Your Mind

Lesson - 2#### Supervised and Unsupervised Learning

Lesson - 3#### Linear Regression in Python

Lesson - 4#### An Introduction to Logistic Regression in Python

Lesson - 5#### Random Forest Algorithm

Lesson - 6#### Understanding Naive Bayes Classifier

Lesson - 7#### K-Means Clustering Algorithm: Applications, Types, Demos and Use Cases

Lesson - 8#### The Best Guide On How To Implement Decision Tree In Python

Lesson - 9#### How to Become a Machine Learning Engineer?

Lesson - 10#### Embarking on a Machine Learning Career? Here’s All You Need to Know

Lesson - 11#### Top 34 Machine Learning Interview Questions & Answers for 2020

Lesson - 12#### KNN in Python: Learn How to Leverage KNN Algorithms

Lesson - 13

Machine learning has revolutionized the world of business and is helping us build sophisticated applications to solve tough business problems. Using supervised and unsupervised machine learning models, you can solve problems using classification, regression, and clustering algorithms. In this article, we’ll discuss a supervised machine learning algorithm known as logistic regression in Python. Logistic regression can be used to solve both classification and regression problems.

The following are topics that this article will cover:

- Introduction to supervised learning
- What is logistic regression?
- Advantages of the logistic regression algorithm
- How does the logistic regression algorithm work?
- Comparing logistic regression and linear regression
- Applications of a logistic regression classifier
- Use case: Predict the digits in images using a logistic regression classifier in Python

Supervised machine learning algorithms derive insights, patterns, and relationships from a labeled training dataset. It means the dataset already contains a known value for the target variable for each record. It is called supervised learning because the process of an algorithm learning from the training dataset is like an instructor supervising the learning process. You know the correct answers, the algorithm iteratively makes predictions on the training data and the instructor corrects it. Learning ends when the algorithm achieves the desired level of performance and accuracy.

Supervised learning problems can be further classified into regression and classification problems.

**Classification**: In a classification problem, the output variable is a category, such as “red” or “blue,” “disease” or “no disease,” “true” or “false,” etc.**Regression**: In a regression problem, the output variable is a real continuous value, such as “dollars” or “weight.”

The following is an example of a supervised learning method where we have labeled data to identify dogs and cats. The algorithm learns from this data and trains a model to predict the new input.

Now that we learned the basics of supervised learning, let's have a look at a popular supervised machine learning algorithm: logistic regression.

Logistic regression is a statistical method that is used for building machine learning models where the dependent variable is dichotomous: i.e. binary. Logistic regression is used to describe data and the relationship between one dependent variable and one or more independent variables. The independent variables can be nominal, ordinal, or of interval type.

The name “logistic regression” is derived from the concept of the logistic function that it uses. The logistic function is also known as the sigmoid function. The value of this logistic function lies between zero and one.

The following is an example of a logistic function we can use to find the probability of a vehicle breaking down, depending on how many years it has been since it was serviced last.

Here is how you can interpret the results from the graph to decide whether the vehicle will break down or not.

- Logistic regression performs better when the data is linearly separable
- It does not require too many computational resources as it’s highly interpretable
- There is no problem scaling the input features—It does not require tuning
- It is easy to implement and train a model using logistic regression
- It gives a measure of how relevant a predictor (coefficient size) is, and its direction of association (positive or negative)

Consider the following example: An organization wants to determine an employee’s salary increase based on their performance.

For this purpose, a linear regression algorithm will help them decide. Plotting a regression line by considering the employee’s performance as the independent variable, and the salary increase as the dependent variable, will make their task easier.

Now, what if the organization wants to know whether an employee would get a promotion or not based on their performance? The above linear graph won’t be suitable in this case. As such, we clip the line at zero and one, and convert it into a sigmoid curve (S curve).

Based on the threshold values, the organization can decide whether an employee will get a salary increase or not.

To understand logistic regression, let’s go over the odds of success.

Odds (𝜃) = Probability of an event happening / Probability of an event not happening

𝜃 = p / 1 - p

The values of odds range from zero to ∞ and the values of probability lies between zero and one.

Consider the equation of a straight line:

𝑦 = 𝛽0 + 𝛽1* 𝑥

Here, 𝛽0 is the y-intercept

𝛽1 is the slope of the line

x is the value of the x coordinate

y is the value of the prediction

Now to predict the odds of success, we use the following formula:

Exponentiating both the sides, we have:

Let Y = e 𝛽0+𝛽1 * 𝑥

Then p(x) / 1 - p(x) = Y

p(x) = Y(1 - p(x))

p(x) = Y - Y(p(x))

p(x) + Y(p(x)) = Y

p(x)(1+Y) = Y

p(x) = Y / 1+Y

The equation of the sigmoid function is:

The sigmoid curve obtained from the above equation is as follows:

Now that you know more about logistic regression algorithms, let’s look at the difference between linear regression and logistic regression.

Linear Regression |
Logistic Regression |

Used to solve regression problems |
Used to solve classification problems |

The response variables are continuous in nature |
The response variable is categorical in nature |

It helps estimate the dependent variable when there is a change in the independent variable |
It helps to calculate the possibility of a particular event taking place |

It is a straight line |
It is an S-curve (S = Sigmoid) |

Now, let’s look at some logistic regression algorithm examples.

- Using the logistic regression algorithm, banks can predict whether a customer would default on loans or not
- To predict the weather conditions of a certain place (sunny, windy, rainy, humid, etc.)
- Ecommerce companies can identify buyers if they are likely to purchase a certain product
- Companies can predict whether they will gain or lose money in the next quarter, year, or month based on their current performance
- To classify objects based on their features and attributes

Now, let’s look at the assumptions you need to take to build a logistic regression model.

- In a binary logistic regression, the dependent variable must be binary
- For a binary regression, the factor level one of the dependent variables should represent the desired outcome
- Only meaningful variables should be included
- The independent variables should be independent of each other. This means the model should have little or no multicollinearity
- The independent variables are linearly related to the log odds
- Logistic regression requires quite large sample sizes

Let’s now jump into understanding the logistics Regression algorithm in Python.

We’ll be using the digits dataset in the scikit learn library to predict digit values from images using the logistic regression model in Python.

- Importing libraries and their associated methods

- Determining the total number of images and labels

- Displaying some of the images and their labels

- Dividing dataset into “training” and “test” set

- Importing the logistic regression model

- Making an instance of the model and training it

- Predicting the output of the first element of the test set

- Predicting the output of the first 10 elements of the test set

- Prediction for the entire dataset

- Determining the accuracy of the model

- Representing the confusion matrix in a heat map

- Presenting predictions and actual output

The images above depicts the actual numbers and the predicted digit values from our logistic regression model.

Are you an AI and Machine Learning enthusiast? If yes, the Post Graduate Program in AI and Machine Learning is a perfect fit for your career growth.

We hoped that this article has helped you get acquainted with the basics of supervised learning and logistic regression. We covered the logistic regression algorithm, and went into detail with an elaborate example. Then, we looked at the different applications of logistic regression, followed by the list of assumptions you should make to create a logistic regression model. Finally, we built a model using the logistic regression algorithm to predict the digits in images.

If you want to kick off a career in this exciting field, check out Simplilearn’s Artificial Intelligence Engineer Master’s Program, offered in collaboration with IBM. The program enables you to dive much deeper into the concepts and technologies used in AI, machine learning and deep learning. You will also get to work on an awesome Capstone Project and earn a certificate in all disciplines in this exciting and lucrative field.

Name | Date | Place | |
---|---|---|---|

Machine Learning | 28 Aug -2 Oct 2020, Weekdays batch | Your City | View Details |

Machine Learning | 4 Sep -9 Oct 2020, Weekdays batch | New York City | View Details |

Machine Learning | 26 Sep -31 Oct 2020, Weekend batch | San Francisco | View Details |

Simplilearn is one of the world’s leading providers of online training for Digital Marketing, Cloud Computing, Project Management, Data Science, IT, Software Development, and many other emerging technologies.

Machine Learning

9421 Learners

Lifetime Access*

*Lifetime access to high-quality, self-paced e-learning content.

Explore Category- Video Tutorial
Linear Regression in Python

- Ebook
Machine Learning Career Guide: A complete playbook to becoming a Machine Learning Engineer

- Article
Top 10 Machine Learning Algorithms You Need to Know in 2020

- Video Tutorial
Logistic Regression in R: The Ultimate Tutorial with Examples

- Video Tutorial
Getting Started with Linear Regression in R

- Ebook
Introduction to Machine Learning: A Beginner's Guide

- Disclaimer
- PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.