How Logistic Regression Differs from Linear Regression
What is The Sigmoid Function?
What is The Activation Function and Why We Use It?
What is The Significance of Log of Odds Here?
Learning the Logistic regression model:
Despite its name, Logistic regression is a Machine Learning algorithm that is usually used for classification rather than regression. It passes its output through the logistic function and predicts a probability instead of the actual value.
Unlike Linear regression, the dependent variable can take only
a limited number of values, i.e., the dependent variable is “categorical”
rather than continuous.
When there are only two possible outcomes, such as spam or non-spam e-mail, it is known as “Binary Logistic Regression”.
A Binary logistic model has a dependent variable with two
possible values, like Pass/Fail, Yes/No, Male/Female, Healthy/Sick, Win/Lose,
etc., which are represented as 0 or 1.
There may be any number “n” of independent variables,
each of which is either Binary or Continuous.
Note: The main difference is that Linear regression predicts continuous values, whereas Logistic regression is used for binary classification.
For example:
1) Predicting the probability of failure of a product.
2) Employee churn prediction.
3) Cancer prediction.
4) Likelihood of a customer purchasing particular products based on previous purchases.
5) Predicting mortality of injured patients.
Let’s see in detail how Logistic regression differs from Linear regression
In Linear regression, the output is the weighted sum of the
inputs.
Logistic Regression is a generalized Linear Regression in
the sense that we don’t output the weighted sum of the inputs directly, but we
pass it through a function that maps any real value to a value between 0 and 1.
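To make the comparison concrete, here is a minimal sketch of both outputs for a single example. The weights, bias, and inputs are made-up values chosen only for illustration, and the function names are my own.

```python
import math

def weighted_sum(inputs, weights, bias):
    """Linear regression output: the bias plus the weighted sum of the inputs."""
    return bias + sum(w * x for w, x in zip(weights, inputs))

def sigmoid(z):
    """Squash a real number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical model parameters and one hypothetical input row.
weights = [0.8, -0.4]
bias = 0.5
inputs = [3.0, 1.0]

linear_output = weighted_sum(inputs, weights, bias)  # what Linear regression returns
logistic_output = sigmoid(linear_output)             # what Logistic regression returns

print("weighted sum:", linear_output)
print("after sigmoid:", logistic_output)
```

The same weighted sum is computed in both cases; Logistic regression only adds the final squashing step.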
By now, you have understood the basic idea behind Logistic
regression. But you may still have a doubt:
Why can’t Linear regression be used for classification
tasks?
Let me give a broad picture so that you get an idea behind
it.
Note: If we take the weighted sum of inputs as the output,
as we do in Linear regression, the value can be greater than 1 or less than 0, but we want a value between 0 and 1. Because of this, Linear regression is not used for binary classification.
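The note above can be checked with a small experiment. This is only a sketch: the data below is a made-up toy set with 0/1 labels, and the line is fitted with plain least squares written by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical toy data: one input and a binary label.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, slope = fit_line(xs, ys)

def predict(x):
    """Raw Linear regression prediction (no squashing)."""
    return intercept + slope * x

print(predict(10))  # greater than 1, so it cannot be read as a probability
print(predict(0))   # less than 0, so it cannot be read as a probability
```

Both predictions fall outside the range 0 to 1, which is exactly the problem the note describes.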
Now let’s understand it with the help of a diagram
In the above figure, the left-hand side shows a neuron
representation of the Linear Regression model and the right-hand side shows the
Logistic regression model.
So, as you can see in the picture, the only difference between Linear Regression and Logistic Regression is that the result of the linear regression model is passed through an “activation function”, i.e. the Sigmoid, a function that maps any real value to a value between 0 and 1.
The sigmoid function is a special case of the logistic function and is defined by the formula:
$$S(x) = \frac{1}{1 + e^{-x}}$$
Some important points to note:
The value of the sigmoid function is always a number between 0 and 1.
It is exactly 0.5 at x = 0.
We can use 0.5 as the probability threshold to determine the
classes.
If the probability is greater than 0.5, then it is class 1, otherwise it is class 0.
NOTE: We can set the threshold value as per the problem
statement.
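The points above can be sketched in a few lines. The function names and the `threshold` parameter are my own choices for illustration, and the labels 1 and 0 follow the 0/1 encoding of the dependent variable mentioned earlier.

```python
import math

def sigmoid(x):
    """Sigmoid written so that math.exp never overflows for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def classify(probability, threshold=0.5):
    """Return class 1 if the probability exceeds the threshold, otherwise class 0."""
    return 1 if probability > threshold else 0

print(sigmoid(0))                      # exactly 0.5
print(sigmoid(-50), sigmoid(50))       # very close to 0 and very close to 1
print(classify(sigmoid(2.0)))          # positive input, so class 1
print(classify(sigmoid(-2.0)))         # negative input, so class 0
print(classify(0.6, threshold=0.7))    # a stricter threshold changes the decision
```

Raising the threshold makes the model more cautious about predicting class 1, which can be useful when false positives are costly.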
What is The Activation Function and Why We Use It?
The activation function helps decide whether a neuron should be activated or not. The neuron calculates the weighted sum of its inputs and adds a bias to it, and the activation function is then applied to the result. The goal of the activation function is to introduce non-linearity into the output of the neuron.
Here I will not go into the details of activation functions.
Later I will try to write a separate article only on the topic of activation
functions.
I think you are now pretty clear about the main difference
between Linear Regression and Logistic Regression, and about where each of them
can and cannot be used.
So, let’s move further and see the intuition behind it.
As I already mentioned, logistic
regression passes the output of the linear regression equation through the Sigmoid function.
The inverse of the Sigmoid function is the logit function.
The logit is nothing but the log of Odds: the linear regression equation gives the log of Odds, and
from this log of Odds we calculate the required probability.
So, let’s first understand what the log of Odds is.
The odds (often loosely called the “odds ratio”) are calculated by dividing the probability that an event will happen by the probability that it will not happen. Taking the log of the odds gives the log of Odds.
In other words, the odds are
the ratio of the probability of success to the probability of failure, and
taking the log of this ratio gives us the log of odds.
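As a quick illustration of the definition above, here is a small sketch that turns a probability into odds and log of odds. The probabilities used are arbitrary examples, and the function names are my own.

```python
import math

def odds(p):
    """Probability of success divided by probability of failure."""
    return p / (1.0 - p)

def log_odds(p):
    """Natural log of the odds (also called the logit)."""
    return math.log(odds(p))

for p in [0.2, 0.5, 0.8]:
    print(p, odds(p), log_odds(p))
```

A probability of 0.5 gives odds of 1 and a log of odds of 0; probabilities above 0.5 give a positive log of odds and probabilities below 0.5 give a negative one.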
What is The Significance of Log of Odds Here?
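The logistic (sigmoid) function can be converted into odds. If z is the weighted sum of the inputs and p = 1 / (1 + e^{-z}) is the predicted probability, then:
$$\frac{p}{1 - p} = e^{z}$$
The ratio on the left-hand side is the odds (often loosely called the “odds-ratio”).
To obtain the logistic regression equation, we take the log of the odds, which gives back the linear regression equation with coefficients b_0, ..., b_n:
$$\ln\left(\frac{p}{1 - p}\right) = z = b_0 + b_1 x_1 + \dots + b_n x_n$$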
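The relationship between the linear equation, the probability, and the log of odds can be checked numerically. In this sketch the coefficients `b0` and `b1` are hypothetical values for a model with one input; the point is only that taking the log of odds of the predicted probability returns the weighted sum we started with.

```python
import math

def sigmoid(z):
    """Turn a log of odds value z into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Turn a probability back into a log of odds value."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for a model with a single input x.
b0, b1 = -1.5, 0.6

for x in [0.0, 2.5, 5.0]:
    z = b0 + b1 * x        # linear regression equation (log of odds)
    p = sigmoid(z)         # predicted probability
    recovered = logit(p)   # log of odds computed from the probability
    print(x, z, p, recovered)
```

Up to floating-point rounding, the last column matches the second one, so the sigmoid and the logit undo each other.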
Learning the Logistic regression model:
The coefficients of the logistic regression algorithm must
be calculated from training data. This is done using “maximum-likelihood
estimation”.
Maximum Likelihood estimation is a method of estimating the parameters of a probability distribution by maximizing a likelihood function.
The best coefficients would result in a model that
would predict a value very close to 1 (e.g. male) for the default class and a
value very close to 0 (e.g. female) for the other class.
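To give a feel for how maximum-likelihood estimation can work in practice, here is a minimal sketch that fits one input with plain gradient ascent on the log-likelihood. This is only one possible way to maximize the likelihood (libraries typically use faster optimizers). The data, the learning rate, and the number of steps are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum of log probabilities that the model assigns to the observed labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Gradient ascent on the average log-likelihood for a one-input model."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # y - p is the gradient of the log-likelihood with respect to z.
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

# Hypothetical toy data: one input and a binary label (not perfectly separable).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print("coefficients:", b0, b1)
print("log-likelihood before:", log_likelihood(0.0, 0.0, xs, ys))
print("log-likelihood after: ", log_likelihood(b0, b1, xs, ys))
print("P(class 1 | x = 5.0):", sigmoid(b0 + b1 * 5.0))
```

The log-likelihood after fitting is higher than with all coefficients at zero, and the fitted model assigns a higher probability of class 1 to larger inputs, as the toy data suggests.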