What is Linear Regression?
It is a method for predicting future values by finding a linear pattern in previously observed data. This linear pattern can be represented by a best-fit line on a graph.
This line shows the overall trend of the data and helps predict where future data points will fall on the graph.
How does Linear Regression work?
A pattern appears in a dataset when there is a relationship between two or more variables, usually the variables plotted on the X and Y axes.
How to find the best-fit line?
The best-fit (regression) line is an ordinary straight line and has all the properties of any straight line. For example:
1. Every line has a slope. The slope 'm' only tells us how steeply the line inclines or declines. It is not limited to the range -1 to 1 (that range applies to the correlation coefficient), and it does not give the position of the line.
2. The slope can then be used to write the equation of the line:
Y = mX + b
where m represents the slope and
b represents the Y-intercept, that is, the regression line cuts the Y-axis at the point (0, b).
Also, Y is the dependent variable and X is the independent variable, because Y is the output for the input X.
So, all we need to draw the regression line is the value of its slope (m) and its Y-intercept (b), which together complete the line equation.
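To make the line equation concrete, here is a minimal sketch of how a slope and a Y-intercept turn an X value into a predicted Y value. The function name `predict` and the sample values (slope 2, intercept 1) are made up purely for illustration.

```python
def predict(x, m, b):
    """Return the Y value on the line Y = m*X + b for a given X."""
    return m * x + b

# Hypothetical line with slope 2 and Y-intercept 1
print(predict(0, m=2, b=1))  # 1, the point where the line cuts the Y-axis
print(predict(3, m=2, b=1))  # 7
```

Once the slope and intercept are known, every prediction is just this one multiplication and one addition.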
1. Slope (m) with many data points
But when the graph contains many scattered data points, which two points should we use to find the slope of the line?
There are too many points to choose from. Instead of picking two points from the data, we will consider all the data points on the graph.
Our main goal is to find the slope of the regression line. By taking into account all the X and Y data points, we can capture the relationship between X and Y by calculating the ratio of the covariance of X and Y to the variance of X (the square of the standard deviation of X):
m = Cov(X, Y) / Var(X)
Covariance tells us how the X and Y variables vary together, that is, whether Y tends to rise or fall as X rises.
Standard deviation tells us how widely the data points are spread around their mean.
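As a rough illustration of using every data point at once, the sketch below computes the slope as the covariance of X and Y divided by the variance of X. The helper names `mean` and `slope` and the five sample points are assumptions made for this example, not part of any library.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def slope(xs, ys):
    """Slope of the best-fit line: covariance(X, Y) / variance(X)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # How X and Y move together (covariance, without the 1/n factor)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # How spread out X is (variance, without the 1/n factor)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    # The missing 1/n factors cancel out in the ratio
    return cov_xy / var_x

# Made-up sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(slope(xs, ys))  # 0.6
```

Because the same scaling factor appears in both the covariance and the variance, plain sums are enough and the ratio is unchanged.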
Now that we know the slope of the regression line, we only need one point through which the line passes.
2. Y-intercept (b)
Whatever the slope of the line is, the line will always cross the Y-axis at some point (0, b). The regression line also always passes through the point formed by the mean of X and the mean of Y, so we can find the value of b as:
b = mean(Y) - m × mean(X)
As you can see, we use the average (mean) of the data points from both axes to get the intercept.
Thus, the linear regression line can be drawn by completing the simple equation.
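Putting the two pieces together, the following sketch fits a line to a small made-up dataset and uses it to predict a new value. The function name `fit_line` and the sample data are hypothetical; the slope is taken as covariance over variance, and the intercept relies on the fact that the best-fit line passes through the point of means.

```python
def fit_line(xs, ys):
    """Return (m, b) for the best-fit line Y = m*X + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X
    # (the shared 1/n factor cancels, so plain sums are enough).
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    # Intercept: the line passes through (x_mean, y_mean).
    b = y_mean - m * x_mean
    return m, b

# Made-up sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = fit_line(xs, ys)
print(f"Y = {m:.2f}X + {b:.2f}")  # Y = 0.60X + 2.20
print(round(m * 6 + b, 2))        # predicted Y for X = 6: 5.8
```

In practice a library routine would usually do this work, but writing it out shows that the whole fit needs only means, sums, and one division.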
When should you use Linear Regression?
- The spread of the Y values around the line is roughly the same across the whole range of X.
- The variables in the dataset are related, and one variable (say y) depends on one (say x) or more other variables. Having more than one independent variable increases the number of dimensions.