This module introduces linear regression concepts.
Linear regression is a statistical technique used to find the relationship between variables. In an ML context, linear regression finds the relationship between features and a label.
For example, suppose we want to predict a car's fuel efficiency in miles per gallon based on how heavy the car is, and we have the following dataset:
| Pounds in 1000s (feature) | Miles per gallon (label) |
|---|---|
| 3.5 | 18 |
| 3.69 | 15 |
| 3.44 | 18 |
| 3.43 | 16 |
| 4.34 | 15 |
| 4.42 | 14 |
| 2.37 | 24 |
If we plotted these points, we'd get the following graph:
Figure 1. Car heaviness (in pounds) versus miles per gallon rating. As a car gets heavier, its miles per gallon rating generally decreases.
We could create our own model by drawing a best fit line through the points:
Figure 2. A best fit line drawn through the data from the previous figure.
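Rather than drawing the line by hand, we could compute a least-squares best fit. Here is a minimal sketch using NumPy's `polyfit` on the dataset above. (A true least-squares fit gives slightly different values than the hand-drawn line used later in this module.)

```python
import numpy as np

# Dataset from the table above.
pounds_1000s = np.array([3.5, 3.69, 3.44, 3.43, 4.34, 4.42, 2.37])  # feature
mpg = np.array([18, 15, 18, 16, 15, 14, 24])                        # label

# Fit a degree-1 polynomial (a line) by least squares.
slope, intercept = np.polyfit(pounds_1000s, mpg, 1)

print(f"slope (weight): {slope:.2f}")       # negative: heavier car, lower mpg
print(f"intercept (bias): {intercept:.2f}")
```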
Linear regression equation
In algebraic terms, the model would be defined as $ y = mx + b $, where
- $ y $ is miles per gallon—the value we want to predict.
- $ m $ is the slope of the line.
- $ x $ is pounds—our input value.
- $ b $ is the y-intercept.
In ML, we write the equation for a linear regression model as follows:

$ y' = b + w_1x_1 $

where:
- $ y' $ is the predicted label—the output.
- $ b $ is the bias of the model. Bias is the same concept as the y-intercept in the algebraic equation for a line. In ML, bias is sometimes referred to as $ w_0 $. Bias is a parameter of the model and is calculated during training.
- $ w_1 $ is the weight of the feature. Weight is the same concept as the slope $ m $ in the algebraic equation for a line. Weight is a parameter of the model and is calculated during training.
- $ x_1 $ is a feature—the input.
During training, the model calculates the weight and bias that produce the best fit to the training data.
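Training methods are covered in detail later in the course; as a rough illustration, here is a minimal gradient-descent sketch that finds a weight and bias for the dataset above. The learning rate and iteration count are arbitrary illustrative choices, not values from this module.

```python
# Minimal gradient-descent sketch for the model y' = b + w1 * x1.
# Learning rate and iteration count are illustrative choices only.
xs = [3.5, 3.69, 3.44, 3.43, 4.34, 4.42, 2.37]  # pounds in 1000s (feature)
ys = [18, 15, 18, 16, 15, 14, 24]               # miles per gallon (label)

w, b = 0.0, 0.0      # start from arbitrary parameter values
learning_rate = 0.01
n = len(xs)

for _ in range(100_000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"weight: {w:.2f}, bias: {b:.2f}")  # converges toward the least-squares fit
```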
Figure 3. Mathematical representation of a linear model.
In our example, we'd calculate the weight and bias from the line we drew. The bias is 30 (where the line intersects the y-axis), and the weight is -3.6 (the slope of the line). The model would be defined as $ y' = 30 + (-3.6)(x_1) $, where $ x_1 $ is the car's weight in thousands of pounds, and we could use it to make predictions. For instance, using this model, a 4,000-pound car ($ x_1 = 4 $) would have a predicted fuel efficiency of 15.6 miles per gallon.
Figure 4. Using the model, a 4,000-pound car has a predicted fuel efficiency of 15.6 miles per gallon.
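The prediction above can be checked with a few lines of code. This sketch uses the hand-drawn line's parameters from the text:

```python
def predict_mpg(pounds_1000s: float) -> float:
    """Predict miles per gallon using the model y' = 30 + (-3.6) * x1."""
    bias = 30.0    # y-intercept of the hand-drawn line
    weight = -3.6  # slope of the hand-drawn line
    return bias + weight * pounds_1000s

print(predict_mpg(4.0))  # a 4,000-pound car -> 15.6 mpg
```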
Models with multiple features
Although the example in this section uses only one feature—the heaviness of the car—a more sophisticated model might rely on multiple features, each having a separate weight ($ w_1 $, $ w_2 $, etc.). For example, a model that relies on five features would be written as follows:
$ y' = b + w_1x_1 + w_2x_2 + w_3x_3 + w_4x_4 + w_5x_5 $
For instance, a model that predicts gas mileage could additionally use features such as the following:
- Engine displacement
- Acceleration
- Number of cylinders
- Horsepower
This model would be written as follows:

$ y' = b + w_1x_1 + w_2x_2 + w_3x_3 + w_4x_4 + w_5x_5 $

where $ x_1 $ through $ x_5 $ are the car's heaviness, engine displacement, acceleration, number of cylinders, and horsepower.
Figure 5. A model with five features to predict a car's miles per gallon rating.
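A multi-feature prediction is just a weighted sum of the features plus the bias. Here is a minimal sketch; the bias, weight, and feature values are made up for illustration, not taken from a trained model:

```python
def predict(bias: float, weights: list[float], features: list[float]) -> float:
    """Compute y' = b + w1*x1 + ... + wn*xn."""
    assert len(weights) == len(features)
    return bias + sum(w * x for w, x in zip(weights, features))

# Hypothetical values -- for illustration only, not trained parameters.
bias = 30.0
weights = [-3.6, -0.002, 0.1, -0.5, -0.01]  # one weight per feature
features = [4.0, 250.0, 15.0, 6.0, 150.0]   # heaviness, displacement,
                                            # acceleration, cylinders, horsepower

print(predict(bias, weights, features))
```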
By graphing some of these additional features, we can see that they also have a linear relationship to the label, miles per gallon:
Figure 6. A car's displacement in cubic centimeters and its miles per gallon rating. As a car's engine gets bigger, its miles per gallon rating generally decreases.
Figure 7. A car's acceleration and its miles per gallon rating. As a car takes longer to accelerate, its miles per gallon rating generally increases.
Figure 8. A car's horsepower and its miles per gallon rating. As a car's horsepower increases, the miles per gallon rating generally decreases.