Many problems require a probability estimate as output. Logistic
regression is an extremely efficient mechanism for calculating
probabilities. Practically speaking, you can use the returned
probability in either of the following two ways:

"As is"

Converted to a binary category.

Let's consider how we might use the probability "as is." Suppose we
create a logistic regression model to predict the probability that a
dog will bark during the middle of the night. We'll call that
probability:

\[p(bark | night)\]

If the logistic regression model predicts \(p(bark | night) = 0.05\),
then over a year, the dog's owners should be startled awake approximately
18 times:

In many cases, you'll map the logistic regression output into the solution
to a binary classification problem, in which the goal is to correctly
predict one of two possible labels (e.g., "spam" or "not spam"). A later
module
focuses on that.

You might be wondering how a logistic regression model can ensure
output that always falls between 0 and 1. As it happens,
a sigmoid function, defined as follows, produces output having
those same characteristics:

$$y = \frac{1}{1 + e^{-z}}$$

The sigmoid function yields the following plot:

Figure 1: Sigmoid function.

If \(z\) represents the output of the linear layer of a model trained with
logistic regression, then \(sigmoid(z)\) will yield a value (a probability)
between 0 and 1. In mathematical terms:

$$y' = \frac{1}{1 + e^{-z}}$$

where:

\(y'\) is the output of the logistic regression model for a particular
example.

\(z = b + w_1x_1 + w_2x_2 + \ldots + w_Nx_N\)

The \(w\) values are the model's learned weights, and \(b\) is the
bias.

The \(x\) values are the feature values for a particular example.

Note that \(z\) is also referred to as the log-odds because the inverse of
the sigmoid states that \(z\) can be defined as the log of the probability of
the \(1\) label (e.g., "dog barks") divided by the probability of the \(0\)
label (e.g., "dog doesn't bark"):

$$ z = \log\left(\frac{y}{1-y}\right) $$

Here is the sigmoid function with ML labels:

Figure 2: Logistic regression output.

Click the plus icon to see a sample logistic regression inference calculation.

Suppose we had a logistic regression model with three features that
learned the following bias and weights: