Statistics For Programmers  Introduction to Probability
Probability, in the realm of statistics, is a measure of the likelihood that a specific event will occur. It provides us with a way to quantify uncertainty. We see and characterize this in our everyday lives. For example, when we say that there is a 70% chance of rain, or a user has a 20% chance of clicking on a button, we are expressing probability.
Key Terminology
To precisely discuss probability, we need to understand some key terms.

Experiment:  An experiment is an activity or process that produces an outcome we can measure. Rolling a die, flipping a coin, and conducting a survey are all examples of experiments.

Sample Space (S):  The sample space is the set of all possible outcomes of an experiment. For example, when rolling a die, the sample space is {1, 2, 3, 4, 5, 6}.

Event (E):  An event is a subset of the sample space. It is a specific outcome or a set of outcomes of an experiment. If you're rolling a die, the event of getting an even number is {2, 4, 6}.

Probability (P):  A probability is a number between 0 and 1 (inclusive) that quantifies the likelihood of an event occurring. A probability of 0 means the event will not happen, while a probability of 1 means the event is guaranteed to happen.
There are two main types of probability: Classical and Empirical. We determine which type to use based on the nature of the experiment.
Classical Probability
We turn to Classical probability when we know every possible outcome of the experiment and all outcomes in the sample space are equally likely. It is calculated using the formula:
\[ P(E) = \frac{n(E)}{n(S)} \]
Where:
 \( P(E) \) is the probability of event \( E \),
 \( n(E) \) is the number of outcomes in the event \( E \),
 \( n(S) \) is the number of possible outcomes in the sample space.
For example, when rolling a die, to calculate the probability of getting a 3 we begin by defining the sample space and the event.
The sample space is every possible outcome of rolling the die, \( S = \{1, 2, 3, 4, 5, 6\} \), so \( n(S) = 6 \). The event is the specific outcome of interest, \( E = \{3\} \), so \( n(E) = 1 \). Plugging these into our formula, the probability of rolling a 3 is:
\[ P(E) = \frac{1}{6} \]
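The same calculation can be sketched in JavaScript. This is only an illustration: the helper name `classicalProbability` and the variable names are assumptions made for this example, and the sample space and event are modeled as `Set` objects.

```javascript
// Classical probability sketch: P(E) = n(E) / n(S).
// `classicalProbability` is a hypothetical helper, not part of any library.
function classicalProbability(event, sampleSpace) {
  // Only count event outcomes that actually belong to the sample space.
  const favorable = [...event].filter((outcome) => sampleSpace.has(outcome));
  return favorable.length / sampleSpace.size;
}

const sampleSpace = new Set([1, 2, 3, 4, 5, 6]); // S: faces of a six-sided die
const rollThree = new Set([3]);                  // E: rolling a 3
const rollEven = new Set([2, 4, 6]);             // E: rolling an even number

console.log(classicalProbability(rollThree, sampleSpace)); // 1/6, roughly 0.1667
console.log(classicalProbability(rollEven, sampleSpace));  // 0.5
```

Note that this approach is only valid under the assumption stated above: every outcome in the sample space must be equally likely.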
Empirical Probability
Empirical probability is used when we don't know all the possible outcomes of an experiment, or when we cannot assume the outcomes are equally likely. In that case, we estimate the probability of an event by running the experiment repeatedly and observing how often the event occurs. This is effectively the relative frequency of the event:
\[ P(E) = \frac{\text{Number of times event E occurs}}{\text{Total number of trials}} \]
For example, let's assume we wanted to calculate the probability of a user clicking a button in an application. We know the possible outcomes ahead of time (the user either clicks the button or doesn't), but we don't know every single factor that could influence the user's decision. This means we cannot assume that the two outcomes are equally likely.
To estimate the probability that a user clicks our button, we can run an experiment: show the button to 100 users and record how many of them click it. Suppose it was clicked 20 times:
'clicked' : 20
'not clicked' : 80
The probability of a user clicking on the button is:
\[ P(E) = \frac{20}{100} = 0.2 \]
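As a minimal sketch, the relative-frequency calculation could look like this in JavaScript. The function `empiricalProbability` and the `observed` object are illustrative names chosen for this example; the counts are the ones from the experiment above.

```javascript
// Empirical probability sketch: occurrences of the event / total trials.
// `empiricalProbability` and `observed` are illustrative names.
function empiricalProbability(counts, eventName) {
  const totalTrials = Object.values(counts).reduce((sum, n) => sum + n, 0);
  if (totalTrials === 0) return NaN; // no trials: the estimate is undefined
  return (counts[eventName] ?? 0) / totalTrials;
}

const observed = { 'clicked': 20, 'not clicked': 80 };

console.log(empiricalProbability(observed, 'clicked'));     // 0.2
console.log(empiricalProbability(observed, 'not clicked')); // 0.8
```

The two results sum to 1 because the two outcomes together cover every trial.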
The accuracy of Empirical probability depends on the number of trials conducted. The more trials we run, the more likely it is that the empirical probability will be close to the true probability (which, for an experiment with equally likely outcomes, is the classical probability). This is known as the Law of Large Numbers.
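A rough simulation can make the Law of Large Numbers visible. The sketch below assumes a fair six-sided die and uses a small seeded pseudo-random generator (the well-known mulberry32 routine) so that runs are repeatable; the function names and the seed are choices made for this example only.

```javascript
// Small seeded pseudo-random generator (mulberry32) so runs are repeatable.
// Math.random() would also work, but it cannot be seeded.
function mulberry32(seed) {
  let state = seed >>> 0;
  return function () {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Estimate P(rolling a 3) from `trials` simulated rolls of a fair die.
function estimateProbabilityOfThree(trials, random) {
  let hits = 0;
  for (let i = 0; i < trials; i++) {
    const roll = Math.floor(random() * 6) + 1; // integer from 1 to 6
    if (roll === 3) hits++;
  }
  return hits / trials;
}

const random = mulberry32(42); // the seed 42 is arbitrary
for (const trials of [10, 100, 10000, 1000000]) {
  const estimate = estimateProbabilityOfThree(trials, random);
  const error = Math.abs(estimate - 1 / 6);
  console.log(trials, estimate.toFixed(4), 'error:', error.toFixed(4));
}
```

With few trials the estimate can land far from 1/6; as the number of trials grows, the error should generally shrink, although any single run may fluctuate.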
Probability as Code
Computing probability in code is quite straightforward using techniques we've already discussed. We begin by deriving a frequency distribution from our dataset and then calculate the probability of the event of interest.
Let's assume our dataset is an array of user interactions with the button:
// 20 clicked, 80 not clicked events
const data = ['clicked', 'not clicked', 'clicked', 'not clicked', 'clicked', 'not clicked', 'clicked', 'not clicked', 'clicked', 'not clicked' ... ];
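Because the array above is abbreviated, the sketch below builds a stand-in dataset with the same 20/80 split and then applies the two steps just described. The names `sampleData` and `frequencyDistribution` are assumptions made for this illustration, and the ordering of the generated entries is arbitrary.

```javascript
// Stand-in for the abbreviated array: 20 'clicked' and 80 'not clicked'
// entries. The ordering is arbitrary and does not affect the result.
const sampleData = [
  ...Array(20).fill('clicked'),
  ...Array(80).fill('not clicked'),
];

// Step 1: tally how often each distinct value appears in the dataset.
function frequencyDistribution(values) {
  const counts = {};
  for (const value of values) {
    counts[value] = (counts[value] ?? 0) + 1;
  }
  return counts;
}

// Step 2: divide the count for the event of interest by the number of trials.
const frequencies = frequencyDistribution(sampleData);
const probabilityOfClick = frequencies['clicked'] / sampleData.length;

console.log(frequencies);        // counts for 'clicked' and 'not clicked'
console.log(probabilityOfClick); // 0.2
```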