Statistics for Programmers - Frequency Distributions

A frequency distribution is a common way to see how the values in a dataset are spread out. It's a tabular representation of the number of times each distinct value appears in the dataset. If we denote the distinct values in a dataset as \(x_1, x_2, \ldots, x_n\), their corresponding frequencies can be denoted as \(f_1, f_2, \ldots, f_n\). This relationship can be expressed as a table.

\[ \begin{array}{|c|c|} \hline \text{Value (}x\text{)} & \text{Frequency (}f\text{)} \\ \hline x_1 & f_1 \\ x_2 & f_2 \\ \vdots & \vdots \\ x_n & f_n \\ \hline \end{array} \]

To apply this in practice, let's consider a dataset of ratings from 10 users who were each asked to review a product on a scale of 1 to 5. The dataset can be represented as an array of reviews.

[3, 1, 5, 5, 2, 4, 5, 3, 1, 5]

We can construct a frequency distribution table for this dataset by counting the number of times each unique element appears in the array.

Value (x) | Frequency (f)
-------------------------
1         | 2
2         | 1
3         | 2
4         | 1
5         | 4
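
As a quick sanity check on the table above, the frequencies should add up to the total number of observations. Writing that total as \(N\) (a symbol introduced here, not used elsewhere in this post), and noting that there are \(n = 5\) distinct values, the 10 reviews give:

\[ \sum_{i=1}^{n} f_i = 2 + 1 + 2 + 1 + 4 = 10 = N \]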

This can be expressed in code using a Map (or Dictionary, depending on your language of choice) that associates each unique value with the number of times it appears in a given dataset. In the JavaScript below, a plain object plays that role.

Consider our array of reviews once again:

const arr = [3, 1, 5, 5, 2, 4, 5, 3, 1, 5];

We can construct a function that counts the number of times each unique element appears in the array.

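function frequencyDistribution(arr) {
  // Plain object used as a map from each value to its count.
  // Object keys are strings, so 3 and "3" would share one counter.
  const map = {};
  for (const item of arr) {
    if (map[item]) {
      // Value seen before: increment its count.
      map[item] += 1;
    } else {
      // First occurrence: start the count at 1.
      map[item] = 1;
    }
  }
  return map;
}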

Applying this function to our dataset gives us an object that maps each review score to its frequency, matching the table above.
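
As a minimal, self-contained sketch of that call, the snippet below repeats the function from above, runs it on the reviews array, and logs the result. The `frequencyDistributionMap` variant is a hypothetical addition that is not part of the original post; it assumes you would rather use JavaScript's built-in `Map`, which keeps keys as numbers instead of converting them to strings.

```javascript
// The function from above, repeated so this snippet runs on its own.
function frequencyDistribution(arr) {
  const map = {};
  for (let i = 0; i < arr.length; i++) {
    const item = arr[i];
    if (map[item]) {
      map[item] += 1;
    } else {
      map[item] = 1;
    }
  }
  return map;
}

// Hypothetical variant (not from the original post) built on Map.
// Keys keep their original type and are stored in first-seen order.
function frequencyDistributionMap(arr) {
  const counts = new Map();
  for (const item of arr) {
    counts.set(item, (counts.get(item) ?? 0) + 1);
  }
  return counts;
}

const arr = [3, 1, 5, 5, 2, 4, 5, 3, 1, 5];

const asObject = frequencyDistribution(arr);
console.log(asObject);
// Logs the counts 1 -> 2, 2 -> 1, 3 -> 2, 4 -> 1, 5 -> 4
// (integer-like object keys are listed in ascending order).

const asMap = frequencyDistributionMap(arr);
console.log(asMap.get(5)); // 4
console.log(asMap.size);   // 5 distinct values
```

Both versions produce the same counts as the table. The plain object is enough for small numeric datasets like this one, while a `Map` is the safer choice when values might collide with built-in property names or when key types matter.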
