Big O? Big "ohhhhhh" 💡

So you've just heard about this thing called big O. Maybe in some class you're taking on algorithmic complexity, or maybe someone asked you about the big O of the code you just wrote on a whiteboard (ugh, I know). You're not quite sure how this theoretical concept applies to your job, and you want to learn more. Well, you've come to the right place! 🙌

Summary

Big O is one of many ways to describe the growth of a function.
In particular, it describes an upper bound.
It hides information that is important for practical considerations.

📖 Story time

Let me start with a short story of how to not use big O. Early in my career, I was building out a remote development environment for the company I worked for. One part of this system required checking inclusion in a collection of things. I decided to use an array over a set, knowing the size of this collection was never going to be large. Given the small input size and cache-friendly nature of arrays, I decided it was the better option.

This one decision resulted in what was one of the longest, most difficult code reviews I had to work through, focused primarily on the reviewer wanting me to use a set over an array. They showed a clear lack of understanding of big O notation, arguing that I should make the change solely on the fact that inclusion checks in a set are $O(\text{lg } n)$ whereas inclusion in an array is $O(n)$ (technically, sets in Python also have this worst case complexity, although I didn't know that at the time). Even after showing a benchmark indicating the array performed significantly better for the input sizes we were to expect for the foreseeable future, they were hesitant to give me the thumbs up.

It was that experience that told me I should write this blog post (it also taught me about what not to do in a code review). Let's dive in!

🤔 What is big O?

Suppose we've implemented some function $f$ that computes a value from some input of size $n$ , and we want to understand how that function performs as $n$ gets bigger. There are many ways to understand growth, and big O is one of them. Big O defines an upper bound on the growth of a function, what we like to think of as the "worst case" scenario.

Formally, we say that $f$ is big O of some other function $g$ — written $f(n) = O(g(n))$ — if, for some input size $n$ :

f(n) \le C g(n) \enspace \exists \enspace C \enspace \forall \enspace n > n_0

That's a whole lot of math, but it boils down to one thing: $f$ is $O(g)$ if $g$ is always bigger than $f$ after some input size.

🧐 What does it look like?

Let's take a look at some functions:

Graph showing various functions and how they grow for larger inputs

Note $lg$ is the base 2 log , also written as $log_2$ . We can see how these functions grow, and how some grow much faster than others. Where would we see these in practice?

$O(\lg{n})$ : a divide and conquer search, like binary search . Known as logarithmic complexity.
$O(n)$ : a single loop over an array, like computing a sum. Known as linear complexity.
$O(n \cdot \lg{n})$ : a divide and conquer approach where each "division" involves looping over the divided part, like merge sort .
$O(n^2)$ : a nested loop over the array, like in a simple sort algorithm, such as bubble sort . Known as quadratic complexity.

Remember that $n_0$ term from the definition above, the point at which the big O function actually becomes an upper bound? That's really important to remember. We generally consider $O(n)$ to be better than $O(n^2)$ , but that's only true after a certain point. Consider the following:

Graph showing various functions and how they grow for larger inputs

Suppose these two functions describe the performance of two different algorithms we have to choose between, parameterized on the size $n$ of an input array. If our input array has less than 800 elements, the function that has quadratic complexity actually performs better! 🙀

But what does that even look like in code? Let's take a look at an example:

function estimateMax(array) {
  let max = 0;
  for (const subarray of array) {
    for (let sample = 0; sample < 800; ++sample) {
      max = Math.max(max, subarray[Math.floor(Math.random() * array.length)]);
    }
  }
  return max;
}

function actualMax(array) {
  let max = 0;
  for (const subarray of array) {
    for (const sample of subarray) {
      max = Math.max(max, element);
    }
  }
  return max;
}

Here we have two functions to compute the maximum value in an array of arrays. The first function, estimateMax, uses a random sampling approach to estimate the maximum value. The second function, actualMax, computes the actual maximum value.

Ignoring the cost of various operations like Math.max and Math.random, estimateMax will have a total cost of 800n and actualMax will have a total cost of n^2, where n is the length of the input array and its subarraysg. I'll leave it up to you to actually benchmark these functions as the array gets bigger and bigger. With enough trials, I'm hoping you'll observe that the tipping point for actualMax being slower will roughly be 800 elements 🤞

👷‍♀️ But how can I use this?

There are three practical considerations when incorporating big O into your work.

First, and foremost, we should consider our input sizes. If the size of our inputs will remain small or fixed, big O is not as useful a tool. With small inputs, things like CPU caches will likely have a significant impact. This point is really important to remember because it is often overlooked.

Big O is a great tool to use to compare two approaches when we do have large, arbitrary input sizes. It's important to know that big O is also an upper bound, but not necessarily a tight upper bound. What that means is that a function that is $O(n^2)$ could be $O(n)$ , but no one has proven that yet.

The last consideration is to benchmark our code to prove that one approach is better than another. In particular, benchmarking over various input sizes and shapes will give us confidence we've made the right decision.