Big O? Big "ohhhhhh" 💡

10 minute read

So you've just heard about this thing called big O. Maybe in some class you're taking on algorithmic complexity, or maybe someone asked you about the big O of the code you just wrote on a whiteboard (ugh, I know). You're not quite sure how this theoretical concept applies to your job, and you want to learn more. Well, you've come to the right place! 🙌

Summary

  • Big O is one of many ways to describe the growth of a function.
  • In particular, it describes an upper bound.
  • It hides information that is important for practical considerations.

📖 Story time

Let me start with a short story about how not to use big O. Early in my career, I was building out a remote development environment for the company I worked for. One part of this system required checking inclusion in a collection of things. Knowing the size of this collection was never going to be large, I chose an array over a set: given the small input size and the cache-friendly nature of arrays, the array was the better option.

This one decision resulted in one of the longest, most difficult code reviews I've had to work through, focused primarily on my tech lead wanting me to use a set over an array. They showed a clear lack of understanding of big O notation, arguing that I should make the change solely because inclusion checks in a set are O(lg n) whereas inclusion checks in an array are O(n). Even after I showed a benchmark indicating the array performed significantly better for the input sizes we expected for the foreseeable future, they were hesitant to give me the thumbs up.
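
The real system isn't something I can share, but here's a minimal sketch in Python of the kind of benchmark that settled it. The collection size, contents, and lookup value are all made up for illustration; the crossover (if there is one) will land somewhere else in your language and on your hardware.

```python
from timeit import timeit

# Hypothetical reconstruction of the benchmark: membership checks against
# a small, fixed-size collection, comparing a list scan to a set lookup.
items = list(range(16))   # small collection, as in the story
as_set = set(items)
needle = 15               # worst case for the list scan: the last element

t_list = timeit(lambda: needle in items, number=1_000_000)
t_set = timeit(lambda: needle in as_set, number=1_000_000)
print(f"list: {t_list:.3f}s  set: {t_set:.3f}s")
```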

It was that experience that told me I should write this blog post (it also taught me about what not to do in a code review). Let's dive in!

🤔 What is big O?

Suppose we've implemented some function f that computes a value from some input of size n, and we want to understand how that function performs as n gets bigger. There are many ways to understand growth, and big O is one of them. Big O defines an upper bound on the growth of a function, what we like to think of as the "worst case" scenario.

Formally, we say that f is big O of some other function g, written f(n) = O(g(n)), if, past some input size n₀:

f(n) ≤ C · g(n)  for some constant C, for all n > n₀

That's a whole lot of math, but it boils down to one thing: f is O(g) if some constant multiple of g is at least as big as f for every input size past some point.
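
To make the definition concrete, here's a tiny numeric spot-check with made-up witnesses: take f(n) = 3n + 10 and claim it is O(n) by picking C = 4 and n₀ = 10, since 3n + 10 ≤ 4n whenever n ≥ 10.

```python
# Hypothetical example: f(n) = 3n + 10 is O(n), with witnesses C = 4, n0 = 10.
f = lambda n: 3 * n + 10
g = lambda n: n
C, n0 = 4, 10

# Spot-check the inequality over a finite range; the algebra covers all n > n0.
assert all(f(n) <= C * g(n) for n in range(n0 + 1, 100_000))
```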

🧐 What does it look like?

Let's take a look at some functions: lg n, n, n · lg n, and n².

Note: lg is the base-2 log, also written as log₂. Plotting them shows how these functions grow, and how some grow much faster than others. Where would we see them in practice?

  • O(lg n): a divide-and-conquer search, like binary search. Known as logarithmic complexity.
  • O(n): a single loop over an array, like computing a sum. Known as linear complexity.
  • O(n · lg n): a divide-and-conquer approach where each "division" involves looping over the divided part, like merge sort. Known as linearithmic complexity.
  • O(n²): a nested loop over an array, like in a simple sorting algorithm such as bubble sort. Known as quadratic complexity.
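
To make a few of those shapes concrete, here are textbook Python sketches of a logarithmic, a linear, and a quadratic routine. They're here for illustration, not as tuned implementations.

```python
def binary_search(xs, target):
    """O(lg n): halve a sorted search space on each iteration."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def total(xs):
    """O(n): a single pass over the array."""
    acc = 0
    for x in xs:
        acc += x
    return acc


def bubble_sort(xs):
    """O(n^2): nested loops over the array."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs
```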

Remember that n₀ term from the definition above, the point past which the big O function actually becomes an upper bound? It matters. We generally consider O(n) to be better than O(n²), but that's only true after a certain point. Consider the following:

Suppose two such functions describe the performance of two different algorithms we have to choose between, parameterized on the size n of an input array: say, a linear one with a large constant factor and a quadratic one with a small constant factor, crossing at n = 800. If our input array has fewer than 800 elements, the algorithm with quadratic complexity actually performs better! 🙀
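
The exact crossover depends on the constant factors, which big O deliberately throws away. As a made-up illustration with the crossover placed at n = 800: suppose the linear algorithm costs 800n steps and the quadratic one costs n². Then n² < 800n exactly when n < 800.

```python
# Hypothetical cost models, chosen so the curves cross at n = 800.
linear = lambda n: 800 * n    # O(n), with a large constant factor
quadratic = lambda n: n * n   # O(n^2), with a tiny constant factor

for n in (100, 799, 801, 10_000):
    cheaper = "quadratic" if quadratic(n) < linear(n) else "linear"
    print(f"n={n:6d}: {cheaper} is cheaper")
# quadratic is cheaper below n = 800, linear is cheaper above it.
```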

👷‍♀️ But how can I use this?

There are three practical considerations when incorporating big O into your work.

First, and foremost, we should consider our input sizes. If the size of our inputs will remain small or fixed, big O is not as useful a tool: with small inputs, constant factors and hardware effects like CPU caches will likely dominate. This point is often overlooked.

Second, big O is a great tool for comparing two approaches when we do have large, arbitrary input sizes. It's important to know that big O is an upper bound, but not necessarily a tight upper bound. What that means is that an algorithm that is O(n²) could also be O(n); nobody has proven the tighter bound yet. (Trivially, a function like f(n) = 2n is O(n), but it is also, less usefully, O(n²).)

The last consideration is to benchmark our code to confirm that one approach really is better than another. In particular, benchmarking over various input sizes and shapes will give us confidence we've made the right decision.
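
Continuing the made-up list-versus-set example from the story, a sweep like the following is one way to do that in Python. The sizes, workload, and iteration counts are all illustrative.

```python
from timeit import timeit

# Time membership checks at several input sizes to see where (or whether)
# the two approaches cross over on this hardware.
for size in (4, 16, 64, 256, 1024):
    items = list(range(size))
    as_set = set(items)
    needle = size - 1          # worst case for the list scan
    t_list = timeit(lambda: needle in items, number=100_000)
    t_set = timeit(lambda: needle in as_set, number=100_000)
    print(f"n={size:5d}  list={t_list:.4f}s  set={t_set:.4f}s")
```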