Back of the envelope math for Computers
Introduction
I rely constantly on back-of-the-envelope calculations, or as Simon Eskildsen would say, “napkin math”. It is something I usually do before I open a profiler, before I write a benchmark, and before I argue with colleagues about micro-optimizations. A rough mental model of computer speeds saves me from bad designs and wasted time. The practice is common in physics: you look only at the order of magnitude and try to model the behaviour of a system.
This post is my personal cheat sheet for latency and bandwidth across CPU, RAM, SSDs, and HDDs, plus a few examples that can be really useful when reviewing a design document or talking about system architecture.
The Numbers I Actually Remember
I do not remember exact specs. I remember orders of magnitude.
| Component | Latency (approx) | Bandwidth (approx) |
| --------------- | ---------------- | ------------------ |
| CPU register | ~0.3 ns | Enormous |
| L1 cache | ~1 ns | 1–2 TB/s |
| L2 cache | ~4 ns | Hundreds of GB/s |
| L3 cache | ~10–15 ns | 100+ GB/s |
| RAM | ~80–120 ns | 25–80 GB/s |
| NVMe SSD | ~50–150 µs | 3–7 GB/s |
| SATA SSD | ~100 µs | ~500 MB/s |
| HDD | ~5–10 ms | 100–200 MB/s |
What matters: each level down the hierarchy costs roughly 10× more latency.
That single rule already explains most performance surprises.
Latency vs Bandwidth. Where I See People Get Tricked
Latency answers: how long until the first byte shows up?
Bandwidth answers: how fast can I stream once it does?
I have personally made the mistake of optimizing for bandwidth when latency was the real bottleneck.
Latency is also why a cache miss hurts even when I am touching just a few bytes.
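To make the distinction concrete with the napkin numbers from the table above (rough assumptions, not measurements), here is what fetching 1 MB from an NVMe SSD looks like as one sequential read versus 256 random 4 KiB reads:

```cpp
#include <iostream>

int main() {
    // Rough NVMe figures from the table above; assumptions, not measurements.
    const double latency_us = 100.0;     // one random read ≈ 100 µs
    const double bandwidth_gb_s = 5.0;   // sequential streaming ≈ 5 GB/s
    const double total_mb = 1.0;         // we want 1 MB of data

    // One sequential read: pay the latency once, then stream at full bandwidth.
    double sequential_us = latency_us + (total_mb / 1000.0) / bandwidth_gb_s * 1e6;

    // 256 random 4 KiB reads: pay the latency every single time.
    double random_us = 256 * latency_us;

    std::cout << "1 MB sequential:            ~" << sequential_us << " us\n"; // ~300 µs
    std::cout << "1 MB as 4 KiB random reads: ~" << random_us << " us\n";     // ~25,600 µs
}
```

Same megabyte, roughly two orders of magnitude apart, and bandwidth barely entered the second calculation.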
Math Examples
Example 1. The linked list mistake
Let’s look at a linked list that ended up on a hot path because “the data structure was clean”.
10 million nodes, pointer chasing in RAM.
- One RAM access ≈ 100 ns
- 10M × 100 ns = 1 second
Now replace it with a flat array:
- Sequential scan, so the hardware prefetcher keeps up
- 10M × 4-byte values ≈ 40 MB, and 40 MB / 40 GB/s ≈ 1 millisecond
Same data, three orders of magnitude apart. Pointer-heavy structures often have no place in performance-critical code.
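Here is a minimal sketch of that comparison, using std::list versus std::vector with 64-bit integers as a stand-in payload. Absolute numbers vary per machine, and freshly allocated list nodes are often laid out almost sequentially, so a long-lived, fragmented heap makes the gap even larger than this sketch shows:

```cpp
#include <chrono>
#include <iostream>
#include <list>
#include <numeric>
#include <vector>

// Time a callable and return elapsed milliseconds.
template <typename F>
double time_ms(F&& f) {
    auto start = std::chrono::high_resolution_clock::now();
    f();
    auto end = std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

int main() {
    const size_t N = 10'000'000;  // 10 million elements, as in the napkin math above

    std::vector<size_t> vec(N);
    std::iota(vec.begin(), vec.end(), 0);
    std::list<size_t> lst(vec.begin(), vec.end());

    volatile size_t sink = 0;  // keep the compiler from discarding the sums

    double vec_ms = time_ms([&] {
        sink = std::accumulate(vec.begin(), vec.end(), size_t{0});
    });
    double lst_ms = time_ms([&] {
        sink = std::accumulate(lst.begin(), lst.end(), size_t{0});
    });

    std::cout << "flat array sum:  " << vec_ms << " ms\n";
    std::cout << "linked list sum: " << lst_ms << " ms\n";
}
```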
Example 2. “It’s just a disk read”
I once had a task doing ~100 random reads per request on an SSD.
- 100 × 100 µs = 10 ms
Taken by itself, that looks fine. Then traffic spiked.
Run the same workload on an HDD and you get:
- 100 × 8 ms = 800 ms
That experiment permanently changed my intuition around random I/O.
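If you want to reproduce that kind of measurement, here is a rough POSIX-only sketch. It assumes a large pre-existing file at the hypothetical path testfile.bin, and without O_DIRECT or a cold page cache the OS will happily serve repeated reads from RAM, so treat the result as a sanity check rather than a spec:

```cpp
#include <fcntl.h>      // open, O_RDONLY
#include <sys/stat.h>   // fstat
#include <unistd.h>     // pread, close
#include <chrono>
#include <cstdio>       // perror
#include <iostream>
#include <random>
#include <vector>

int main() {
    // Hypothetical multi-GB test file; create it beforehand, e.g. from /dev/urandom.
    const char* path = "testfile.bin";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st {};
    fstat(fd, &st);
    const size_t block = 4096;  // one 4 KiB read per I/O
    const off_t blocks = st.st_size / static_cast<off_t>(block);

    std::vector<char> buf(block);
    std::mt19937_64 rng(42);
    std::uniform_int_distribution<off_t> dist(0, blocks - 1);

    const int reads = 1000;
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < reads; ++i) {
        // Jump to a random block-aligned offset and read 4 KiB.
        pread(fd, buf.data(), block, dist(rng) * static_cast<off_t>(block));
    }
    auto end = std::chrono::high_resolution_clock::now();

    auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    std::cout << "Average random read: " << static_cast<double>(us) / reads << " us\n";
    close(fd);
}
```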
Example 3. CPU cycles vs memory
On a 3 GHz CPU:
- 1 cycle ≈ 0.33 ns
- RAM access ≈ 300 cycles
- Branch misprediction ≈ 15 cycles
The lesson: obsessing over instruction counts is useless if you forget about cache misses.
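To put a number on the misprediction line, I sometimes run a sorted-versus-shuffled conditional sum. This is a sketch, not a rigorous measurement; some compilers replace the branch with a conditional move or vectorize the loop, which hides the effect entirely, so check the generated code if the two numbers come out identical:

```cpp
#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>

// Sum only the elements >= 128; the data-dependent branch is the interesting part.
size_t conditional_sum(const std::vector<int>& v) {
    size_t sum = 0;
    for (int x : v) {
        if (x >= 128) sum += x;
    }
    return sum;
}

int main() {
    const size_t N = 20'000'000;
    std::vector<int> data(N);
    std::mt19937 rng(7);
    std::uniform_int_distribution<int> d(0, 255);
    for (auto& x : data) x = d(rng);

    auto run = [&](const char* label) {
        auto start = std::chrono::high_resolution_clock::now();
        volatile size_t s = conditional_sum(data);
        auto end = std::chrono::high_resolution_clock::now();
        (void)s;
        std::cout << label << ": "
                  << std::chrono::duration<double, std::milli>(end - start).count()
                  << " ms\n";
    };

    run("shuffled (branch mispredicts ~50% of the time)");
    std::sort(data.begin(), data.end());
    run("sorted   (branch is almost always predicted) ");
}
```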
Tiny C++ Benchmarks I Use for Reality Checks
These are not scientific benchmarks. I use them to keep my intuition honest.
Compile with optimizations enabled (-O2 or -O3; try both).
Prototype code for the memory latency reality check
```cpp
#include <vector>
#include <chrono>
#include <iostream>
#include <random>
#include <utility>

int main() {
    // ~8 MB of 64-bit indices, big enough to spill well past L1/L2.
    const size_t N = 1'000'000;
    std::vector<size_t> next(N);
    std::mt19937_64 rng(1234);

    // Build a random single-cycle permutation (Sattolo's algorithm) so the
    // chase visits every element once instead of getting stuck in a short,
    // cache-resident cycle.
    for (size_t i = 0; i < N; ++i) next[i] = i;
    for (size_t i = N - 1; i > 0; --i) {
        std::uniform_int_distribution<size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }

    // Each load depends on the previous one, so this measures latency, not bandwidth.
    size_t idx = 0;
    auto start = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < N; ++i) {
        idx = next[idx];
    }
    auto end = std::chrono::high_resolution_clock::now();

    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
    std::cout << "Average access latency: "
              << static_cast<double>(ns) / N << " ns"
              << " (checksum " << idx << ")\n";  // print idx so the loop cannot be optimized away
}
```
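A matching bandwidth check, to pair with the latency one above: stream a large array once and divide bytes moved by elapsed time. This is a rough sketch; a single-threaded scan will usually land below the table's RAM numbers, which assume multiple memory channels and cores working together:

```cpp
#include <chrono>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    // ~800 MB of 64-bit integers, far larger than any cache, so this measures RAM.
    const size_t N = 100'000'000;
    std::vector<size_t> data(N, 1);

    auto start = std::chrono::high_resolution_clock::now();
    size_t sum = std::accumulate(data.begin(), data.end(), size_t{0});
    auto end = std::chrono::high_resolution_clock::now();

    double seconds = std::chrono::duration<double>(end - start).count();
    double gb = static_cast<double>(N * sizeof(size_t)) / 1e9;
    std::cout << "Sequential read bandwidth: " << gb / seconds << " GB/s"
              << " (checksum " << sum << ")\n";  // print sum so the scan is not optimized away
}
```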
Conclusion
Here are some mental rules I actually use:
- Cache misses dominate performance before arithmetic does
- Sequential access beats random access
- One disk seek equals millions of CPU instructions
- Bandwidth affects throughput, latency affects responsiveness
Back of the envelope math is not about being precise. It is about being directionally correct early. When my estimate says milliseconds but reality says seconds, I know there is some debugging to do :-)