Demo 03

Scaling intuition for people who need to make product decisions.

Compare million, billion, and trillion; then move the knobs that turn model size, token volume, and training effort into real work.

Seconds

The words sound adjacent, but the jumps are not linear. Each step below is one thousand times larger than the previous one.

million

11.6 days

1,000,000 seconds is roughly a long sprint.

billion

31.7 years

1,000,000,000 seconds is roughly a career.

trillion

31,688 years

1,000,000,000,000 seconds is roughly older than agriculture.

million to billion

1,000x

billion to trillion

1,000x

million to trillion

1,000,000x

Compute

This is a relative calculator, not a vendor price sheet. It shows why model size, sequence length, and output length quickly dominate latency and cost.

model size70B - large open-weight classinput context32K tokensgenerated output1K tokens

work multiplier

1Kx

Relative to a 1B model with a short request in the same mode.

active tokens

33K

weight memory

~140 GB

compute proxy

5M billion parameter-token ops

10x work

A common scaling-law shape is a straight line on a log chart: each equal quality step can require a multiplicative increase in data, model size, compute, or careful task work. That is the intuition behind "10x more work buys one unit."

hypothetical effort

10x

improvement

Takeaway

Scale works because more work usually reduces error. It also gets expensive fast because the axes multiply: parameters times tokens times passes through the data times traffic.

The useful lesson is not "always scale." It is to spend the cheap rungs first, measure the task, and only buy the next order of magnitude when the expected improvement is worth it.