info snack
NVIDIA A100
The older workhorse GPU for many production inference and training jobs.
companion card
A100s are older 40GB/80GB workhorse GPUs, often priced around $1-3 per GPU-hour, and good for many 7B-70B inference jobs. The upper end of that range usually needs quantization or more than one GPU.
What it means
A100 GPUs come with 40GB or 80GB of VRAM. They are built on NVIDIA's Ampere architecture, which predates Hopper and Blackwell, but they remain common in cloud fleets.
Why product teams care
They can serve many small and mid-sized models, embeddings, rerankers, and quantized larger models. Very large models may need multiple GPUs or a newer GPU with more memory per card.
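A rough way to check the fit is back-of-the-envelope arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below assumes a flat 20% overhead for KV cache and activations, which is a placeholder and not a measured figure. The function names are hypothetical, and real headroom depends on batch size, context length, and the serving stack.

```python
import math


def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def gpus_needed(params_billion: float, bits_per_param: int,
                vram_gb: float, overhead: float = 1.2) -> int:
    """Estimate how many cards are needed to hold the model.

    `overhead` is an assumed multiplier for KV cache and activations,
    not a measured value. Tune it for your own workload.
    """
    total_gb = weight_memory_gb(params_billion, bits_per_param) * overhead
    return math.ceil(total_gb / vram_gb)


# 7B in 16-bit on a 40GB A100, then 70B on an 80GB A100 at 16-bit and 4-bit.
for params, bits, vram in [(7, 16, 40), (70, 16, 80), (70, 4, 80)]:
    count = gpus_needed(params, bits, vram)
    print(f"{params}B @ {bits}-bit on {vram}GB cards: {count} GPU(s)")
```

Under these assumptions a 7B model in 16-bit fits on one 40GB card, while a 70B model fits on a single 80GB card only when quantized to 4-bit and needs several cards at 16-bit.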
Understudy angle
Understudy can prove whether an A100 route is good enough before spending on H100 or H200 capacity.
take this with you
A100s are still useful when memory and throughput fit the workload and the price is right.
Try the cheapest GPU that clears your eval, latency, and reliability bar.
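The rule above can be sketched as a filter followed by a minimum: drop every option that misses the bar, then take the cheapest survivor. The `Candidate` type, the `cheapest_passing` helper, and every number below are hypothetical placeholders for illustration, not measurements or price quotes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    price_per_gpu_hour: float  # USD
    eval_score: float          # higher is better
    p95_latency_ms: float
    error_rate: float          # fraction of failed requests


def cheapest_passing(candidates: List[Candidate], min_eval: float,
                     max_latency_ms: float,
                     max_error_rate: float) -> Optional[Candidate]:
    """Return the lowest-priced candidate that clears all three bars, or None."""
    passing = [
        c for c in candidates
        if c.eval_score >= min_eval
        and c.p95_latency_ms <= max_latency_ms
        and c.error_rate <= max_error_rate
    ]
    return min(passing, key=lambda c: c.price_per_gpu_hour, default=None)


# Placeholder values only; substitute your own benchmark results and prices.
candidates = [
    Candidate("A100-80GB", 2.0, 0.91, 450.0, 0.002),
    Candidate("H100", 4.0, 0.91, 300.0, 0.002),
]

choice = cheapest_passing(candidates, min_eval=0.90,
                          max_latency_ms=500.0, max_error_rate=0.01)
print(choice.name if choice else "no candidate clears the bar")
```

With a 500 ms latency bar the cheaper card wins in this made-up example. Tightening the bar to 400 ms would rule it out and select the more expensive one, which is the point of setting the bar before looking at prices.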