Start with the vocabulary pieces.
Enter a short prompt and inspect where words split into tokens. Notice spaces, punctuation, and fragments before assuming a token is the same thing as a word.
A self-hosted copy by Cho, Kim, Karpekov, Helbling, Wang, Lee, Hoover, and Chau. The upstream project is MIT licensed.
Use the embedded explainer below like a microscope: follow one prompt from tokenization through attention, logits, and sampling before changing the next control.
Enter a short prompt and inspect where words split into tokens. Notice spaces, punctuation, and fragments before assuming a token is the same thing as a word.
Move across layers and heads, then ask what each selected token is borrowing from the left context. Causal masking means the model can explain the past, not peek at the future.
Before sampling, compare the highest-scoring next tokens. The model is not choosing a sentence at once; it is scoring one next token from the current state.
Lower temperature to sharpen the top choice, then raise it to admit alternatives. If every option is bad, fix the prompt or context before treating temperature as the answer.