  1. Preface
  2. Setting Up the Environment
  3. Week 1: From Matmul to Text
    1. Attention and Multi-Head Attention
    2. Positional Embeddings and RoPE
    3. Grouped/Multi-Query Attention
    4. Multilayer Perceptron Layer and Transformer
    5. Wiring the Qwen2 Model
    6. Loading the Model
    7. Generating the Response
  4. Week 2: Optimizing
  5. Week 3: Serving
  6. Glossary Index