LLM4N: LLMs for Noobs

A new series by DataPeCharcha

Your 16-week journey to finally understand Large Language Models. No jargon, just clarity on how models like GPT-4 *really* work.

Section I: The Ancestors of LLMs

WEEK 1

Before AI: Counting Words with N-Grams

We start at the very beginning, exploring how early models predicted text by counting which words appeared together most often. It's the simple, statistical foundation for everything that followed.
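
To make the counting idea concrete, here's a tiny Python sketch of a bigram model. The corpus and function names are made up just for illustration; real n-gram models count over millions of sentences and add smoothing, but the core idea is this simple.

```python
# Toy bigram model: predict the next word by counting which word
# most often follows the current one in a small corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word` seen in the corpus."""
    return follow_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' -- it followed 'the' most often
```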

WEEK 2

Giving Words Meaning with Neural Networks

This week, we see the first spark of true 'understanding.' Learn how we taught computers that 'dog' and 'puppy' are related, and how RNNs tried to give models a basic memory.
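
Here's a taste of the "related words" idea. The 3-dimensional vectors below are invented purely for illustration; real embeddings have hundreds of dimensions and are learned from data, but the similarity measurement works the same way.

```python
# Toy illustration of word vectors: similar words end up with similar
# vectors, which we can measure with cosine similarity.
# NOTE: these tiny vectors are made up; real embeddings are learned.
import numpy as np

vectors = {
    "dog":   np.array([0.90, 0.80, 0.10]),
    "puppy": np.array([0.85, 0.75, 0.20]),
    "car":   np.array([0.10, 0.20, 0.90]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["dog"], vectors["puppy"]))  # close to 1: very similar
print(cosine(vectors["dog"], vectors["car"]))    # much lower
```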

WEEK 3

Fixing a Broken Memory with LSTMs

Early neural nets were forgetful. Discover the clever 'gate' system in LSTMs that gave models a reliable long-term memory, a breakthrough that dominated NLP for years.
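
If you're curious what those gates actually compute, here's one LSTM step written out in plain NumPy. The weights are random placeholders (in a trained model they're learned), but the gate logic is the standard recipe.

```python
# A single LSTM step, showing the gates that manage the cell's memory.
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

hidden = 4                          # tiny hidden/cell size, for illustration
rng = np.random.default_rng(0)

x = rng.normal(size=hidden)         # current input
h_prev = np.zeros(hidden)           # previous hidden state
c_prev = np.zeros(hidden)           # previous cell state (the "memory")

# One weight matrix and bias per gate; input is concatenated with h_prev.
W = {g: rng.normal(size=(hidden, 2 * hidden)) for g in "fioc"}
b = {g: np.zeros(hidden) for g in "fioc"}
z = np.concatenate([x, h_prev])

f = sigmoid(W["f"] @ z + b["f"])        # forget gate: what to erase
i = sigmoid(W["i"] @ z + b["i"])        # input gate: what new info to store
o = sigmoid(W["o"] @ z + b["o"])        # output gate: what to reveal
c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate new memory content

c = f * c_prev + i * c_tilde            # updated long-term cell state
h = o * np.tanh(c)                      # new hidden state (short-term output)
print(h)
```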

Section II: The Transformer Architecture

WEEK 4

How AI Reads: The Art of Tokenization

Computers don't read words; they read numbers. We'll break down the essential process of 'tokenization'—turning text into a language machines can finally understand.
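
Here's the idea in miniature: a toy word-level tokenizer. Real LLMs use subword schemes like BPE, and this tiny vocabulary is just for illustration, but the principle is the same: text in, integers out.

```python
# Build a vocabulary from a small text, then map words to integer IDs.
text = "the cat sat on the mat"

vocab = {word: idx for idx, word in enumerate(sorted(set(text.split())))}

def encode(s):
    return [vocab[w] for w in s.split()]

def decode(ids):
    id_to_word = {idx: word for word, idx in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)

ids = encode("the cat sat")
print(ids)          # [4, 0, 3] -- the numbers the model actually sees
print(decode(ids))  # 'the cat sat'
```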

WEEK 5

Setting the Stage: Embeddings & Position

Before the magic happens, we have to prepare the data. Learn how a Transformer gives each word its initial meaning and, crucially, a 'timestamp' so it knows the order of the sentence.
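
The 'timestamp' from the original Transformer paper is the sinusoidal positional encoding: every position gets its own pattern of sine and cosine values, which is added to the word's embedding. A small sketch (sizes chosen arbitrarily for illustration):

```python
# Sinusoidal positional encoding: a unique sin/cos pattern per position,
# added to each token's embedding so the model knows word order.
import numpy as np

def positional_encoding(seq_len, d_model):
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even embedding dims
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                     # even dims get sine
    pe[:, 1::2] = np.cos(angles)                     # odd dims get cosine
    return pe

pe = positional_encoding(seq_len=6, d_model=8)
print(pe.shape)   # (6, 8): one encoding vector per position
```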

WEEK 6

The Core Idea: How Self-Attention Works

This is the revolutionary concept that changed everything. We'll use simple analogies to explain how words 'talk' to each other to figure out the true context of a sentence.
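
For the code-curious, here is scaled dot-product self-attention in a few lines of NumPy. The input "embeddings" and projection matrices are random stand-ins for trained ones; the point is the mechanics, not the numbers.

```python
# Self-attention: every word scores every other word, and those scores
# decide whose information it blends into its own representation.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv             # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # how much each word attends to each other word
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ V                           # weighted blend of values

rng = np.random.default_rng(0)
seq_len, d = 4, 8                                # 4 tokens, 8-dim vectors
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)       # (4, 8)
```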

WEEK 7

Upgrading to Multi-Head Attention

One 'conversation' between words isn't enough. See how Multi-Head Attention allows words to look at the sentence from many different perspectives at once for a richer understanding.
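
A rough sketch of the upgrade: run several independent attention "heads" on smaller slices of the vectors, then stitch the results back together. The projections here are random and untrained, and real implementations also apply a final output projection, but the splitting-and-concatenating is the heart of it.

```python
# Multi-head attention: several attention heads in parallel, concatenated.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

def multi_head_attention(X, n_heads, rng):
    d = X.shape[-1]
    head_dim = d // n_heads
    heads = []
    for _ in range(n_heads):
        # Each head has its own projections, so it can focus on a
        # different kind of relationship between the words.
        Wq, Wk, Wv = (rng.normal(size=(d, head_dim)) for _ in range(3))
        heads.append(attention(X, Wq, Wk, Wv))   # (seq_len, head_dim)
    return np.concatenate(heads, axis=-1)        # back to (seq_len, d)
    # (real models also multiply by an output projection matrix here)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                      # 4 tokens, 8-dim vectors
print(multi_head_attention(X, n_heads=2, rng=rng).shape)  # (4, 8)
```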

WEEK 8

The Complete Transformer Block

Assemble the full building block by combining multi-head attention with the position-wise feed-forward network and the stabilization tricks that make deep networks possible.
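
Here's the wiring of one block, heavily simplified (single-head, random untrained weights, post-norm layout): attention, then the feed-forward network, each wrapped in a residual connection and layer normalization.

```python
# One simplified Transformer block: self-attention + feed-forward network,
# with residual connections and layer norm as the stabilization tricks.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def transformer_block(X, p):
    # 1) Self-attention sub-layer, plus residual connection and layer norm.
    Q, K, V = X @ p["Wq"], X @ p["Wk"], X @ p["Wv"]
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V
    X = layer_norm(X + attn)
    # 2) Position-wise feed-forward sub-layer, applied to each token separately.
    hidden = np.maximum(0, X @ p["W1"] + p["b1"])   # ReLU
    ffn = hidden @ p["W2"] + p["b2"]
    return layer_norm(X + ffn)

rng = np.random.default_rng(0)
d, d_ff, seq_len = 8, 32, 4
params = {
    "Wq": rng.normal(size=(d, d)), "Wk": rng.normal(size=(d, d)),
    "Wv": rng.normal(size=(d, d)),
    "W1": rng.normal(size=(d, d_ff)), "b1": np.zeros(d_ff),
    "W2": rng.normal(size=(d_ff, d)), "b2": np.zeros(d),
}
print(transformer_block(rng.normal(size=(seq_len, d)), params).shape)  # (4, 8)
```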

Section III: The Learning Process

WEEK 9

The Training Game: Learning to Predict

How does an LLM learn from the entire internet? We'll cover the simple goal of 'predicting the next word' and the core training loop that makes it all possible.
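
The training objective, stripped to its core: the model assigns a probability to every word in its vocabulary, and the loss is the negative log probability it gave to the word that actually came next. The logits below are made-up numbers standing in for a model's output.

```python
# Next-word prediction loss in miniature (cross-entropy for one step).
import numpy as np

vocab = ["cat", "mat", "sat", "the"]
logits = np.array([1.2, 0.3, 2.5, -0.5])       # model's raw scores for the next word
target = vocab.index("sat")                    # the word that actually came next

probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
loss = -np.log(probs[target])                  # cross-entropy for this prediction

print(probs.round(3))   # highest probability lands on 'sat' in this toy case
print(loss)             # training nudges the weights to push this loss down
```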

WEEK 10

Teaching Models to Follow Instructions

A raw model is just a text predictor. Learn about Supervised Fine-Tuning (SFT), the process that turns it into a helpful instruction-follower.
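
A peek at what SFT data looks like under the hood. The template below is one common convention (and many libraries mark ignored tokens with -100), but formats vary from project to project; the whitespace "tokenizer" is just a stand-in.

```python
# SFT example: instruction + desired response, with the loss applied
# only to the response tokens.
prompt = "### Instruction:\nTranslate to French: Hello\n\n### Response:\n"
response = "Bonjour"

prompt_tokens = prompt.split()      # stand-in for a real tokenizer
response_tokens = response.split()

model_input = prompt_tokens + response_tokens
# Labels: prompt tokens are masked out, so the model is only graded on
# the response it is supposed to write (libraries often use -100 for this).
labels = [None] * len(prompt_tokens) + response_tokens

for tok, lab in zip(model_input, labels):
    print(f"{tok!r:30} -> {'ignored' if lab is None else lab!r}")
```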

WEEK 11

Making it Safe & Helpful with RLHF

How do we align models with human values? We'll explore Reinforcement Learning from Human Feedback, the advanced technique that makes models helpful and harmless.

WEEK 12

How AI Writes: The Predictable Methods

Once trained, how does a model choose its words? We'll look at the straightforward, deterministic methods, like greedy decoding, that form the basis of text generation.
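
Greedy decoding in a nutshell: at every step, take the single most likely next token. The "model" below is a fake stand-in that returns arbitrary probabilities, just to show the loop; a real LLM would compute them from the text so far.

```python
# Greedy decoding loop over a toy vocabulary with a stand-in model.
import numpy as np

vocab = ["<end>", "cat", "mat", "sat", "the"]

def fake_model(tokens):
    """Pretend language model: returns next-token probabilities."""
    rng = np.random.default_rng(len(tokens))     # deterministic per step
    logits = rng.normal(size=len(vocab))
    return np.exp(logits) / np.exp(logits).sum()

tokens = ["the"]
for _ in range(5):
    probs = fake_model(tokens)
    next_token = vocab[int(np.argmax(probs))]    # greedy: always the top choice
    if next_token == "<end>":
        break
    tokens.append(next_token)

print(" ".join(tokens))   # same output every run -- fully deterministic
```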

WEEK 13

How AI Writes: Adding Creativity

To sound less robotic, models need controlled randomness. Learn about temperature and top-k sampling for more diverse, human-like writing.
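
Both knobs fit in a few lines. The logits below are made up; a real model would produce them at each step, and this sketch then rescales them with temperature, optionally keeps only the top-k options, and samples.

```python
# Temperature and top-k sampling over a toy next-token distribution.
import numpy as np

vocab = np.array(["cat", "dog", "mat", "sat", "the"])
logits = np.array([2.0, 1.5, 0.2, 0.1, -1.0])
rng = np.random.default_rng(0)

def sample(logits, temperature=1.0, top_k=None):
    scaled = logits / temperature                 # <1: safer, >1: more adventurous
    if top_k is not None:
        # Drop everything outside the k highest-scoring tokens.
        cutoff = np.sort(scaled)[-top_k]
        scaled = np.where(scaled >= cutoff, scaled, -np.inf)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(vocab, p=probs)

print(sample(logits, temperature=0.7, top_k=3))   # random, but weighted toward 'cat'/'dog'
```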

Section IV: Modern Architectures

WEEK 14

Working Smarter: Mixture of Experts

Making models bigger is expensive. Discover the Mixture of Experts architecture—a clever 'committee of specialists' that scales to trillions of parameters efficiently.
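
The 'committee of specialists' boils down to a router. In the toy sketch below (random, untrained, sizes picked for illustration), a small router scores the experts for each token and only the top-scoring few do any work, which is how MoE models keep compute down while parameters grow.

```python
# A toy Mixture-of-Experts layer: route each token to its top-k experts.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d, n_experts, top_k = 8, 4, 2
token = rng.normal(size=d)                      # one token's vector

W_router = rng.normal(size=(d, n_experts))      # router: scores each expert
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # each expert is a tiny network

scores = softmax(token @ W_router)              # how suitable each expert is
chosen = np.argsort(scores)[-top_k:]            # activate only the top-k experts

# Combine the chosen experts' outputs, weighted by the router's confidence.
output = sum(scores[i] * (token @ experts[i]) for i in chosen)
print(f"used experts {sorted(chosen.tolist())} of {n_experts}; output shape {output.shape}")
```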

WEEK 15

The Future is Multimodal

We'll explore how LLMs are breaking the text barrier to understand images, audio, and video, moving toward human-like world understanding.

WEEK 16

Talking to AI: Prompt Engineering Fundamentals

Learn how to communicate with LLMs effectively. We'll cover techniques for getting better responses and common pitfalls to avoid in real-world usage.
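
One technique we'll cover is few-shot prompting: showing the model a couple of worked examples before your real question. The wording and examples below are purely illustrative.

```python
# A simple few-shot prompt: worked examples first, then the real question.
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day, love it."
Sentiment: Positive

Review: "Stopped working after a week."
Sentiment: Negative

Review: "The screen is gorgeous and setup took two minutes."
Sentiment:"""

print(prompt)   # send this to a model and it should complete with 'Positive'
```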