LLM Temperature, Top-P, and Top-K Explained — With Python Simulations
Build temperature, top-p, top-k, and min-p sampling from scratch in Python. Interactive code, probability visuals, and a per-task cheat sheet.
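As a taste of what the article builds, here is a minimal, hypothetical sketch of the four sampling strategies named above, applied to a toy next-token distribution. The function names and the example logits are illustrative, not the article's exact code.

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_temperature(logits, temperature=1.0):
    """Scale logits by 1/T, then softmax. T < 1 sharpens, T > 1 flattens."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                 # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

def top_k_filter(probs, k=3):
    """Keep only the k most likely tokens and renormalize."""
    kept = np.argsort(probs)[-k:]
    out = np.zeros_like(probs)
    out[kept] = probs[kept]
    return out / out.sum()

def top_p_filter(probs, p=0.9):
    """Nucleus sampling: keep the smallest set of top tokens whose
    cumulative probability reaches p, then renormalize."""
    order = np.argsort(probs)[::-1]        # tokens sorted most-likely first
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1   # include the token that crosses p
    out = np.zeros_like(probs)
    out[order[:cutoff]] = probs[order[:cutoff]]
    return out / out.sum()

def min_p_filter(probs, min_p=0.1):
    """Drop tokens whose probability is below min_p times the top token's."""
    out = np.where(probs >= min_p * probs.max(), probs, 0.0)
    return out / out.sum()

# Toy vocabulary of 5 tokens: sharpen with temperature, trim with top-p, sample.
logits = np.array([4.0, 3.0, 2.0, 1.0, 0.0])
probs = apply_temperature(logits, temperature=0.8)
token = rng.choice(len(probs), p=top_p_filter(probs, p=0.9))
```

In practice these filters compose: temperature reshapes the distribution first, then a top-k, top-p, or min-p cutoff decides which tail tokens are eligible before the final random draw.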
Build a working BPE tokenizer in Python step by step. Learn how LLMs split text into tokens, implement byte pair encoding, and count tokens...
Learn to build a Python AI chatbot with conversation memory using LangChain and LangGraph. Master 5 memory strategies, add a Streamlit UI, and persist...
Learn to use the OpenAI API in Python. Master chat completions, streaming responses, error handling, retries, and async calls with runnable examples.
Master Polars with 101 hands-on exercises and solutions — covering DataFrames, groupby, joins, window functions, lazy eval, and more.
Build a text classifier that hits 90%+ accuracy using only prompt engineering. Learn zero-shot, one-shot, and few-shot prompting with hands-on Python examples.
Master prompt engineering basics with Python. Learn the Role-Task-Format framework, zero-shot prompting, and build a testing harness that measures prompt accuracy.
Build an LLM benchmarking platform in Python from scratch. Define test suites, compare providers with raw HTTP, score with LLM-as-judge, and generate reports with...
Build a zero-dep LLM toolkit in Python — unified client for OpenAI, Claude, Gemini, Ollama with cost tracking, retry, streaming, and structured output.
Build a Python benchmark harness comparing Groq, Fireworks, Together AI, and Replicate on latency, throughput, and cost with runnable code.
Master the OpenAI Batch API in Python: build a reusable pipeline for 10,000+ prompts at 50% cost with JSONL formatting, progress polling, and error...
Build a Constitutional AI safety loop in Python. Generate, critique, and revise LLM responses with raw API calls — no fine-tuning needed.
Learn to train custom BPE and WordPiece tokenizers with HuggingFace for medical, legal, and domain-specific NLP. Includes evaluation metrics and code.
Benchmark tiktoken vs HuggingFace Tokenizers on speed, vocabulary, and encoding. Runnable Python code, migration guide, and decision framework for your LLM apps.
Learn how KV caching works in LLMs, calculate VRAM usage for real models, and build a PagedAttention-style cache manager with token eviction in pure...
Build a speculative decoding simulator in Python. Learn the draft-verify algorithm, measure acceptance rates, and understand when it speeds up LLM inference.
Measure and fix LLM position bias with Python. Build a needle-in-haystack test, plot the U-shaped curve, and implement position reordering.
Build a sliding window summarization pipeline in Python that handles documents exceeding LLM context windows. Map-reduce and recursive strategies with raw HTTP API calls.
Build a Mamba SSM layer from scratch in NumPy. Learn selective state spaces, implement selective scan, and benchmark SSM vs attention scaling with runnable...
Build a Mixture of Experts (MoE) layer in Python with NumPy. Router, top-k gating, load balancing, and expert networks — with runnable code and...