AI Engineering Masterclass
Build 8 LLM applications and master the tech stacks behind LLMs
By the end of each module, you'll build and deploy a working LLM application.
What You'll Master
This curriculum provides a deep dive into the essential techniques powering the end-to-end LLM lifecycle.
Production-Grade Stacks Across the LLM Lifecycle
You will learn to navigate production-grade stacks, selecting the right tools and architectures tailored to your specific project needs.
By the end of the course, you will be able to implement these techniques and deploy LLM systems into production environments.

LLM lifecycle with key techniques and how major use cases implement these techniques (Created by Kuriko IWAI)
For Developers, Data Scientists, Builders
This course is designed for those who want to bridge the gap between AI Enthusiast and AI Engineer.
- Software Developers: Learn how to integrate LLMs into your existing tech stack without the fluff.
- Data Scientists: Move from static notebooks to functional, agentic applications.
- Self-Taught Builders: Get the diagrams and source code you need to turn your ideas into a portfolio-ready MVP.
The Full Curriculum
From LLM Foundations to Multi-Modal Agents
Module 1
The LLM Backbone: Building a RAG-Based GPT from Scratch
Explore the core mechanism and hands-on implementation of RAG, tokenizer, and inference logic.
Tech Stack: PyTorch · Tensor · HuggingFace · Transformers · Decoder-only LLM · Causal Inference · WARC · Streamlit · uv
You'll Build: Website Summarizer with LLM Configuration Playground

Production Goals:
Implement a custom BPE tokenizer, logits adjustment, and the major decoding methods.
LLM Techniques to Master:
- Perform Common Crawl & Heuristic filtering.
- Build a BPE tokenizer to map text to tokens.
- Adjust logits via logits bias, temperature, and repetition penalty.
- Interactively apply stochastic/deterministic decoding methods.
- Deploy the inference via an API as a microservice.
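The logits-adjustment step above can be sketched in plain Python. This is a toy model of the technique, not the course's implementation: token strings stand in for token ids, and a real decoder would emit a logits tensor per generation step.

```python
import math

def adjust_logits(logits, generated_ids, temperature=1.0,
                  repetition_penalty=1.0, logit_bias=None):
    """Apply logit bias, then repetition penalty, then temperature scaling."""
    out = dict(logits)
    # Additive per-token bias (OpenAI-style logit_bias).
    for tok, bias in (logit_bias or {}).items():
        out[tok] = out.get(tok, 0.0) + bias
    # Penalize tokens already generated (CTRL-style repetition penalty).
    for tok in set(generated_ids):
        if tok in out:
            out[tok] = out[tok] / repetition_penalty if out[tok] > 0 else out[tok] * repetition_penalty
    # Temperature: < 1 sharpens the distribution, > 1 flattens it.
    return {tok: v / temperature for tok, v in out.items()}

def softmax(logits):
    """Turn adjusted logits into a sampling distribution (numerically stable)."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}
```

Greedy (deterministic) decoding then picks the argmax of `softmax(...)`; stochastic decoding samples from it.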
Module 2
Build a Contextual RAG Engine with ColBERT & Notion
Bridge the gap between web browsing and your notes on Notion. Build a Chrome extension using ColBERT to generate personalized summaries that cross-reference your entire Notion workspace for deeper insight and low-latency retrieval.
Tech Stack: ColBERT · Pinecone · Redis · FastAPI
You'll Build: Notion Smart Clipper (Chrome Extension)

Production Goals:
Architect low-latency, multi-source retrieval pipelines.
LLM Techniques to Master:
- Late Interaction (ColBERT) for token-level retrieval.
- Semantic caching for real-time responses.
- Async data ingestion for real-time workspace syncing.
- OpenLLMetry for retrieval pipeline tracing.
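At its core, semantic caching is an embedding-similarity lookup in front of the generator: if a new query is close enough to one already answered, return the cached answer. A minimal in-memory sketch, assuming embeddings come from an external encoder (in the module this layer would be backed by Redis):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

class SemanticCache:
    """Return a cached answer when a new query embedding is close enough
    to a previously seen one, instead of re-running retrieval + generation."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query_emb):
        best, best_sim = None, -1.0
        for emb, answer in self.entries:
            sim = cosine(query_emb, emb)
            if sim > best_sim:
                best, best_sim = answer, sim
        return best if best_sim >= self.threshold else None  # None = cache miss

    def put(self, query_emb, answer):
        self.entries.append((query_emb, answer))
```

The `threshold` value is a tuning knob: too low and unrelated queries share answers, too high and the cache never hits.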
Module 3
The AI Scientist: High-Precision Fitness Auditor
Audit workout logs against PubMed research. Build a system that prioritizes exact terminology and scientific citations.
Tech Stack: BGE-Reranker · Weaviate · Pydantic · RAGAS · Pytest
You'll Build: A Science-Backed Personal Weight Trainer

Production Goals:
Eliminate hallucinations in domain-specific tasks.
LLM Techniques to Master:
- Hybrid Retrieval Fusion (BM25 + Dense Vectors)
- Cross-encoder reranking to filter out noisy results
- Structured extraction with Pydantic for JSON outputs
- Self-correction loops via Corrective RAG (CRAG)
- CI/CD for automated accuracy regression tests
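Hybrid retrieval fusion is often implemented as Reciprocal Rank Fusion (RRF), which merges the BM25 and dense rankings using rank positions alone, so the two scoring scales never need to be calibrated against each other. A minimal sketch; `k=60` is the commonly used default constant:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists (e.g. BM25 and dense retrieval) into one.
    Each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

Documents ranked well by both retrievers float to the top even when neither retriever put them first.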
Module 4
The Digital Soul: Fine-Tuning Your Interactive Persona
Distill your chat history and writing style into a specialized 8B model for zero-latency, high-personality interactions.
Tech Stack: On-device AI · Unsloth · Llama · vLLM · DPO · ShareGPT
You'll Build: Your Digital Clone (Chatbot)

Production Goals:
Encode unique personality traits into model weights.
Compress a large model for on-device inference without needing a $30,000 GPU.
LLM Techniques to Master:
- Model distillation with GPT-4o as a teacher for Llama-3-8B
- Direct Preference Optimization (DPO) for style transfer
- Chat template engineering using Alpaca vs ShareGPT vs Llama3
- Model quantization using GGUF vs AWQ for 4-bit edge deployment
- vLLM serving for optimizing throughput for concurrent persona chats
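The DPO objective itself fits in a few lines. A sketch for a single preference pair, assuming the per-sequence log-probabilities have already been computed by the policy and by the frozen reference model (in practice this runs over batched tensors inside a training loop):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.
    Rewards the policy for widening the chosen-vs-rejected log-prob margin
    relative to the frozen reference model; beta controls the KL strength."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written in the numerically stable form log(1 + e^{-x}).
    return math.log1p(math.exp(-logits))
```

When the policy assigns no extra credit to the chosen response (zero margin), the loss sits at log 2; it falls as the preferred style becomes more likely.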
Module 5
Autonomous Executive Agent
Master the Action layer. Build a fault-tolerant agent that doesn’t just search, but executes multi-step tasks like bookings and scheduling.
LangGraphPEFTTavily APIPostgreSQLFastAPINext.jsYou'll Build: A Hotel Booking Assistant

Production Goals:
Implement persistent agency that survives server crashes.
Manage high-stakes tool calling with a human in the loop.
LLM Techniques to Master:
- QLoRA fine-tuning for high-precision tool (function) calling.
- Parallel tool execution for complex multi-step tasks.
- Adapter-based training to eliminate JSON formatting hallucinations.
- Interrupt-driven workflows (Human-in-the-loop).
- Fault-tolerant retry logic for API failures.
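The fault-tolerant retry logic above is typically exponential backoff with jitter. A minimal sketch; `with_retries` and its defaults are illustrative names, not a library API:

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=0.5,
                 retriable=(TimeoutError, ConnectionError)):
    """Call fn(); on a retriable failure, back off exponentially (with a
    little jitter to avoid thundering herds) and try again, re-raising
    once the attempt budget is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts:
                raise  # budget spent: surface the error to the caller
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random() * 0.1)
            time.sleep(delay)
```

Only transient errors (timeouts, dropped connections) are retried; a 4xx-style logical failure should propagate immediately rather than burn the budget.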
Module 6
System-2 Thinking Vibe Coder
Implement inference-time scaling. Build a vibe coding partner that simulates debate and self-critique before finalizing a decision.
Tech Stack: System-2 Thinking · LangGraph · DeepSeek-R1 · OpenRouter · Python
You'll Build: Vibe Coding Assistant

Production Goals:
Maximize logical depth to solve complex problems with Tree-of-Thoughts.
LLM Techniques to Master:
- Tree-of-Thoughts (ToT) for exploring multiple reasoning branches
- Self-reflection & critique
- Sequential workflows from Planning -> Execution -> Verification
- Adversarial multi-agent debate to find consensus through conflict
- Inference-time compute using DeepSeek-R1 style reasoning
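Tree-of-Thoughts can be reduced to a beam search over partial "thoughts." A toy sketch assuming you supply an `expand` function (candidate next thoughts) and a `score` function; in the module, both of those would be LLM calls:

```python
def tree_of_thoughts(root, expand, score, beam_width=2, depth=3):
    """Breadth-first ToT: at each step, expand every candidate thought,
    score the children, and keep only the top `beam_width` branches."""
    frontier = [root]
    for _ in range(depth):
        children = [child for thought in frontier for child in expand(thought)]
        if not children:
            break  # no further expansions possible
        frontier = sorted(children, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)  # best complete line of reasoning
```

This is where inference-time compute scales: wider beams and deeper trees buy more reasoning at the cost of more model calls per answer.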
Module 7
The Smart Fridge: Visual Grounding & Multi-modal RAG
Identify ingredients from photos and generate real-time meal plans using multi-modal retrieval logic.
Tech Stack: LLaVA · TensorRT · PEFT · ChromaDB · Next.js
You'll Build: A Smart Fridge App

Production Goals:
Achieve accurate object detection and inventory matching in unconstrained physical environments.
LLM Techniques to Master:
- Fine-tuning the multi-modal projector for object alignment with Vision-LoRA
- Multi-modal embeddings for searching by image similarity
- Optimizing multi-modal inference latency with TensorRT-LLM
- Measuring the accuracy of image-text alignment with CLIP score
- Video latent caching: Exploring frame-by-frame inventory tracking
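Measuring image-text alignment reduces to a floored, scaled cosine similarity between the two modality embeddings. A sketch following the common CLIPScore formulation, w · max(cos, 0) with w = 2.5, assuming the embeddings come from a CLIP-style encoder:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def clip_score(image_emb, text_emb, w=2.5):
    """CLIPScore-style image-text alignment: scaled, floored cosine
    similarity between the image and text embeddings (higher = better)."""
    return w * max(cosine(image_emb, text_emb), 0.0)
```

The floor at zero means "actively misaligned" and "unrelated" both score 0, which keeps the metric comparable across caption sets.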
Module 8
The AI Architect: Production Governance & Fiscal Control
Bridge the gap between functionality and security. Build a centralized governance layer to automate red-teaming, enforce compliance, and audit every token for maximum ROI.
Tech Stack: LlamaGuard · NeMo Guardrails · Weights & Biases · AWS/GCP
You'll Build: An Enterprise LLM Guardrails & Cost-Per-Query Dashboard

Production Goals:
Transition from prototypes to enterprise systems by integrating real-time guardrails and unit-economic tracking.
LLM Techniques to Master:
- LLM-as-a-Judge for automating regression testing and output validation.
- Automated PII redaction and data masking.
- Token ROI modeling for cost-to-value analysis.
- Prompt versioning with Git-based prompt management.
- Red-teaming simulations against prompt injection and jailbreak attacks.
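Automated PII redaction can start as simple pattern masking applied before text ever reaches the model or the logs. A minimal regex-only sketch; the patterns and labels here are illustrative, and production systems (e.g. Microsoft Presidio) layer NER-based detectors on top of rules like these:

```python
import re

# Hypothetical minimal patterns; real deployments need far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text):
    """Mask detected PII spans so prompts, completions, and traces can be
    stored or audited without leaking personal data."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Running this on both the inbound prompt and the outbound completion gives the guardrails dashboard a privacy-safe audit trail.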
✅ Bespoke Implementation
This curriculum provides the full architectural blueprint to build and own your AI stack.
For those looking to bypass the setup phase and go straight to production, I provide high-touch engineering support to integrate these systems into your unique business environment.
Accelerate Your Deployment with Expert Engineering
If you require a direct path to a live MVP, I offer specialized boutique services:
1. 14-Day AI Implementation Sprint
I'll deploy a custom RAG or Agentic workflow (based on the course architectures) into your private cloud environment, tailored to your proprietary data.
2. Data Pipeline Engineering
Transform raw, unstructured data into high-fidelity assets for your LLMs.
3. Reliability & Security Audit
Audit cost, latency, reliability, and security of your AI pipelines.