Training Techniques

Frontier training recipes — reasoning, alignment, and efficient fine-tuning

Open-source, reproducible methods that moved the frontier: GRPO-based reasoning RL (DeepSeek-R1), DPO as a replacement for RLHF, and parameter-efficient adaptation via LoRA.
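
As a rough orientation, the core computation behind each of the three methods can be sketched in a few lines of dependency-free Python. This is an illustrative sketch, not any project's reference implementation: the function names, the default `beta=0.1`, the `eps` term, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each sampled completion's reward
    by the mean and std of its own group (all samples for one prompt).
    Assumption: population std plus a small eps to avoid division by zero."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin), where
    margin is the policy-vs-reference log-ratio of the chosen response
    minus that of the rejected response."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin) == -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def lora_forward(x, W0, A, B, alpha, r):
    """LoRA forward pass: y = W0 x + (alpha / r) * B (A x).
    W0 is the frozen d x k weight; A (r x k) and B (d x r) are the
    trainable low-rank factors. Matrices are plain lists of rows."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    base = matvec(W0, x)            # frozen path
    update = matvec(B, matvec(A, x))  # low-rank path, rank r
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]
```

With `B` initialized to zeros, `lora_forward` returns exactly the frozen model's output, which is why LoRA fine-tuning starts from the pretrained behavior. Likewise, `dpo_loss` equals `log 2` when the policy and reference agree, and falls as the policy prefers the chosen response more strongly than the reference does.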