Agent Post-Training Playbook
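12 articles · Agent post-training: RL / continual learning / self-improvement · Formulas and code rendered statically (no external CDN; directly accessible from mainland China)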
📍 Learning Path / Roadmap
Start here · Work through the cheatsheets and drills in topic order
Cheatsheets (worked solutions)
7
Agent Evaluation
Agent Foundations
Agent Safety & Alignment
Agentic & Long-horizon RL
Agentic RL Infrastructure
Continual & Lifelong Learning
Self-improving LLMs
Drills (hand-coded)
4
Drill: EWC + Experience Replay from scratch
Drill: ReAct tool-call loop from scratch
Drill: Self-Refine Loop from scratch
Drill: Turn credit assignment from scratch