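00:00:30 AI's "mistake notebook": how are experts made?
00:03:50 AI's self-cultivation: how can it keep improving like an expert?
00:07:10 AI as apprentice: how can a machine teach itself to surpass its master?
00:11:25 AI's art of "playing dumb": can we still trust it?
The four papers covered in this episode: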
[CL] WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework
[Microsoft & Peking University]
https://arxiv.org/abs/2508.01245
---
[LG] Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning
[Meta Reality Labs]
https://arxiv.org/abs/2508.01543
---
[LG] CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search
[University of Washington & DeepReinforce Team]
https://arxiv.org/abs/2508.02091
---
[LG] LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring
[University College London]
https://arxiv.org/abs/2508.00943