00:00:32 AI can be "lopsided" too? How top performers step out of their comfort zone
00:05:55 The evolution of AI: from "asking for handouts" to "the secret of the experts"
00:10:42 A new approach to AI "health checkups": how do you see through a model's "hidden intentions"?
00:15:17 What one million students taught us: simple may well be the optimal solution
The four papers covered in this episode:
[LG] RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization
[Tongyi Lab, Alibaba Group & Peking University]
https://arxiv.org/abs/2508.00222
---
[LG] MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning
[BAAI]
https://arxiv.org/abs/2508.00271
---
[LG] Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs
[Carnegie Mellon University (CMU)]
https://arxiv.org/abs/2508.00161
---
[LG] Learning to Optimize Feedback for One Million Students: Insights from Multi-Armed and Contextual Bandits in Large-Scale Online Tutoring
[Carnegie Mellon University (CMU) & CK-12 Foundation]
https://arxiv.org/abs/2508.00270