Arxiv Insights

Curated Research Insights • AI & Machine Learning

Agent Systems • 4.6

AstroVLM: Expert Multi-agent Collaborative Reasoning for Astronomical Imaging Quality Diagnosis

Yaohui Han, Tianshuo Wang, Zixi Zhao, Zhengchun Zhu, Shuo Ren, Yiru Wang, Rongliang Fu, Tinghuan Chen, Tsung-Yi Ho
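A multi-agent framework for astronomical image quality diagnosis that uses pipeline-stage division of labor, agent-specific knowledge retrieval, and backtracking collaborative reasoning to move the task from "assigning a score" to "locating the source of the error."
astronomical image quality diagnosis • multi-agent collaborative reasoning • retrieval-augmented generation • knowledge graph partitioning • reverse backtracking reasoning
Agent Systems • 4.9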

Do LLMs Need to See Everything? A Benchmark and Study of Failures in LLM-driven Smartphone Automation using Screentext vs. Screenshots

Shiquan Zhang, Tianyi Zhang, Le Fang, Simon D'Alfonso, Hong Jia, Vassilis Kostakos
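A systematic comparison of screentext and screenshot inputs across 75 real-world tasks in 25 Android apps, finding that screenshots bring only small gains while significantly raising cost, and that the main root causes of failure are gaps in UI accessibility and parsing.
mobile agents • smartphone automation • screen text • screenshot modality • failure analysis
Agent Systems • 4.0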

Persona-Based Requirements Engineering for Explainable Multi-Agent Educational Systems: A Scenario Simulator for Clinical Reasoning Training

Weibing Zheng, Laurah Turner, Jess Kropczynski, Matthew Kelleher, Murat Ozer, Shane Halse
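Embeds AI Personas and XAI user stories into the requirements engineering process, moving the explainability requirements of multi-agent educational systems from a late-stage patch to early design, and validates the approach with a case study in a clinical reasoning training simulator.
requirements engineering • explainable AI • multi-agent educational systems • clinical reasoning training • personas
Agent Systems • 4.7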

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan, Asini Subanya, Boubacar Ballo, Kashish Satija, Mariam Shafey, Mohamed Mahmoud, Moncif Dahaji Bouffi, Pasindu Wickramasinghe, Siyona Goel, Yaakulya Sabbani, Hakim Hacid, Mthandazo Ndhlovu, Eleanna Kafeza, Sanjay Rawat, Muhammad Shafique
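Uses RAG and a multi-agent pipeline to automatically turn vulnerable code into Google Project Zero-style vulnerability root cause analysis reports, and uses an LLM Judge to evaluate their structure, factual accuracy, and remediation quality.
vulnerability report generation • retrieval-augmented generation • multi-agent systems • memory corruption analysis • LLM evaluation
Agent Systems • 6.6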

TacticGen: Grounding Adaptable and Scalable Generation of Football Tactics

Sheng Xu, Guiliang Liu, Tarak Kharrat, Yudong Luo, Mohamed Aloulou, Javier López Peña, Konstantin Sofeikov, Adam Reid, Paul Roberts, Steven Spencer, Joe Carnall, Ian McHale, Oliver Schulte, Hongyuan Zha, Wei-Shi Zheng
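Uses a multi-agent diffusion transformer to model football tactic generation as a conditional trajectory generation problem that can be guided jointly by rules, natural language, and value functions, and achieves scalable tactical design on large-scale real match data.
football tactic generation • multi-agent diffusion models • trajectory prediction • classifier guidance • expert evaluation
Agent Systems • 4.7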

Dynamics of Cognitive Heterogeneity: Investigating Behavioral Biases in Multi-Stage Supply Chains with LLM-Based Simulation

Jiuyun Jiang, Yuecheng Hong, Bo Yang, Jin Yang, Guangxin Jiang, Xiaomeng Guo, Guang Xiao
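Uses DeepSeek/GPT agents with hierarchical reasoning in the Beer Distribution Game to systematically examine cognitive heterogeneity, finding that high capability at a single node does not eliminate the bullwhip effect and that information sharing is the key to stabilizing supply and demand fluctuations.
LLM simulation • Beer Distribution Game • bullwhip effect • cognitive heterogeneity • information sharing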