# Publications

<!--
Edit this file to maintain publication data. publication.html reads this Markdown directly in the browser.
Keep unpublished arXiv/preprint work in the Preprint section; year sections should contain published work only.
-->

## Preprint

### Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation.
- Authors: Qianhao Yuan, Jie Lou, Xing Yu, Hongyu Lin, Le Sun, Xianpei Han and Yaojie Lu.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arxiv.2605.18740)

### LiveFMBench: Unveiling the Power and Limits of Agentic Workflows in Specification Generation.
- Authors: Dong Xu, Jialun Cao, Guozhao Mo, Junjie Hu, Cheng Wen, Hongyu Lin, Xianpei Han, Shengchao Qin, Cong Tian, Shing-Chi Cheung, Le Sun and Yaojie Lu.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arxiv.2605.01394)

### Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards.
- Authors: Mengjie Ren, Jie Lou, Boxi Cao, Xueru Wen, Hongyu Lin, Xianpei Han, Le Sun, Xing Yu and Yaojie Lu.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arxiv.2605.14539)

### Scalable Oversight for Superhuman AI via Recursive Self-Critiquing.
- Authors: Xueru Wen, Jie Lou, Xinyu Lu, Junjie Yang, Yanjiang Liu, Yaojie Lu, Debing Zhang and Xing Yu.
- Venue: arXiv 2025
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2502.04675)

### P^2O: Joint Policy and Prompt Optimization.
- Authors: Xinyu Lu, Kaiqi Zhang, Jinglin Yang, Boxi Cao, Yaojie Lu, Hongyu Lin, Min He, Xianpei Han and Le Sun.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2603.21877)

### Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces.
- Authors: Jiawei Chen, Ruoxi Xu, Boxi Cao, Ruotong Pan, Yunfei Zhang, Yifei Hu, Yong Du, Tingting Gao, Yaojie Lu, Yingfei Sun, Xianpei Han, Le Sun, Xiangyu Wu and Hongyu Lin.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2604.08362)

### Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models.
- Authors: Xinru Yan, Boxi Cao, Yaojie Lu, Hongyu Lin, Weixiang Zhou, Le Sun and Xianpei Han.
- Venue: arXiv 2026
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2604.16902)

### Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction.
- Authors: Tianyun Zhong, Guozhao Mo, Yanjiang Liu, Yihan Chen, Lingdi Kong, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He and Le Sun.
- Venue: arXiv 2025
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2507.16271)

### Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window.
- Authors: Qiaoyu Tang, Hao Xiang, Le Yu, Bowen Yu, Yaojie Lu, Xianpei Han, Le Sun, WenJuan Zhang, Pengbo Wang, Shixuan Liu, Zhenru Zhang, Jianhong Tu, Hongyu Lin and Junyang Lin.
- Venue: arXiv 2025
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2510.08276)

### RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback.
- Authors: Qiaoyu Tang, Hao Xiang, Le Yu, Bowen Yu, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun and Junyang Lin.
- Venue: arXiv 2025
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2507.15024)

### Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models.
- Authors: Jiasheng Zheng, Boxi Cao, Zhengzhao Ma, Ruotong Pan, Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: arXiv 2024
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2407.11470)

### Multi-Facet Counterfactual Learning for Content Quality Evaluation.
- Authors: Jiasheng Zheng, Hongyu Lin, Boxi Cao, Meng Liao, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: arXiv 2024
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2410.07693)

### Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering.
- Authors: Xinyan Guan, Yanjiang Liu, Xinyu Lu, Boxi Cao, Ben He, Xianpei Han, Le Sun, Jie Lou, Bowen Yu, Yaojie Lu and Hongyu Lin.
- Venue: arXiv 2024
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2411.11504)

### Towards Scalable Automated Alignment of LLMs: A Survey.
- Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin and Bowen Yu.
- Venue: arXiv 2024
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2406.01252)

### URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression.
- Authors: Zhuoqun Li, Hongyu Lin, Tianshu Wang, Boxi Cao, Yaojie Lu, Weixiang Zhou, Hao Wang, Zhenyu Zeng, Le Sun and Xianpei Han.
- Venue: arXiv 2024
- Year: preprint
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2404.16248)

## 2026

### Coupled Variational Reinforcement Learning for Language Model General Reasoning.
- Authors: Xueru Wen, Jie Lou, Yanjiang Liu, Hongyu Lin, Ben He, Xianpei Han, Le Sun, Yaojie Lu and Debing Zhang.
- Venue: ICML 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2512.12576)

### Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning.
- Authors: Zichao Li, Jie Lou, Fangchen Dong, Zhiyuan Fan, Mengjie Ren, Hongyu Lin, Xianpei Han, Debing Zhang, Le Sun, Yaojie Lu and Xing Yu.
- Venue: ICML 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2603.10535)

### MetaphorVU: Towards Metaphorical Video Understanding.
- Authors: Zhuoqun Li, Boxi Cao, Guiping Jiang, Fangrui Lv, Ruotong Pan, Jianan Wang, Xiangyu Wu, Hongyu Lin, Yaojie Lu, Yong Du, Ruyin Jia, Liyan, Tingting Gao, Han Li, Xianpei Han and Le Sun.
- Venue: ICML 2026
- Year: 2026
- Links:

### Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards.
- Authors: Zhengzhao Ma, Xueru Wen, Boxi Cao, Yaojie Lu, Hongyu Lin, Jinglin Yang, Min He, Xianpei Han and Le Sun.
- Venue: ICML 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2603.09117)

### Towards Multimodal Large Language Models with Both Training and Inference Efficiency.
- Authors: Qianhao Yuan, Yanjiang Liu, Guozhao Mo, Yaojie Lu, Hongyu Lin, Jia Zheng, Ben He, Xianpei Han and Le Sun.
- Venue: ICML 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2502.02458)

### Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models.
- Authors: Yanjiang Liu, Shuheng Zhou, Yaojie Lu, Huijia Zhu, Weiqiang Wang, Hongyu Lin, Ben He, Xianpei Han and Le Sun.
- Venue: ICLR 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2501.01830)

### DeepRAG: Thinking to Retrieve Step by Step for Large Language Models.
- Authors: Xinyan Guan, Jiali Zeng, Fandong Meng, Chunlei Xin, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun and Jie Zhou.
- Venue: ICLR 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2502.01142)

### All Languages Matter! Understanding Language Bias in Multilingual Evidence Reranking for RAG.
- Authors: Dan Wang, Guozhao Mo, Yafei Shi, Cheng Zhang, Bo Zheng, Boxi Cao, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han and Le Sun.
- Venue: ACL 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2604.20199)

### PaperRegister: Boosting Flexible-grained Paper Search via Hierarchical Register Indexing.
- Authors: Zhuoqun Li, Xuanang Chen, Hongyu Lin, Yaojie Lu, Xianpei Han, Shanshan Jiang, Bin Dong and Le Sun.
- Venue: ACL 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2508.11116)

### ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models.
- Authors: Jiasheng Zheng, Xin Zheng, Boxi Cao, Pengbo Wang, Zhengzhao Ma, Qiming Zhu, Jiazhen Jiang, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL 2026 Demo
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2604.27467)

### MemSearcher: Iterative Memory Integration for Search Agent via End-to-End Reinforcement Learning.
- Authors: Qianhao Yuan, Jie Lou, Zichao Li, Jiawei Chen, Yaojie Lu, Hongyu Lin, Le Sun, Debing Zhang and Xianpei Han.
- Venue: ACL Findings 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2511.02805)

### Knowing When to Quit: Diagnosing and Training LLMs to Abort Futile Reasoning.
- Authors: Xinyan Guan, Jiali Zeng, Chunlei Xin, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun and Fandong Meng.
- Venue: ACL Findings 2026
- Year: 2026
- Links:

### When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models.
- Authors: Yingzhi Mao, Chunkang Zhang, Junxiang Wang, Xinyan Guan, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL Findings 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2510.21285)

### Across Programming Language Silos: A Study on Cross-Lingual Retrieval-Augmented Code Generation.
- Authors: Qiming Zhu, Jialun Cao, Xuanang Chen, Weili Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun and Shing-Chi Cheung.
- Venue: ACL Findings 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2506.03535)

### DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation.
- Authors: Hao Zheng, Guozhao Mo, Xinru Yan, Qianhao Yuan, Wenkai Zhang, Xuanang Chen, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL Findings 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2602.22839)

### On the Plasticity of Delta Parameters in Post-Trained Models.
- Authors: Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL Findings 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2410.13841)

### FlashAgents: Accelerating Multi-Agent LLM Systems via Streaming Prefill Overlap.
- Authors: Taosong Fang, Zhen Zheng, Zhengzhao Ma, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: MLSys 2026
- Year: 2026
- Links:

### AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing.
- Authors: Qingyu Zhang, Chunlei Xin, Xuanang Chen, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Qing Ye, Qianlong Xie and Xingxing Wang.
- Venue: AAAI 2026
- Year: 2026
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v40i41.40781)
  - [arXiv](https://doi.org/10.48550/arXiv.2511.12133)

### EmbedAgent: Benchmarking Large Language Models in Embedded System Development.
- Authors: Ruiyang Xu, Jialun Cao, Mingyuan Wu, Wenliang Zhong, Yaojie Lu, Ben He, Xianpei Han, Shing-Chi Cheung and Le Sun.
- Venue: ICSE 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2506.11003)

### LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
- Authors: Guozhao Mo, Wenliang Zhong, Jiawei Chen, Qianhao Yuan, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han and Le Sun.
- Venue: KDD 2026
- Year: 2026
- Links:
  - [arXiv](https://doi.org/10.48550/arXiv.2508.01780)

### Answer First, Evidence Second? Uncovering Hidden Risks in Well-Structured AI Search Summaries.
- Authors: Jinman Li, Xuanang Chen, Ruoxi Xu, Hongyu Lin, Yaojie Lu, Zecheng Fan, Xianpei Han and Le Sun.
- Venue: SIGIR 2026
- Year: 2026
- Links:

### Navigating the Infinite Dynamic Web Space: Effective In-Context Exploration via Cognitive Multi-Agent Collaboration.
- Authors: Guozhao Mo, Yanjiang Liu, Yafei Shi, Jiawei Chen, Yang Li, Yaojie Lu, Hongyu Lin, Ben He, Le Sun, Bo Zheng and Xianpei Han.
- Venue: EACL 2026
- Year: 2026
- Links:
  - [Paper](https://aclanthology.org/2026.eacl-long.384/)

### Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models.
- Authors: Qiao Liang, Yanjiang Liu, Weixiang Zhou, Ben He, Yaojie Lu, Hongyu Lin, Jia Zheng, Xianpei Han, Le Sun and Yingfei Sun.
- Venue: EACL 2026
- Year: 2026
- Links:
  - [Paper](https://aclanthology.org/2026.eacl-long.146/)
  - [arXiv](https://doi.org/10.48550/arXiv.2503.18034)

## 2025

### The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models.
- Authors: Zichao Li, Xueru Wen, Jie Lou, Yuqiu Ji, Yaojie Lu, Xianpei Han, Debing Zhang and Le Sun.
- Venue: ICML 2025
- Year: 2025
- Links:
  - [Paper](https://proceedings.mlr.press/v267/li25cw.html)
  - [arXiv](https://doi.org/10.48550/arXiv.2503.03122)

### Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
- Authors: Xueru Wen, Jie Lou, Yaojie Lu, Hongyu Lin, Xing Yu, Xinyu Lu, Ben He, Xianpei Han, Debing Zhang and Le Sun.
- Venue: ICLR 2025
- Year: 2025
- Links:
  - [Paper](https://openreview.net/forum?id=Cnwz9jONi5)
  - [arXiv](https://doi.org/10.48550/arXiv.2410.05584)

### StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization.
- Authors: Zhuoqun Li, Xuanang Chen, Haiyang Yu, Hongyu Lin, Yaojie Lu, Qiaoyu Tang, Fei Huang, Xianpei Han, Le Sun and Yongbin Li.
- Venue: ICLR 2025
- Year: 2025
- Links:
  - [Paper](https://openreview.net/forum?id=GhexuBLxbO)
  - [arXiv](https://doi.org/10.48550/arXiv.2410.08815)

### The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model.
- Authors: Jiawei Chen, Wentao Chen, Jing Su, Jingjing Xu, Hongyu Lin, Mengjie Ren, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ICLR 2025
- Year: 2025
- Links:
  - [Paper](https://openreview.net/forum?id=eznTVIM3bs)
  - [arXiv](https://doi.org/10.48550/arXiv.2412.07298)

### CRUXEVAL-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution.
- Authors: Ruiyang Xu, Jialun Cao, Yaojie Lu, Ming Wen, Hongyu Lin, Xianpei Han, Ben He, Shing-Chi Cheung and Le Sun.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.1158/)
  - [arXiv](https://doi.org/10.48550/arXiv.2408.13001)

### Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch.
- Authors: Xueru Wen, Jie Lou, Zichao Li, Yaojie Lu, Xing Yu, Yuqiu Ji, Guohai Xu, Hongyu Lin, Ben He, Xianpei Han, Le Sun and Debing Zhang.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.737/)
  - [arXiv](https://doi.org/10.48550/arXiv.2502.17173)

### DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking.
- Authors: Zhuoqun Li, Haiyang Yu, Xuanang Chen, Hongyu Lin, Yaojie Lu, Fei Huang, Xianpei Han, Yongbin Li and Le Sun.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.220/)
  - [arXiv](https://doi.org/10.48550/arXiv.2502.20730)

### From Informal to Formal - Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs.
- Authors: Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung and Cong Tian.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.1310/)
  - [arXiv](https://doi.org/10.48550/arXiv.2501.16207)

### Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning.
- Authors: Ruoxi Xu, Yunjie Ji, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Ben He, Yingfei Sun, Xiangang Li and Le Sun.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.1392/)
  - [arXiv](https://doi.org/10.48550/arXiv.2504.00472)

### Sparse Latents Steer Retrieval-Augmented Generation.
- Authors: Chunlei Xin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Xuanang Chen, Xinyan Guan, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.acl-long.228/)

### Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic.
- Authors: Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang and Le Sun.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.89/)
  - [arXiv](https://doi.org/10.48550/arXiv.2408.16326)

### On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation.
- Authors: Xueru Wen, Jie Lou, Xinyu Lu, Yuqiu Ji, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Debing Zhang and Le Sun.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.271/)
  - [arXiv](https://doi.org/10.48550/arXiv.2406.12221)

### READoc: A Unified Benchmark for Realistic Document Structured Extraction.
- Authors: Zichao Li, Aizier Abulaiti, Yaojie Lu, Xuanang Chen, Jia Zheng, Hongyu Lin, Xianpei Han, Shanshan Jiang, Bin Dong and Le Sun.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.1128/)
  - [arXiv](https://doi.org/10.48550/arXiv.2409.05137)

### Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models.
- Authors: Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou and Junyang Lin.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.473/)

### ShortGPT: Layers in Large Language Models are More Redundant Than You Expect.
- Authors: Xin Men, Mingyu Xu, Qingyu Zhang, Qianhao Yuan, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han and Weipeng Chen.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.1035/)
  - [arXiv](https://doi.org/10.48550/arXiv.2403.03853)

### The Linguistic Connectivities Within Large Language Models.
- Authors: Dan Wang, Boxi Cao, Ning Bian, Xuanang Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Le Sun, Shanshan Jiang, Bin Dong and Xianpei Han.
- Venue: ACL Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-acl.456/)

### AutoAlign: Get Your LLM Aligned with Minimal Annotations.
- Authors: Xinyu Lu, Dong Xu, Chunkang Zhang, Xinyan Guan, Junxiang Wang, Qingyu Zhang, Pengbo Wang, Yingzhi Mao, Hao Xiang, Xueru Wen, Zichao Li, Yaojie Lu, Hongyu Lin, Le Sun and Xianpei Han.
- Venue: ACL 2025 Demo
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.18653/v1/2025.acl-demo.19)

### ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers.
- Authors: Qianhao Yuan, Qingyu Zhang, Yanjiang Liu, Jiawei Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Xianpei Han and Le Sun.
- Venue: ICCV 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.1109/ICCV51701.2025.00038)
  - [arXiv](https://doi.org/10.48550/arXiv.2504.00502)

### DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation.
- Authors: Qiming Zhu, Jialun Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun and Shing-Chi Cheung.
- Venue: AAAI 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v39i24.34811)
  - [arXiv](https://doi.org/10.48550/arXiv.2408.13204)

### ConsistentChat: Building Skeleton-Guided Consistent Multi-Turn Dialogues for Large Language Models from Scratch.
- Authors: Jiawei Chen, Xinyan Guan, Qianhao Yuan, Guozhao Mo, Weixiang Zhou, Yaojie Lu, Hongyu Lin, Ben He, Le Sun and Xianpei Han.
- Venue: EMNLP 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.18653/v1/2025.emnlp-main.424)
  - [arXiv](https://doi.org/10.48550/arXiv.2506.03558)

### PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides.
- Authors: Hao Zheng, Xinyan Guan, Hao Kong, Wenkai Zhang, Jia Zheng, Weixiang Zhou, Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: EMNLP 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.18653/v1/2025.emnlp-main.728)
  - [arXiv](https://doi.org/10.48550/arXiv.2501.03936)

### RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing.
- Authors: Hao Xiang, Tianyi Tang, Yang Su, Bowen Yu, An Yang, Fei Huang, Yichang Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Jingren Zhou, Junyang Lin and Le Sun.
- Venue: EMNLP Findings 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.findings-emnlp.730/)
  - [arXiv](https://doi.org/10.48550/arXiv.2507.20352)

### Teach Small Models to Reason by Curriculum Distillation.
- Authors: Wangyi Jiang, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: EMNLP 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.18653/v1/2025.emnlp-main.376)

### Transferable Post-training via Inverse Value Learning.
- Authors: Xinyu Lu, Xueru Wen, Yaojie Lu, Bowen Yu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han and Yongbin Li.
- Venue: NAACL 2025
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.18653/v1/2025.naacl-long.227)
  - [arXiv](https://doi.org/10.48550/arXiv.2410.21027)

### Aligning Retrieval with Reader Needs: Reader-Centered Passage Selection for Open-Domain Question Answering.
- Authors: Chunlei Xin, Shuheng Zhou, Xuanang Chen, Yaojie Lu, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han and Le Sun.
- Venue: COLING 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.coling-main.67/)

### Improved Sparse Upcycling for Instruction Tuning.
- Authors: Wangyi Jiang, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: COLING 2025
- Year: 2025
- Links:
  - [Paper](https://aclanthology.org/2025.coling-main.636/)

### Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns.
- Authors: Ning Bian, Hongyu Lin, Peilin Liu, Yaojie Lu, Chunkang Zhang, Ben He, Xianpei Han and Le Sun.
- Venue: IEEE Trans. Comput. Soc. Syst.
- Year: 2025
- Links:
  - [Paper](https://doi.org/10.1109/TCSS.2024.3476030)

## 2024

### Self-Retrieval: End-to-End Information Retrieval with One Large Language Model.
- Authors: Qiaoyu Tang, Jiawei Chen, Zhuoqun Li, Bowen Yu, Yaojie Lu, Cheng Fu, Haiyang Yu, Hongyu Lin, Fei Huang, Ben He, Xianpei Han, Le Sun and Yongbin Li.
- Venue: NeurIPS 2024
- Year: 2024
- Links:
  - [Paper](http://papers.nips.cc/paper_files/paper/2024/hash/741ad162ab0f3da6f9aad60e9e34f5f1-Abstract-Conference.html)
  - [arXiv](https://doi.org/10.48550/arXiv.2403.00801)

### Open Grounded Planning: Challenges and Benchmark Construction.
- Authors: Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.acl-long.272)
  - [arXiv](https://doi.org/10.48550/arXiv.2406.02903)

### Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
- Authors: Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He and Le Sun.
- Venue: ACL 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.acl-long.221)
  - [arXiv](https://doi.org/10.48550/arXiv.2402.14355)

### Debiasing In-Context Learning by Instructing LLMs How to Follow Demonstrations.
- Authors: Lvxue Li, Jiaqi Chen, Xinyu Lu, Yaojie Lu, Hongyu Lin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han and Le Sun.
- Venue: ACL Findings 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.findings-acl.430)

### REInstruct: Building Instruction Data from Unlabeled Corpus.
- Authors: Shu Chen, Xinyan Guan, Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL Findings 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.findings-acl.408)
  - [arXiv](https://doi.org/10.48550/arXiv.2408.10663)

### SoFA: Shielded On-the-fly Alignment via Priority Rule Following.
- Authors: Xinyu Lu, Bowen Yu, Yaojie Lu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han and Yongbin Li.
- Venue: ACL Findings 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.findings-acl.424)
  - [arXiv](https://doi.org/10.48550/arXiv.2402.17358)

### XMC-Agent: Dynamic Navigation over Scalable Hierarchical Index for Incremental Extreme Multi-label Classification.
- Authors: Yanjiang Liu, Tianyun Zhong, Yaojie Lu, Hongyu Lin, Ben He, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han and Le Sun.
- Venue: ACL Findings 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.findings-acl.336)

### Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting.
- Authors: Xinyan Guan, Yanjiang Liu, Hongyu Lin, Yaojie Lu, Ben He, Xianpei Han and Le Sun.
- Venue: AAAI 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v38i16.29770)
  - [arXiv](https://doi.org/10.48550/arXiv.2311.13314)

### Chain-of-Rewrite: Aligning Question and Documents for Open-Domain Question Answering.
- Authors: Chunlei Xin, Yaojie Lu, Hongyu Lin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han and Le Sun.
- Venue: EMNLP Findings 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.findings-emnlp.104)

### Seg2Act: Global Context-aware Action Generation for Document Logical Structuring.
- Authors: Zichao Li, Shaojie He, Meng Liao, Xuanang Chen, Yaojie Lu, Hongyu Lin, Yanxiong Lu, Xianpei Han and Le Sun.
- Venue: EMNLP 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.18653/v1/2024.emnlp-main.1003)
  - [arXiv](https://doi.org/10.48550/arXiv.2410.06802)

### Beyond Full Fine-tuning: Harnessing the Power of LoRA for Multi-Task Instruction Tuning.
- Authors: Chunlei Xin, Yaojie Lu, Hongyu Lin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Zhongyi Liu, Xianpei Han and Le Sun.
- Venue: COLING 2024
- Year: 2024
- Links:
  - [Paper](https://aclanthology.org/2024.lrec-main.206)

### ChatGPT Is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models.
- Authors: Ning Bian, Xianpei Han, Le Sun, Hongyu Lin, Yaojie Lu, Ben He, Shanshan Jiang and Bin Dong.
- Venue: COLING 2024
- Year: 2024
- Links:
  - [Paper](https://aclanthology.org/2024.lrec-main.276)
  - [arXiv](https://doi.org/10.48550/arXiv.2303.16421)

### Executing Natural Language-Described Algorithms with Large Language Models: An Investigation.
- Authors: Xin Zheng, Qiming Zhu, Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: COLING 2024
- Year: 2024
- Links:
  - [Paper](https://aclanthology.org/2024.lrec-main.596)
  - [arXiv](https://doi.org/10.48550/arXiv.2403.00795)

### Few-shot Named Entity Recognition via Superposition Concept Discrimination.
- Authors: Jiawei Chen, Hongyu Lin, Xianpei Han, Yaojie Lu, Shanshan Jiang, Bin Dong and Le Sun.
- Venue: COLING 2024
- Year: 2024
- Links:
  - [Paper](https://aclanthology.org/2024.lrec-main.634)
  - [arXiv](https://doi.org/10.48550/arXiv.2403.16463)

### Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models.
- Authors: Zhuoqun Li, Hongyu Lin, Yaojie Lu, Hao Xiang, Xianpei Han and Le Sun.
- Venue: COLING 2024
- Year: 2024
- Links:
  - [Paper](https://aclanthology.org/2024.lrec-main.980)
  - [arXiv](https://doi.org/10.48550/arXiv.2403.09750)

### Pattern Shifting or Knowledge Losing? A Forgetting Perspective for Understanding the Effect of Instruction Fine-Tuning.
- Authors: Chunkang Zhang, Boxi Cao, Yaojie Lu, Hongyu Lin, Liu Cao, Ke Zeng, Guanglu Wan, Xunliang Cai, Xianpei Han and Le Sun.
- Venue: CCL 2024
- Year: 2024
- Links:
  - [Paper](https://doi.org/10.1007/978-981-97-8367-0_32)

## 2023

### Learning In-context Learning for Named Entity Recognition.
- Authors: Jiawei Chen, Yaojie Lu, Hongyu Lin, Jie Lou, Wei Jia, Dai Dai, Hua Wu, Boxi Cao, Xianpei Han and Le Sun.
- Venue: ACL 2023
- Year: 2023
- Links:
  - [Paper](https://doi.org/10.18653/v1/2023.acl-long.764)
  - [arXiv](https://doi.org/10.48550/arXiv.2305.11038)

### Universal Information Extraction as Unified Semantic Matching.
- Authors: Jie Lou, Yaojie Lu, Dai Dai, Wei Jia, Hongyu Lin, Xianpei Han, Le Sun and Hua Wu.
- Venue: AAAI 2023
- Year: 2023
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v37i11.26563)
  - [arXiv](https://doi.org/10.48550/arXiv.2301.03282)

### Testing Coreference Resolution Systems without Labeled Test Sets.
- Authors: Jialun Cao, Yaojie Lu, Ming Wen and Shing-Chi Cheung.
- Venue: ESEC/SIGSOFT FSE 2023
- Year: 2023
- Links:
  - [Paper](https://doi.org/10.1145/3611643.3616258)

### Document Information Extraction via Global Tagging.
- Authors: Shaojie He, Tianshu Wang, Yaojie Lu, Hongyu Lin, Xianpei Han, Yingfei Sun and Le Sun.
- Venue: CCL 2023
- Year: 2023
- Links:
  - [Paper](https://doi.org/10.1007/978-981-99-6207-5_9)

### Harvesting Event Schemas from Large Language Models.
- Authors: Jialong Tang, Hongyu Lin, Zhuoqun Li, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: CCKS 2023
- Year: 2023
- Links:
  - [Paper](https://link.springer.com/chapter/10.1007/978-981-99-7224-1_5)
  - [arXiv](https://doi.org/10.48550/arXiv.2305.07280)

## Pre-LLM

### Unified Structure Generation for Universal Information Extraction.
- Authors: Yaojie Lu, Qing Liu, Dai Dai, Xinyan Xiao, Hongyu Lin, Xianpei Han, Le Sun and Hua Wu.
- Venue: ACL 2022
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2022.acl-long.395)
  - [arXiv](https://doi.org/10.48550/arXiv.2203.12277)

### Procedural Text Understanding via Scene-Wise Evolution.
- Authors: Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie and Jin Xu.
- Venue: AAAI 2022
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v36i10.21388)
  - [arXiv](https://doi.org/10.48550/arXiv.2203.07600)

### End-to-end neural event coreference resolution.
- Authors: Yaojie Lu, Hongyu Lin, Jialong Tang, Xianpei Han and Le Sun.
- Venue: Artif. Intell.
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1016/j.artint.2021.103632)

### ISCAS at SemEval-2022 Task 10: An Extraction-Validation Pipeline for Structured Sentiment Analysis.
- Authors: Xinyu Lu, Mengjie Ren, Yaojie Lu and Hongyu Lin.
- Venue: SemEval 2022
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2022.semeval-1.182)

### Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction.
- Authors: Yaojie Lu, Hongyu Lin, Jin Xu, Xianpei Han, Jialong Tang, Annan Li, Le Sun, Meng Liao and Shaoyi Chen.
- Venue: ACL 2021
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2021.acl-long.217)
  - [arXiv](https://arxiv.org/abs/2106.09232)

### From Discourse to Narrative: Knowledge Projection for Event Relation Extraction.
- Authors: Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie and Jin Xu.
- Venue: ACL 2021
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2021.acl-long.60)
  - [arXiv](https://arxiv.org/abs/2106.08629)

### A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
- Authors: Hongyu Lin, Yaojie Lu, Jialong Tang, Xianpei Han, Le Sun, Zhicheng Wei and Nicholas Jing Yuan.
- Venue: EMNLP 2020
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2020.emnlp-main.592)

### Syntactic and Semantic-driven Learning for Open Information Extraction.
- Authors: Jialong Tang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Xinyan Xiao and Hua Wu.
- Venue: EMNLP Findings 2020
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2020.findings-emnlp.69)
  - [arXiv](https://arxiv.org/abs/2103.03448)

### ISCAS at SemEval-2020 Task 5: Pre-trained Transformers for Counterfactual Statement Modeling.
- Authors: Yaojie Lu, Annan Li, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: SemEval 2020
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/2020.semeval-1.85)
  - [arXiv](https://arxiv.org/abs/2009.08171)

### Cost-sensitive Regularization for Label Confusion-aware Event Detection.
- Authors: Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL 2019
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/p19-1521)
  - [arXiv](http://arxiv.org/abs/1906.06003)

### Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning.
- Authors: Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: ACL 2019
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/p19-1429)

### Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks.
- Authors: Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL 2019
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/p19-1511)
  - [arXiv](http://arxiv.org/abs/1906.03783)

### Gazetteer-Enhanced Attentive Neural Networks for Named Entity Recognition.
- Authors: Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun, Bin Dong and Shanshan Jiang.
- Venue: EMNLP 2019
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/D19-1646)

### Iterative Dual Domain Adaptation for Neural Machine Translation.
- Authors: Jiali Zeng, Yang Liu, Jinsong Su, Yubin Ge, Yaojie Lu, Yongjing Yin and Jiebo Luo.
- Venue: EMNLP 2019
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/D19-1078)
  - [arXiv](http://arxiv.org/abs/1912.07239)

### Adaptive Scaling for Sparse Detection in Information Extraction.
- Authors: Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL 2018
- Year: pre-llm
- Links:
  - [Paper](https://aclanthology.org/P18-1095/)
  - [arXiv](http://arxiv.org/abs/1805.00250)

### Nugget Proposal Networks for Chinese Event Detection.
- Authors: Hongyu Lin, Yaojie Lu, Xianpei Han and Le Sun.
- Venue: ACL 2018
- Year: pre-llm
- Links:
  - [Paper](https://aclanthology.org/P18-1145/)
  - [arXiv](http://arxiv.org/abs/1805.00249)

### Variational Recurrent Neural Machine Translation.
- Authors: Jinsong Su, Shan Wu, Deyi Xiong, Yaojie Lu, Xianpei Han and Biao Zhang.
- Venue: AAAI 2018
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1609/aaai.v32i1.11985)
  - [arXiv](http://arxiv.org/abs/1801.05119)

### Exploring Implicit Semantic Constraints for Bilingual Word Embeddings.
- Authors: Jinsong Su, Zhenqiao Song, Yaojie Lu, Mu Xu, Changxing Wu and Yidong Chen.
- Venue: Neural Process. Lett.
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1007/s11063-017-9762-8)

### Cross-lingual implicit discourse relation recognition with co-training.
- Authors: Yaojie Lu, Mu Xu, Changxing Wu, Deyi Xiong, Hongji Wang and Jinsong Su.
- Venue: Frontiers Inf. Technol. Electron. Eng.
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1631/FITEE.1601865)

### Linguistic Perturbation Based Data Augmentation for Event Detection.
- Authors: Yaojie Lu, Hongyu Lin, Xianpei Han and Le Sun.
- Venue: Journal of Chinese Information Processing
- Year: pre-llm
- Links:

### A Word Embedding Transfer Model for Robust Text Categorization.
- Authors: Yiming Zhang, Jing Wang, Weijian Deng and Yaojie Lu.
- Venue: CCL 2018
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1007/978-3-030-01716-3_26)

### ISCAS_Sogou at TAC-KBP 2017.
- Authors: Xianpei Han, Xiliang Song, Hongyu Lin, Qichen Zhu, Yaojie Lu, Le Sun, Jingfang Xu, Mingrong Liu, Ranxu Su, Sheng Shang, Chenwei Ran and Feifei Xue.
- Venue: TAC 2017
- Year: pre-llm
- Links:
  - [Paper](https://tac.nist.gov/publications/2017/participant.papers/TAC2017.ISCAS_Sogou.proceedings.pdf)

### Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition.
- Authors: Biao Zhang, Jinsong Su, Deyi Xiong, Yaojie Lu, Hong Duan and Junfeng Yao.
- Venue: EMNLP 2015
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.18653/v1/d15-1266)

### Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders.
- Authors: Qian Lin, Jing Yang, Xiangwen Zhang, Hongji Wang, Yaojie Lu and Jinsong Su.
- Venue: Neural Process. Lett.
- Year: pre-llm
- Links:
  - [Paper](https://doi.org/10.1007/s11063-020-10210-1)