FORGE: Form-Optimal Routing of Grounded Evidence for Frozen Language Agents

Xiao, Xi; Zhang, Yunbei; Liu, Chen; Zhao, Lin; Chen, Jialin; Zhao, Tianchen; Xu, Xiang; Krishnaswamy, Smita; Wang, Tianyang; Xu, Min

FORGE learns when a frozen language agent should read raw documents, summaries, tables, key-value facts, or no external evidence at all, reducing cost while improving grounded reasoning.

Xi Xiao¹, Yunbei Zhang², Chen Liu³, Lin Zhao⁴, Jialin Chen³, Tianchen Zhao⁶, Xiang Xu⁶, Smita Krishnaswamy³, Tianyang Wang¹,†, Min Xu⁵,†

¹University of Alabama at Birmingham · ²Tulane University · ³Yale University · ⁴Northeastern University · ⁵Carnegie Mellon University · ⁶Amazon AGI

†Co-corresponding authors.

Figure: Overview of FORGE for routing grounded evidence forms to frozen language agents. FORGE separates the question of what evidence to retrieve from the question of what form that evidence should take, routing each query to the most useful representation and reasoning budget.

Abstract

Frozen agents need evidence in the right form, not always more context.

Retrieval-augmented language agents often rely on fixed evidence formats, such as always passing raw documents or always passing summaries, even though different queries benefit from different forms and reasoning budgets.

FORGE is a compact factorized router that learns form-optimal evidence routing for frozen language agents. It enumerates candidate evidence forms and reasoning budgets, distills the best-performing routes into the router, and refines its decisions with reinforcement learning, improving accuracy while reducing token cost across benchmarks and backbones.

Three claims

What this paper changes

Evidence form is a control variable.

Raw passages, summaries, tables, key-value facts, and abstention are each optimal for different query conditions.

Routing can be compact.

A 269K-parameter factorized router makes per-query decisions without modifying the frozen language agent.

Cost and accuracy can improve together.

FORGE routes away from expensive default context when cheaper evidence forms are sufficient, while preserving or improving grounded performance.

Method

A staged router for evidence form and reasoning budget

Stage 0 enumeration

Candidate evidence forms and budgets are evaluated to reveal which route works best for each training query.
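
As an illustration of what enumeration could look like, the sketch below scores every (form, budget) pair for one query and records the best route. The form names follow the five forms listed on this page. The budget labels, the `evaluate` callback, the `cost_weight` trade-off, and the scoring rule (task reward minus weighted token cost) are assumptions made for illustration and are not confirmed by the source.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# Evidence forms named on this page; the budget labels are hypothetical.
FORMS = ("raw", "summary", "table", "key_value", "none")
BUDGETS = ("short", "long")

Route = Tuple[str, str]


def enumerate_routes(
    query: str,
    evaluate: Callable[[str, str, str], Tuple[float, int]],
    cost_weight: float = 0.001,
) -> Tuple[Route, Dict[Route, float]]:
    """Score every (form, budget) pair and return the best route plus all scores.

    `evaluate(query, form, budget)` is a hypothetical callback that runs the
    frozen agent and returns (task_reward, tokens_used).
    """
    scores: Dict[Route, float] = {}
    for form, budget in product(FORMS, BUDGETS):
        reward, tokens = evaluate(query, form, budget)
        # Assumed utility: task reward penalized by token cost.
        scores[(form, budget)] = reward - cost_weight * tokens
    best = max(scores, key=scores.get)
    return best, scores


def toy_evaluate(query: str, form: str, budget: str) -> Tuple[float, int]:
    """Stand-in for a real agent call; the numbers are made up for the demo."""
    reward = {"raw": 0.80, "summary": 0.78, "table": 0.60,
              "key_value": 0.70, "none": 0.30}[form]
    tokens = {"raw": 900, "summary": 200, "table": 150,
              "key_value": 60, "none": 0}[form]
    if budget == "long":
        reward += 0.02
        tokens += 300
    return reward, tokens


if __name__ == "__main__":
    best_route, all_scores = enumerate_routes("example query", toy_evaluate)
    print(best_route)
```

In this toy setup the cheapest form that still answers well wins, which is the kind of per-query label a router could later be trained to predict.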

Stage 1 distillation

A compact router learns the enumerated oracle behavior through KL-style policy distillation.
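
A minimal sketch of a KL-style distillation objective follows, assuming the enumerated route scores are turned into a soft target distribution with a temperature and that the loss is the forward KL from that target to the router's predicted distribution. The temperature, the KL direction, and all function names are assumptions; the source only states that distillation is "KL-style".

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def oracle_targets(scores: Sequence[float], temperature: float = 0.1) -> List[float]:
    """Turn enumerated route scores into a soft target distribution (assumed)."""
    return softmax([s / temperature for s in scores])


def kl_divergence(target: Sequence[float], predicted: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(target || predicted), summed over candidate routes."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


if __name__ == "__main__":
    target = oracle_targets([0.64, 0.58, 0.30])    # hypothetical route scores
    predicted = softmax([0.0, 0.0, 0.0])           # untrained router: uniform
    print(kl_divergence(target, predicted))        # loss to minimize
```

The loss is zero when the router reproduces the target distribution and grows as it puts probability on routes that enumeration scored poorly.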

Stage 2 refinement

GRPO-style optimization adjusts the router toward the downstream reward and token-cost tradeoff.
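
The core of a GRPO-style update is a group-relative advantage: several routes are sampled for the same query, and each one is scored against the group mean rather than against a learned value function. The sketch below shows only that normalization step, assuming a utility of task reward minus weighted token cost. The `cost_weight` value and the function name are hypothetical, and the policy-gradient update itself is omitted.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(
    rewards: Sequence[float],
    token_costs: Sequence[int],
    cost_weight: float = 0.001,
    eps: float = 1e-8,
) -> List[float]:
    """Normalize cost-penalized rewards within one group of sampled routes."""
    # Assumed utility: downstream reward penalized by token cost.
    utilities = [r - cost_weight * c for r, c in zip(rewards, token_costs)]
    mu = mean(utilities)
    sigma = pstdev(utilities)
    # Routes better than the group average get a positive advantage.
    return [(u - mu) / (sigma + eps) for u in utilities]


if __name__ == "__main__":
    # Three sampled routes for one query: two correct, one wrong.
    # The first correct route is much more expensive than the second.
    advantages = group_relative_advantages([1.0, 1.0, 0.0], [900, 100, 100])
    print(advantages)
```

With these made-up numbers, the cheap correct route receives the only positive advantage, so an update weighted by these values would shift probability toward it.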

Figure: A staged router for evidence form and reasoning budget. FORGE trains a factorized policy over evidence form and reasoning budget, then attaches it to frozen LLM agents without changing their weights.

Routing Decision

route(query) = evidence form + reasoning budget for the frozen agent
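
One way to read "factorized" is that the router has two independent heads, one over evidence forms and one over reasoning budgets, so the joint log-probability of a route is the sum of the two heads' log-probabilities. The sketch below makes that assumption explicit. The budget labels and the greedy (argmax) decision rule are hypothetical, and the logits are placeholders for whatever the trained router would output.

```python
import math
from typing import List, Sequence, Tuple

FORMS = ("raw", "summary", "table", "key_value", "none")
BUDGETS = ("short", "long")  # hypothetical budget labels


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def route(form_logits: Sequence[float],
          budget_logits: Sequence[float]) -> Tuple[str, str, float]:
    """Greedy factorized decision: returns (form, budget, joint log-probability)."""
    if len(form_logits) != len(FORMS) or len(budget_logits) != len(BUDGETS):
        raise ValueError("logit vectors must match FORMS and BUDGETS")
    form_lp = log_softmax(form_logits)
    budget_lp = log_softmax(budget_logits)
    f = max(range(len(FORMS)), key=form_lp.__getitem__)
    b = max(range(len(BUDGETS)), key=budget_lp.__getitem__)
    # Under the independence assumption, the joint log-prob is a sum.
    return FORMS[f], BUDGETS[b], form_lp[f] + budget_lp[b]


if __name__ == "__main__":
    print(route([0.1, 0.2, -1.0, 2.0, 0.0], [1.5, -0.5]))
```

Factorizing this way means the output layer needs one logit per form plus one per budget instead of one per (form, budget) pair, which is one plausible reason a router can stay small.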

Results

Lower token cost with stronger grounded reasoning

269K: parameters in the factorized router

-45%: token-cost reduction with Qwen3-8B while improving macro-F1

+3.5: Stage 2 macro-F1 gain over the distilled Stage 1 router

-47%: cost reduction on Claude-4-Sonnet with thinking-style hosts

Figure: Policy distribution analysis for FORGE. FORGE learns nontrivial route distributions instead of collapsing to a single evidence form across tasks.

Figure: Evidence support forms used by FORGE. The router chooses among multiple support forms, allowing frozen agents to receive evidence in the representation best matched to the query.

Why it matters

Grounded agents should spend context only where it helps.

Long context is not automatically better grounding. It can add cost, distract the model, and obscure the compact evidence needed for a specific question.

FORGE treats evidence representation as a learned decision, making frozen agents more practical for settings where accuracy, latency, and token cost all matter.

Citation

Cite FORGE

If our work is helpful to your research, please consider citing FORGE. Thank you.

@misc{xiao2026forge,
  title        = {FORGE: Form-Optimal Routing of Grounded Evidence for Frozen Language Agents},
  author       = {Xi Xiao and Yunbei Zhang and Chen Liu and Lin Zhao and Jialin Chen and Tianchen Zhao and Xiang Xu and Smita Krishnaswamy and Tianyang Wang and Min Xu},
  year         = {2026},
  note         = {Project page: https://xixiaouab.github.io/FORGE/}
}