AFRL Technical Report · 2025
Selective LLM Training with Reject Options
We extend our selective prediction research to large language models by training an abstention head alongside a base instruction-tuned transformer. The approach injects curriculum-learned outlier prompts and applies policy gradients so that routing decisions stay calibrated, improving downstream analyst-facing utility by 8× on held-out OOD evaluations. The report shares the evaluation harness and discusses how to wire the abstention policy into production command-and-control systems.
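As a minimal sketch of the reject-option idea (all names, weights, and the 0.5 threshold below are illustrative assumptions, not the report's implementation): an abstention head maps a pooled hidden state to an abstain probability, and the router lets the model answer only when that probability is low.

```python
import numpy as np

rng = np.random.default_rng(0)

def abstain_score(hidden_state: np.ndarray, w: np.ndarray, b: float) -> float:
    """Sigmoid of a linear abstention head over a pooled hidden state."""
    z = float(hidden_state @ w + b)
    return 1.0 / (1.0 + np.exp(-z))

def route(hidden_state: np.ndarray, w: np.ndarray, b: float,
          threshold: float = 0.5) -> tuple[str, float]:
    """Abstain when the head assigns high abstention probability."""
    p_abstain = abstain_score(hidden_state, w, b)
    decision = "abstain" if p_abstain >= threshold else "answer"
    return decision, p_abstain

# Toy pooled hidden state and head weights (hypothetical values).
h = rng.normal(size=16)
w = rng.normal(size=16) * 0.1
decision, p = route(h, w, b=0.0)
```

In the full system this head would be trained jointly with the base model on curriculum-mixed outlier prompts; the sketch only shows the inference-time routing rule.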
Highlights
- Reject-option heads act as a triage layer for routing prompts to humans or fallback chains.
- Curriculum mixes synthetic outliers with frontier eval suites to maintain calibrated abstentions.
Artifacts & reproduction
Evaluation harness for multimodal MT, selective LLM routing, and visual-text calibration experiments.
- Run imagery-aware contrastive probes against WMT-style checkpoints.
- Benchmark LLM reject-option heads on held-out OOD prompts.
- Log structured reports (HTML/Markdown) for rapid model comparisons.
- Used to benchmark LLM reject-option heads and multimodal MT models for AFRL and academic deployments.
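The harness interfaces themselves are not reproduced here; as an illustrative sketch of benchmarking a reject-option head on held-out OOD prompts, one can rank predictions by confidence and trace a risk–coverage curve (the toy confidences and labels below are invented for demonstration).

```python
import numpy as np

def risk_coverage(confidences: np.ndarray, correct) -> tuple[np.ndarray, np.ndarray]:
    """Sweep abstention thresholds implicitly by ranking predictions from most
    to least confident; return (coverage, selective_risk) at each cut point."""
    order = np.argsort(-np.asarray(confidences))
    correct = np.asarray(correct, dtype=float)[order]
    n = len(correct)
    kept = np.arange(1, n + 1)           # number of prompts answered at each cut
    coverage = kept / n                  # fraction of prompts not abstained on
    risk = 1.0 - np.cumsum(correct) / kept  # error rate among answered prompts
    return coverage, risk

# Toy OOD evaluation: four prompts with model confidences and correctness.
conf = np.array([0.9, 0.8, 0.2, 0.1])
labels = [1, 1, 0, 1]
coverage, risk = risk_coverage(conf, labels)
```

A well-behaved reject-option head should yield a risk curve that stays low at partial coverage and rises only as low-confidence (often OOD) prompts are admitted.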