Hi, I'm Nick 👋
I train AI systems to know when they don't know
Building AI systems that know their limits — teaching models when to abstain instead of guessing. Ph.D. candidate at Ohio State researching uncertainty-aware vision-language systems.
- Joined DCS Corp (AFRL) as Technical Analyst II leading LLM reject-option training and evaluation.
- Delivered AFRL LLM reject-option training, improving utility on OOD tasks by 8×.
- Released MVA 2025 journal extension on reject-option calibration.
Recent work
A few projects I'm proud of — each with paper, code, and data.
- Instruction-tuned LLMs equipped with abstention heads deliver 8× utility on out-of-distribution tasks. (8× OOD utility)
- Contrastive evaluation shows SOTA multimodal MT models leverage pixels beyond a regularization effect. (+7% image-grounding score)
About
I’m finishing my PhD at Ohio State, where I focus on selective prediction and uncertainty quantification. My work addresses a fundamental question: how do we build ML systems that know when they don’t know something — and admit it gracefully?
Most ML systems are trained to always give an answer, even when they shouldn’t. They’ll confidently predict on out-of-distribution data, hallucinate facts, or make decisions they’re not equipped to handle. This gap between confidence and competence is particularly dangerous in production settings where bad predictions have real consequences. My research directly addresses a core challenge in AI alignment: building systems that are honest about their uncertainty rather than confidently hallucinating.
My research centers on building systems that can recognize their own uncertainty and abstain from predictions when appropriate. The core idea is simple: a model that says “I don’t know” and routes to a human is more reliable than one that guesses wrong with high confidence. This matters especially in domains like defense, medical imaging, and content moderation — anywhere the cost of a mistake outweighs the cost of asking for help.
The challenge isn’t just technical. It’s about changing how we think about ML systems — from tools that must always have an answer to tools that know their limits. That means developing better calibration methods, designing human-in-the-loop workflows that actually work in practice, and building evaluation frameworks that measure not just accuracy but reliability.
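The abstain-or-answer loop described above can be sketched in a few lines. This is a minimal illustration with toy numbers, not code from any of the papers below: a classifier answers only when its top confidence clears a threshold, and we report coverage (fraction answered) alongside selective accuracy (accuracy on the answered subset).

```python
# Minimal selective-prediction sketch. All names, thresholds, and
# probabilities here are illustrative, not from the published work.

def selective_predict(probs, threshold=0.8):
    """Return the argmax class, or None to abstain."""
    conf = max(probs)
    if conf < threshold:
        return None  # route to a human instead of guessing
    return probs.index(conf)

def evaluate(prob_list, labels, threshold=0.8):
    """Coverage = fraction answered; selective accuracy = accuracy on answered."""
    answered = correct = 0
    for probs, label in zip(prob_list, labels):
        pred = selective_predict(probs, threshold)
        if pred is None:
            continue  # abstentions don't count against selective accuracy
        answered += 1
        correct += (pred == label)
    coverage = answered / len(labels)
    sel_acc = correct / answered if answered else 0.0
    return coverage, sel_acc

# Toy example: the low-confidence third case is abstained on.
probs = [[0.95, 0.05], [0.10, 0.90], [0.55, 0.45]]
labels = [0, 1, 1]
coverage, sel_acc = evaluate(probs, labels, threshold=0.8)
print(coverage, sel_acc)  # 2/3 coverage, perfect selective accuracy
```

Raising the threshold trades coverage for selective accuracy; the research question is how to set that trade-off in a principled, per-class way rather than with one global knob.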
Day to day, I write Python and PyTorch, run experiments on HPC clusters, and build the plumbing that makes research usable — data pipelines, benchmarks, reproducible code.
Looking for research scientist or ML engineering roles in 2026. U.S. citizen, comfortable with CUI/DoD work.
Experience
Research & industry
Technical Analyst II — DCS Corp (sponsored by Air Force Research Laboratory)
Dayton, OH
- Train and evaluate instruction-tuned LLMs with reject-option heads for analyst workflows, improving out-of-distribution utility by **8×** over competing approaches.
- Build evaluation harnesses and calibration dashboards that connect LLM policies to existing command-and-control tooling.
Graduate Research Associate — Computer Vision Lab
Ohio State University · Columbus, OH
- Lead the lab’s uncertainty-aware multimodal modeling portfolio under Prof. Jim Davis.
- Designed imagery-aware contrastive metrics for **multimodal machine translation** (WMT 2024), showing that state-of-the-art models depend on visual evidence rather than treating images as regularizers.
- Developed binomial per-class **reject-option training** for ImageNet, remote sensing, and long-tailed datasets (ISVC 2022 Best Paper; MVA 2025 extension), improving selective accuracy of vision transformers by **+0.4%** and coverage by **+1.3%**.
- Integrated these methods into open-source toolkits and analyst-facing evaluation pipelines.
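To give a flavor of the per-class idea (this is a hedged sketch only; the published ISVC 2022 / MVA 2025 algorithm differs in its details), one can search, for each predicted class, for the smallest confidence threshold whose accepted validation subset still clears a target accuracy under a binomial lower confidence bound. The Wilson bound and all parameter names below are my illustration, not the paper's notation.

```python
# Illustrative per-class reject-threshold search under a binomial
# lower confidence bound. Not the published algorithm; a sketch of
# the general idea only.
import math

def binomial_lower_bound(correct, n, z=1.645):
    """Wilson lower bound on a binomial proportion (approx. one-sided 95%)."""
    if n == 0:
        return 0.0
    p = correct / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

def per_class_thresholds(confs, preds, labels, num_classes, target=0.9):
    """Smallest per-class threshold whose accepted subset meets `target`."""
    thresholds = [1.0] * num_classes  # default: reject everything
    for c in range(num_classes):
        # Validation points predicted as class c, sorted by confidence.
        pts = sorted((cf, preds[i] == labels[i])
                     for i, cf in enumerate(confs) if preds[i] == c)
        for j, (t, _) in enumerate(pts):
            accepted = pts[j:]  # keep examples with confidence >= t
            correct = sum(ok for _, ok in accepted)
            if binomial_lower_bound(correct, len(accepted)) >= target:
                thresholds[c] = t  # lowest threshold that still certifies target
                break
    return thresholds
```

Per-class thresholds matter on long-tailed data: rare classes with few, noisy validation points get conservative (higher) thresholds from the binomial bound, while well-supported head classes can accept at lower confidence.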
Graduate Teaching Associate — Machine Learning & NLP
Ohio State University · Columbus, OH
- Support **80+ students** per offering in machine learning, computer vision, and natural language processing courses.
- Run recitations, office hours, and targeted study plans, and maintain auto-graded labs (including introductory LLM labs) with an emphasis on calibration, safety, and responsible deployment.
Graduate Research Intern — Air Force Research Laboratory (U.S. CUI)
Dayton, OH
- Summer 2024: Adapted and trained **JEPA and MAE transformers** in a distributed Slurm/Singularity setup for multimodal EO/SAR representation learning, outperforming supervised baselines in low-data regimes.
- Summer 2023: Developed **Reject Option Beam Search** to improve machine translation quality at large beam widths.
- Summer 2022: Pioneered an end-to-end training algorithm for Naturally Constrained Reject Option Classification.
Undergraduate Research Intern — Air Force Research Laboratory (U.S. CUI)
Dayton, OH
- Summer 2021: Devised an **ensemble distillation** method to improve model performance on ambiguous instances.
- Summer 2020: Constructed a semi-automated system for **temporal satellite imagery collection** (ICCV 2021 workshop), later released as the Construction-Site-Satellite-Imagery dataset.
Undergraduate Research Associate — Computer Vision Lab
Ohio State University · Columbus, OH
- Engineered semi-automatic labeling workflows for remote sensing change detection, creating Python tooling that bootstrapped datasets for uncertainty-aware modeling.
Summer Research Intern — Sii Canada / Concordia University
Montreal, QC
- Built anomaly detection dashboards that turned large-scale behavioral telemetry into prioritized experiments; an early introduction to practical uncertainty estimation.
Undergraduate Teaching Associate — Discrete Structures & Algorithms
Ohio State University · Columbus, OH
- Mentored discrete structures and algorithms cohorts through recitations, office hours, and targeted study plans emphasizing analytical rigor.
Skills
ML & Modeling
Tools & Infrastructure
Domains
Service
Publications
All with runnable code
Machine Vision and Applications · 2025
Naturally Constrained Reject Option Classification
Invited journal extension of ISVC 2022 Best Paper. Per-class binomial thresholds scale to ImageNet, remote sensing, and long-tailed splits with stronger selective accuracy.
AFRL Technical Report · 2025
Selective LLM Training with Reject Options
Instruction-tuned LLMs equipped with abstention heads deliver 8× utility on out-of-distribution tasks.
WMT 2024 · 2024
Assessing the Role of Imagery in Multimodal Machine Translation
Contrastive evaluation shows SOTA multimodal MT models leverage pixels beyond a regularization effect.
ISVC 2022 · 2022
Learning When to Say "I Don't Know"
Binomial modeling of per-class reject thresholds that boost selective accuracy while keeping abstentions calibrated. Extended in MVA 2025 journal version.
ICCV 2021 Workshop on LUAI · 2021
A Framework for Semi-automatic Collection of Temporal Satellite Imagery for Analysis of Dynamic Regions
Semi-automated scraping plus OpenStreetMap cues to assemble temporal satellite datasets that feed downstream change-detection.
Open Source
Tools & datasets
- Evaluation harness for multimodal MT, selective LLM routing, and visual-text calibration experiments.
- Semi-automatic satellite data ingestion plus labeling UI for monitoring changing regions.
- PyTorch toolkit for per-class reject-option training with binomial threshold search, dashboards, and CLI.