Venkatesh Saligrama

Professor of Electrical and Computer Engineering, Boston University · Amazon Scholar

I study the science of AI evaluation and learning under constraints: how to measure AI systems, how to determine when those measurements are reliable, and how intelligent systems can learn, reason, and act with limited information, resources, and feedback.

Department of Electrical and Computer Engineering
Department of Computer Science (by courtesy)
8 St. Mary’s Street, Room 438, Boston University, Boston, MA 02215
srv@bu (add dot edu)

Research Question

How can intelligent systems learn, reason, and be reliably evaluated when information, resources, and ground truth are limited?

My lab develops mathematical frameworks, algorithms, and evaluation protocols for this setting. A recurring theme is that AI systems should not only make predictions; they should also decide what information to acquire, what computation to perform, when to communicate, when to verify, and when a conclusion should be trusted.

Current Focus

AI Evaluation, Auditing, and Ground Truth

As AI systems produce long-form, multimodal, and evidence-dependent outputs, evaluation itself becomes a scientific problem. We study how to build evaluations that are reliable, diagnostic, auditable, and capable of evolving as evidence accumulates.

Selected work:
DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality
Hearing Between the Lines: Unlocking the Reasoning Power of LLMs for Speech Evaluation

Counterfactual Audits for AI Judges

Aggregate agreement scores often hide why a judge failed. We develop counterfactual evaluation protocols that separate perception, reasoning, and preference-mapping failures, with current applications to speech, audio-language systems, and AI-as-judge pipelines.

Current directions: diagnostic audits for AI judges, rubric-conditioned evaluation, and multimodal judge reliability.

Transformers and Algorithm Discovery

We investigate when trained neural architectures learn algorithms rather than only fit predictors. Recent work studies transformers and in-context learning by extracting explicit computational procedures from trained models.

Selected work:
Linear Transformers Implicitly Discover Unified Numerical Algorithms

Resource-Aware and Adaptive AI Systems

AI systems operate under compute, communication, latency, memory, and supervision constraints. We develop methods that adaptively allocate scarce resources while preserving reliability.

Selected work:
Federated Learning Based on Dynamic Regularization
Adaptive Neural Networks for Efficient Inference

Selected Earlier Contributions

Earlier work in the lab studied learning with limited supervision, representation geometry, bias in learned representations, anomaly detection, and graph-structured data. These projects remain part of the lab’s foundation, but the current emphasis is on AI evaluation, algorithmic understanding of neural systems, and resource-aware intelligence.

Themes

Ground truth as a process. Complex AI evaluation requires evidence, adjudication, auditing, and revision.
Evaluation as instrumentation. Scores should reveal not only whether systems fail, but where and why they fail.
Constraint-aware intelligence. AI systems should allocate computation, communication, supervision, and verification under uncertainty.
Models as algorithmic systems. Trained architectures can sometimes be understood by extracting the procedures they implement.