UCF  ·  SafeRR AI Lab
Amrit Singh Bedi

Assistant Professor, Department of Computer Science, University of Central Florida
Adjunct Assistant Professor, Department of Computer Science, University of Maryland (2026–2029)

I am an Assistant Professor in the Department of Computer Science at the University of Central Florida, where I lead the SafeRR AI Lab. The modern generative AI development cycle has three stages: pre-training, post-training, and deployment. Our lab focuses on the latter two, targeting cutting-edge problems across LLMs, VLMs, vision-language-action models, diffusion LLMs, and agentic AI.

At the post-training stage, we develop novel fine-tuning and reinforcement learning methods that align AI systems with human values and safety constraints. At the deployment stage, we design inference-time algorithms that make models safer, more robust, and more reliable without retraining. Our key research themes include:

  • AI Alignment — RLHF, preference learning, and reward modeling
  • Safety & Trustworthiness — inference-time safeguards, jailbreak defenses, and robustness in generative AI
  • Reinforcement Learning — multi-agent RL, hierarchical RL, and embodied intelligence
  • Optimization & Theory — bilevel, non-convex, and federated methods for modern ML

Prior to UCF, I held positions at the University of Maryland, College Park, and the U.S. Army Research Laboratory. I received my Ph.D. in Electrical Engineering from IIT Kanpur.

Selected News

2026
Paper

Four papers accepted to ICML 2026

Details coming soon!

2026
Paper

Two papers accepted to TMLR 2026

Details coming soon!

2026
Paper

Paper accepted to ACM FAccT 2026

Details coming soon!

2026
Paper

Two papers accepted to ICLR 2026

Two papers from our group accepted to ICLR 2026 in Rio de Janeiro, Brazil.

2026
Paper

Paper accepted to EACL 2026

LIAR: Leveraging Inference Time Alignment (Best-of-N) to Jailbreak LLMs in Seconds

2026
Paper

Paper on safety of VLMs accepted to AAAI 2026

2025
Paper

Paper accepted to Nature Biotechnology

Our work on generative AI and biosecurity.

2025
Paper

Three papers accepted to NeurIPS 2025

Work on reasoning models, theoretical RL, and LLM alignment.

2025
Workshop

Organizing the first NeurIPS Workshop on Biosecurity Safeguards for Generative AI

San Diego, California, USA.

2025
Service

Area Chair — ACL 2025, NeurIPS 2025, TMLR

2025
Talk

Invited talk at Asilomar 2025 on satisficing alignment

2025
Paper

Paper accepted to CVPR 2025

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

2024
Paper

Five papers accepted to ICML 2024, two to NeurIPS 2024

Including an ICML Spotlight (top 3.5%). Invited talks at IBM Research, Google DeepMind NYC, Amazon, and UT Austin.

Selected Publications

IEEE RAL 2025

Learning Multi-Robot Coordination through Locality-Based Factorized Multi-Agent Actor-Critic Algorithm

C. L. Shek, A. S. Bedi, A. Basak, E. Novoseller, N. Waytowich, P. Narayanan, D. Manocha, and P. Tokekar

NeurIPS 2025

Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models

S. S. Ghosal, S. Chakraborty, A. Reddy, Y. Lu, M. Wang, D. Manocha, F. Huang, M. Ghavamzadeh, and A. S. Bedi

ICML 2025

Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time

M. F. El Hajj Chehade, S. S. Ghosal, S. Chakraborty, A. Reddy, D. Manocha, H. Zhu, and A. S. Bedi

CVPR 2025

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

S. S. Ghosal, S. Chakraborty, V. Singh, T. Guan, M. Wang, A. Beirami, F. Huang, A. Velasquez, D. Manocha, and A. S. Bedi

NeurIPS 2024

Transfer Q*: Principled Decoding for LLM Alignment

S. Chakraborty, S. S. Ghosal, M. Yin, D. Manocha, M. Wang, A. S. Bedi, and F. Huang

ICML 2024

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences

S. Chakraborty, J. Qiu, H. Yuan, A. Koppel, F. Huang, D. Manocha, A. S. Bedi, and M. Wang

ICML 2024 Spotlight

Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization

M. Gaur, A. S. Bedi, D. Wang, and V. Aggarwal

ICLR 2024

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning

S. Chakraborty, A. S. Bedi, A. Koppel, D. Manocha, H. Wang, M. Wang, and F. Huang

Teaching

Courses taught at the University of Central Florida across graduate and undergraduate levels.