e-HAIL Event

Autonomous, Data-driven Discovery with AI and Its Potential in Biomedical Research and Medicine

Bodhisattwa Majumder, Senior Research Scientist, Allen Institute for AI
Rubén Lozano Aguilera, Lead Technical Product Manager, Allen Institute for AI
WHERE:
Remote/Virtual

Zoom information will be sent to e-HAIL members.

The promise of autonomous scientific discovery (ASD) hinges not only on answering questions, but also on knowing which questions to ask. Most recent work in ASD explores the use of LLMs in goal-driven settings, relying on human-specified research questions to guide hypothesis generation. However, scientific discovery may be accelerated further by allowing the AI system to drive exploration by its own criteria.

Hosted by Geoffrey Siwo, this talk aims to address ongoing speculation about autonomous AI-driven discovery using our new system, AutoDiscovery [1], and to raise broader questions about AI-scientist interaction for long-horizon research tasks. In a data-driven discovery [2] setup, AutoDiscovery uses Bayesian surprise to surface findings that challenge known wisdom and Monte Carlo Tree Search to navigate the hypothesis space efficiently.

AutoDiscovery is open source, built by the Allen Institute for AI (Ai2), and was launched last month. Since then, researchers have used it to generate over 20,000 hypotheses across oncology, climate science, marine ecology, entomology, cybersecurity, music cognition, social sciences, and other domains. In this session, we’ll cover three things: how AutoDiscovery works, what we’ve learned from early testing with researchers across these fields, and what this means for AI-driven research workflows. We encourage you to try it (publicly available at autodiscovery.allen.ai with free credits), explore it with your own research datasets, and bring your questions to enrich the discussion.

[1] AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise, NeurIPS 2025, Agarwal et al.

[2] Data-driven Discovery with Large Generative Models, ICML 2024, Majumder et al.

About the Speakers:

Bodhisattwa Majumder is a Senior Research Scientist at the Allen Institute for AI (Ai2). He leads research on automating data-driven discovery for Asta, Ai2’s agentic ecosystem for scientific research. Earlier, he received his PhD from UC San Diego. His research has been recognized with an ICML Outstanding Paper Award, the UC San Diego CSE Dissertation Award, the Adobe Research Fellowship, and the Qualcomm Innovation Fellowship. Majumder also co-authored a best-selling O’Reilly book, Practical Natural Language Processing, which has been published in 4 languages and adopted internationally by 10+ universities and 20+ organizations.

Rubén Lozano Aguilera is the Lead Technical Product Manager at the Allen Institute for AI (Ai2) for Asta, Ai2’s agentic ecosystem for scientific research. Before Ai2, he spent 15+ years in product management and tech strategy at Google, Amazon, Microsoft, and Samsung. He has trained teams on ethical AI at Amazon, Google, and MIT; mentored startups, nonprofits, and LGBTQ+ organizations; and chairs CMD-IT’s Industry Board. Rubén holds a B.S. in EE from Tec de Monterrey and an MBA from MIT Sloan, and is currently pursuing a master’s in AI Ethics and Society at the University of Cambridge.

Organizer

e-HAIL