Registration: Please refer to the NAACL 2025 registration website.
Just as coding assistants have dramatically increased productivity for coding tasks over the last two years, researchers in the NLP community have begun to explore methods and opportunities for creating scientific assistants that can help with the process of scientific discovery and increase the pace at which novel discoveries are made.
Historically, major results in AI and scientific discovery have been restricted to problem-specific methods, such as DeepMind's AlphaFold and RosettaFold, systems designed specifically for protein folding, or multistage discovery pipelines built to identify novel materials. Over the last year, language models have increasingly been used to create problem-general scientific discovery assistants that are not restricted to narrow problem domains or formulations. Such applications hold promise for assisting researchers across broad domains, and for scientific reasoning more generally. Beyond providing assistance, a growing body of work has begun to focus on the prospect of creating largely autonomous scientific discovery agents that can make novel discoveries with minimal human intervention.
We have also observed rising interest in evaluating systems that help with or perform scientific discovery. Difficulties with evaluation persist: because novel scientific discoveries, by definition, have not yet been made, it is hard to formulate a benchmark that provides insight into valuable machine-generated contributions to the discovery process. The community has begun to address this with benchmarks focused on facets of the discovery process, such as data-driven discovery, experiment replication, reviewing, and idea generation, as well as proxy tasks such as end-to-end discovery in virtual environments, all of which take steps toward measuring progress on generating novel discoveries.
These recent developments highlight the possibility of rapidly accelerating the pace of scientific discovery in the near term. Given the influx of researchers into this quickly expanding subfield, this workshop aims to bring together a diverse set of perspectives, disseminate the latest results, standardize evaluation, foster collaboration between groups, and enable discussion of aspirational goals for 2025 and beyond.
Speakers
Call for Papers
We welcome submissions on all topics related to AI and Scientific Discovery including but not limited to:
- Literature-based Discovery
- Agent-centered Approaches
- Automated Experiment Execution
- Automated Replication
- Data-driven Discovery
- Discovery in Virtual Environments
- Discovery with Humans in the Loop
- Assistants for Scientific Writing
Organizers

Chief Scientific Officer (Microsoft)

Director of Semantic Scholar (Ai2)

Asst. Professor (HUJI)
Research Scientist (Ai2)

Lead Research Scientist (Ai2)

Lead Research Scientist (Ai2)

Research Scientist (Ai2)

Stony Brook University

Assoc. Prof. (University of Arizona)
Submission Guidelines
We welcome three types of papers: archival workshop papers, non-archival papers, and non-archival cross-submissions. Only archival regular workshop papers will be included in the workshop proceedings. Regular workshop submissions (both archival and non-archival) should be in PDF format and made through the OpenReview site set up for this workshop [link]. In line with the ACL main conference policy, camera-ready versions of regular workshop papers will be given one additional page of content. Non-archival cross-submissions should be made through the form [link].
- Archival regular workshop papers: Authors should submit a paper of up to 8 pages (both short and long papers are welcome), with unlimited pages for references, following the ACL author guidelines. The reported research should be substantially original. All submissions will be reviewed in a single track, regardless of length. Accepted papers will be presented as posters by default, and best papers may be given the opportunity for a brief talk introducing their work. Reviewing will be double-blind; no author information should be included in the papers, and self-references that identify the authors should be avoided or anonymized. Accepted papers will appear in the workshop proceedings. Preference for oral presentation slots in the workshop will be given to archival papers.
- Non-archival regular workshop papers: This is the same as the option above, but these papers will not appear in the proceedings and will typically only receive poster presentation slots. Non-archival submissions in this category will still undergo the review process. This is appropriate for nearly finished work that is intended for submission to another venue at a later date.
- Non-archival cross-submissions: We also solicit cross-submissions, i.e., papers on relevant topics that have already appeared in other venues (e.g., workshop or conference papers at NLP, ML, or cognitive science venues, among others). Accepted papers will be presented at the workshop, with an indication of original venue, but will not be included in the workshop proceedings. Cross-submissions are ideal for related work which would benefit from exposure to the audience working on Scientific Discovery. Papers in this category do not need to follow the ACL format, and the submission length is determined by the original venue. The paper selection will be solely determined by the organizing committee in a non-blind fashion. These papers will typically receive poster presentation slots.
In addition, we welcome papers on relevant topics that are under review or intended for submission to other venues (including the ACL 2025 main conference). These papers must follow the regular workshop paper format and will not be included in the workshop proceedings. Papers in this category will be reviewed by workshop reviewers.
Note to authors: For archival and non-archival regular workshop submissions, submit your paper through OpenReview (link) and select the appropriate "Track" according to the guidelines above. For cross-submissions, please fill out this form ([link]) and do NOT submit through OpenReview.
For questions about the submission guidelines, please contact workshop organizers via aisd-organizers@googlegroups.com.
Important Dates
Paper Submission Deadline | Feb 6, 2025 (All deadlines are 11:59 PM AoE time.) |
Decision Notifications | Feb 27, 2025 |
Camera Ready Paper Deadline | Mar 10, 2025 |
Workshop Date | May 3, 2025 |
Schedule (Tentative)
08:55 AM | Opening Remarks |
---|---|
09:00 AM |
Keynote Talk 1: Marinka Zitnik TBD [Slides] [Abstract] [Speaker Bio]
Abstract: TBD
Bio: TBD
|
09:45 AM |
Keynote Talk 2: Kexin Huang AI Agents for Accelerating Scientific Discovery: From Hypothesis Generation to Experimental Design [Slides] [Abstract] [Speaker Bio]
Abstract: TBD.
Bio: TBD.
|
10:30 AM | Break 1 |
11:00 AM |
Keynote Talk 3: Heng Ji AI Plays Medicinal Chemist and Material Scientist [Abstract] [Speaker Bio]
Abstract: TBD
Bio: TBD
|
11:45 AM |
Oral Presentation 1: LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study [Abstract]
Abstract: TBD
|
12:00 PM |
Oral Presentation 2: Scideator: Iterative Human-LLM Scientific Idea Generation and Novelty Evaluation Grounded in Research-Paper Facet Recombination [Abstract]
Abstract: TBD
|
12:15 PM |
Oral Presentation 3: What Can Large Language Models Do for Sustainable Food? [Abstract]
Abstract: TBD
|
12:30 PM | Lunch Break |
2:00 PM |
Keynote Talk 4: Peter Clark Towards (Semi-)Autonomous Scientific Discovery [Slides] [Abstract] [Speaker Bio]
Abstract: TBD
Bio: TBD
|
2:45 PM |
Oral Presentation 4: Language Modeling by Language Models with Genesys [Abstract]
Abstract: TBD
|
3:00 PM |
Oral Presentation 5: Towards AI-assisted Academic Writing [Abstract]
Abstract: TBD
|
3:15 PM |
Oral Presentation 6: Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias [Abstract]
Abstract: TBD
|
3:30 PM | Break 2 |
4:00 PM | In-Person Poster Session |
Accepted Papers (Archival)
- Variable Extraction for Model Recovery in Scientific Literature [Archival]
- How Well Do Large Language Models Extract Keywords? A Systematic Evaluation on Scientific Corpora [Archival]
- A Human-LLM Note-Taking System with Case-Based Reasoning as Framework for Scientific Discovery [Archival]
- Towards AI-assisted Academic Writing [Archival]
- Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications [Archival]
- LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study [Archival]
- FlavorDiffusion: Predicting Food Pairings and Chemical Interactions Using Diffusion Models [Archival]
Accepted Papers (Non-Archival Previously Published Papers)
Note: These papers have been previously accepted at other venues and are highly relevant to AI & Scientific Discovery.
- ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery [Non-Archival Published]
- Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias (NAACL 2025) [Non-Archival Published]
- MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses (ICLR 2025) [Non-Archival Published]
- Efficient Evolutionary Search Over Chemical Space with Large Language Models (ICLR 2025) [Non-Archival Published]
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models (ICLR 2025) [Non-Archival Published]
- DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents (NeurIPS 2024 Spotlight) [Non-Archival Published]
- Hypothesis Generation with Large Language Models (EMNLP 2024) [Non-Archival Published]
Accepted Papers (Non-Archival)
Note: These papers are non-archival and are not included in the official proceedings.
- VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities [Non-Archival]
- Automatic Evaluation Metrics for Artificially Generated Scientific Research [Non-Archival]
- WithdrarXiv: A Large-Scale Dataset for Retraction Study [Non-Archival]
- FARM: Functional Group-Aware Representations for Small Molecules [Non-Archival]
- Scideator: Iterative Human-LLM Scientific Idea Generation and Novelty Evaluation Grounded in Research-Paper Facet Recombination [Non-Archival]
- What Can Large Language Models Do for Sustainable Food? [Non-Archival]
- Learning to Generate Research Idea with Dynamic Control [Non-Archival]
- Map2Text: New Content Generation from Low-Dimensional Visualizations [Non-Archival]
- Data Driven Design as a Challenge Task for Few- and Zero-Shot Information Extraction [Non-Archival]
- Language Modeling by Language Models with Genesys [Non-Archival]