The first workshop on Building Innovative Research Systems for Digital Libraries (BIRDS) takes place as a full-day workshop at TPDL 2025. BIRDS addresses practitioners working in digital libraries and GLAMs as well as researchers from computational domains such as data science, information retrieval, natural language processing, and data modelling. Our interdisciplinary workshop focuses on connecting members of both worlds. One of today's biggest challenges is the ever-growing flood of information. Large language models like ChatGPT seem to perform well at answering questions on the web. So, should we simply build upon that idea and use chatbots in digital libraries? Or do we need to design and develop specialized and effective access paths? Answering these questions requires connecting different communities: practitioners from real digital libraries and researchers in computer science. In brief, our workshop's goal is to support researchers and practitioners in building the next generation of innovative and effective digital library systems.


Goals of the Workshop

With our highly interactive, in-person, and interdisciplinary BIRDS workshop, we want to strengthen collaboration in the community, connect domain experts with researchers from different contexts and at different academic career stages, and promote the construction of innovative research systems for digital libraries.

  • Our goals are to:
    • Provide a forum for initiating collaborations between domain experts and researchers in digital libraries
    • Identify common challenges faced by digital libraries and explore how computational approaches could help address them
    • Provide a forum at TPDL to explicitly discuss rough ideas, problems and ongoing work

Tentative Schedule

Time Agenda
9:00 - 9:15 Welcome
9:15 - 10:00 Keynote by Jian Wu
10:00 - 11:00 Scientific speed-dating
11:00 - 11:30 Coffee break
11:30 - 12:30 General TPDL/ADBIS Keynote: Felix Naumann
12:30 - 14:00 Lunch break
14:00 - 15:00 Panel discussion
15:00 - 16:00 Breakout group discussions
16:00 - 16:30 Coffee break
16:30 - 17:30 Breakout group discussions, summary of discussions and closing

Keynote

Jian Wu - Beyond Retrieval: A Vision of Digital Libraries in the Large Language Model Era


Dr. Jian Wu is an associate professor of Computer Science at Old Dominion University. His research interests include natural language processing, scholarly big data, information retrieval, digital libraries, and the science of science. He obtained his Ph.D. degree in astronomy and astrophysics from the Pennsylvania State University in 2011 and worked as a postdoctoral fellow on the CiteSeerX project. He was promoted to an assistant teaching professor in 2017 and joined Old Dominion University as an assistant professor in 2018. For more, visit his website.

Abstract of the keynote

Since 2023, there has been a surge of public and research interest in large language models (LLMs), which has significantly shifted the paradigm of information retrieval from returning keyword-based search results to the generation of natural language responses. This shift brings both challenges and opportunities for traditional digital libraries, which have served as a core infrastructure for browsing, searching, and accessing scholarly content. A critical question emerges: What role should digital libraries play in this LLM era? In this keynote, we share our vision of digital libraries in the LLM era. We argue that digital libraries are still indispensable, not only as repositories for digital preservation and provenance but also for trustworthy metadata discovery and verification. We explore how digital libraries can evolve by integrating LLMs and structured knowledge to support advanced services such as automatic data extraction, scholarly comparison, review generation, and science communication for broader audiences. We share preliminary work in this direction, including initiatives on preserving endangered open-access datasets and software, complex table data extraction, scientific claim verification, and assessing research reproducibility.


Panel Discussion




Organizers

Christin Katharina Kreutz is a Tandem Professor for Data Science in the Humanities working as a practitioner at a GLAM institute (Herder Institute) and in academia (TH Mittelhessen - University of Applied Sciences). Recently, she co-organised the Sim4IA workshop at SIGIR 2024, the SCOLIA workshop at ECIR 2025 and was poster chair for JCDL 2024. Her general research interests are information access systems and user behaviour/simulation.

Hermann Kroll is a postdoc at the Technical University of Braunschweig. His research focuses on effective access paths in digital libraries. Recently, he proposed a new retrieval paradigm for digital libraries, Narrative Information Access, and developed a real-world discovery system for the pharmaceutical domain. Beyond that, he has published at digital library conferences such as JCDL, TPDL and ICADL and has worked together with digital libraries such as the University Library of Braunschweig, the National Library of the Netherlands, the ZB MED in Germany and the University Library J. C. Senckenberg in Frankfurt am Main.