Courses are held Tuesday 7/22, Wednesday 7/23, and Thursday 7/24. Each course meets over Zoom at the same time on each of the three days, as specified in its course abstract. Times are listed in Pacific time (UTC-7).
Once you've determined which classes you'd like to attend, you can register for FSCI today.
| Course Number | Course Title | Designated Timeslot | Zoom Length | Lead Instructor / email | Additional Instructor(s) & Emails |
| --- | --- | --- | --- | --- | --- |
| E01 | AI-Assisted Literature Review on Open Access Repositories: Including Image and Object Detection | 7:00am – 8:00am | 60 minutes | Simon Worthington | Gitanjali Yadav; Peter Murray-Rust; Ambreen Hamadani |
| E02 | Science Is a Social Process: Data Sharing Practices to Drive Scientific Discovery and Research Integrity | 7:00am – 8:30am | 90 minutes | Maria Guerreiro | |
| E03 | Science Writing in the Age of AI: Where Have All the Authors Gone? | 7:00am – 10:00am | 3 hours | Francis Crawley | Dr. Lili Zhang; Dr. Gitanjali Yadav; Professor Perihan Elif Ekmekci; Dr. Chiedozie Ike; Professor Mara de Souza Freitas; Professor Natalie Meyers; Professor Kris Dierickx |
| E04 | Empowering Future Trainers in Research Integrity: A Course Integrating Theory and Hands-On Exercises in Tools, Stakeholder Engagement, and Science Outreach | 7:00am – 10:00am | 3 hours | Eleonora Colangelo | Sidney Engelbrecht; Christopher Magor |
| E05 | PREreview Open Reviewers Workshop: A Hands-On Equity-focused Peer Review Training Program for Researchers of All Career Levels | 8:00am – 10:00am | 2 hours | Daniela Saderi | Vanessa Fairhurst; Grace Park |
| E06 | Data Science for Beginners: Jumpstart into Data Exploration | 8:00am – 11:00am | 3 hours | Kayode Oladapo | Olukemi Jacobs |
| E07 | Intro to Translation for Non-Translators: Encounter the World of Translation | 10:00am – 11:00am | 60 minutes | Jennifer Miller | Lynne Bowker; Oliver Czulo |
| E08 | Using the ORCID, Sherpa Romeo, and Unpaywall APIs in R to Harvest Institutional Data | 9:00am – 12:00pm | 3 hours | Dani Kirsch | Clarke Iakovakis |
| L10 | FROGS: Facilitating Reproducible Open (Geo)Science | 4:00pm – 5:30pm | 90 minutes | Deborah Khider | David Edge; Nick McKay; Julien Emile-Geay |
| L11 | Assessing Institutional Research Data Using OpenAlex | 4:00pm – 5:30pm | 90 minutes | Teresa Schultz | |
| L12 | Beyond Metrics: Overcoming Challenges in Responsible Research Evaluation for Open Science Practices | 4:00pm – 6:00pm | 2 hours | Ricardo Hartley Belmar | Soledad Quiroz; Isabel Abedrapo; Eunice Mercado; Arturo Garduño |
| L13 | The Butterfly Effect – Understanding the Big Picture Research Ecosystem to Help Open Practice | 5:00pm – 6:30pm | 90 minutes | Danny Kingsley | |
| L14 | Breaking Barriers and Setting Standards: Advanced Assessment of Open Access Journals and Addressing Global Imbalances with DOAJ | 6:00pm – 7:30pm | 90 minutes | Dr. Muhammad Imtiaz Subhani | Amber Osman |
| L15 | AI-Powered Search in Libraries: A Crash Course on Understanding the Fundamentals for Library Professionals | 6:00pm – 9:00pm | 3 hours | Aaron Tay | Bella Ratmelia |
E01
Title: AI-Assisted Literature Review on Open Access Repositories: Including Image and Object Detection
Course chair: Simon Worthington, semanticClimate, Publishing Knowledge Graph Researcher
Course abstract
The Assisted Literature Review (ALR) course covers instruction for a semi-automated literature search on any topic in the Open Access literature from Europe PMC, a corpus of 6 million open access articles. The AI and machine learning open-source software used is the #semanticClimate text and data mining tooling, with publishing services from Computation Publishing Services (TIB/NFDI4C). The course is an introduction to artificial intelligence algorithms for data mining, including Natural Language Processing (NLP), Convolutional Neural Networks (CNN), Hugging Face models for summarisation, Transformers, and YOLO (You Only Look Once), alongside essential data preprocessing techniques. It introduces literature search, text mining, image classification, and object detection. The algorithms and the data used are all open source, and issues of trustworthiness for open science are a priority. All instruction is carried out using Colab notebooks, so no complicated installations are required.
The learning points covered give participants familiarity with AI tooling for literature search, packaged so that it can be reused by students and researchers. The learning package already exists as a fully documented workflow, with existing Colab notebooks (all deposited in Zenodo with DOIs), and the modules have already been taught in workshops, in intern programmes, and in Masters information management courses.
Course units are: 1. Beginner AI 101 (Python, R, Hugging Face, Jupyter/Colab); 2. Learning and Developing Human-Machine Knowledge Systems; and 3. Hands-on Training and Testing of NLP, CNN, and Transformer-based AI Models.
The result of the AI Literature Reviews (AILR) workflow taught in the class is a literature review report, including a textual summary, summaries of papers as a data table, the complete full-text articles downloaded, and a reproducible, replicable Colab notebook with all the software and code used in the review. The content package is contained in a Git repository as a Quarto computational publication, with Zenodo DOIs minted for the review as a publication and for the supporting code and data. The resulting content package can be used in papers, reporting, dashboards, CI pipelines, and further data analysis.
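The course itself uses the #semanticClimate tooling in Colab notebooks; purely as an illustration of the kind of open access search that underpins the workflow, here is a minimal Python sketch against Europe PMC's public REST search endpoint (the query string is an invented example, not part of the course materials):

```python
import json
import urllib.parse
import urllib.request

EUROPE_PMC = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def europe_pmc_url(query, page_size=25):
    """Build a Europe PMC search URL restricted to open access articles."""
    params = {"query": f"OPEN_ACCESS:y AND ({query})",
              "format": "json", "pageSize": page_size}
    return EUROPE_PMC + "?" + urllib.parse.urlencode(params)

def search(query, page_size=25):
    """Fetch one page of results and return (id, title) pairs."""
    with urllib.request.urlopen(europe_pmc_url(query, page_size)) as resp:
        data = json.load(resp)
    return [(r.get("id"), r.get("title"))
            for r in data["resultList"]["result"]]

# Example (requires network access; query is illustrative only):
# for pmcid, title in search('"literature review" AND climate', page_size=5):
#     print(pmcid, title)
```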
Contributors
Gitanjali Yadav, National Institute of Plant Genome Research (NIPGR) (Co-course chair); Peter Murray-Rust, Cambridge University (Co-course chair); Ambreen Hamadani (Co-instructor)
Additional Contributors: Renu Kumari, National Institute of Plant Genome Research (NIPGR); Shabnam Barbhuiya, Jamia Millia Islamia University; Joshy Alphonse
E02
Title: Science Is a Social Process: Data Sharing Practices to Drive Scientific Discovery and Research Integrity
Course chair: Maria Guerreiro
Course abstract
New policies and guidance from federal agencies and funders reflect a growing push for greater collaboration, transparency, and accountability in scientific research. Achieving the full benefits of open science (including enhanced reproducibility, equitable access to knowledge, and accelerated discovery) requires the open sharing of data underpinning published research. The value of openly shared data depends entirely on its reusability. Reusable open data is a living asset, a part of a scholarly conversation, that exists to be interrogated, validated, learned from, and built upon. It also remains a rare asset.
The pathway to increasing data reusability begins with moving a few people to do something different. Individuals win over other individuals, growing a culture that recognises and values the new behavior.
This session will provide an overview of the evolving policy landscape and its impact on publishing organizations and libraries and offer insight on how community stakeholders can support data sharing best practices and set the stage for productive data reuse.
It will cover relevant policy developments, including their rationale and implications, and practical approaches to supporting data sharing that align with the vision of more inclusive, transparent, and collaborative research practices. The course will address several interrelated questions: Why is open data sharing important to research funders? How does data sharing contribute to scientific discovery and research integrity? How do we incentivize data sharing among researchers? What specific practices contribute to a dataset's reusability? Participants will work through collaborative exercises to answer these questions and will gain practical knowledge they can apply to their own work in libraries and publishing.
E03
Title: Science Writing in the Age of AI: Where Have All the Authors Gone?
Course chair: Francis P. Crawley, CODATA, Chair of International Data Policy Committee (IDPC)
Course abstract
This course explores the challenges that artificial intelligence (AI) brings to authorship in scientific and scholarly publications. The course will cover the transformative impact of AI on scientific writing, examining how AI redefines creativity, originality, and authorship in the research process. As AI tools generate text, propose ideas, and analyze data, the course investigates what remains of the role of human contributors and how to maintain integrity and accountability in AI-assisted writing.
Participants will gain practical strategies to responsibly integrate AI tools into writing workflows, while ensuring high standards for transparency and ethical conduct. The course will also discuss the evolving nature of scholarly communication and the need for new policies to ensure AI’s responsible integration in research. Through a mix of theoretical discussions and practical exercises, this course will provide participants with actionable insights into AI’s role in shaping the future of scientific communication.
Contributors
- Dr. Lili Zhang, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China (zhll@cnic.cn)
- Dr. Gitanjali Yadav, Group Leader, National Institute of Plant Genome Research (NIPGR), New Delhi, India (gy@nipgr.ac.in)
- Professor Perihan Elif Ekmekci, TOBB University, Ankara, Turkey (drpelifek@gmail.com)
- Dr. Chiedozie Ike, Irrua Specialist Teaching Hospital & Ambrose Alli University, Edo State, Nigeria (ikeresearch@yahoo.co.uk)
- Professor Mara de Souza Freitas, Director, Institute of Bioethics, Universidade Católica Portuguesa, Lisbon, Portugal (marasfreitas@ucp.pt)
- Professor Natalie Meyers, Professor of the Practice, Lucy Family Institute for Data & Society, University of Notre Dame, Indiana, United States (natalie.meyers@nd.edu)
- Professor Kris Dierickx, The Center for Bioethics & Law, Faculty of Medicine, KU Leuven, Belgium (kris.dierickx@kuleuven.be)
E04
Title: Empowering Future Trainers in Research Integrity: A Course Integrating Theory and Hands-On Exercises in Tools, Stakeholder Engagement, and Science Outreach
Course chair: Eleonora Colangelo
Course abstract
In the landscape of professional discourse, Research Integrity emerges as a cornerstone theme, yet the role of Research Integrity trainers often remains underappreciated. These trainers play a pivotal role in fostering an awareness culture surrounding scientific ethics principles among various stakeholders, including research administrators, librarians, publishing professionals, and researchers. This online course, spanning three sessions over three days, is crafted to address this gap and familiarize a diverse global audience with the vital responsibilities of research integrity trainers within the domain of open scholarship movements.
Throughout the course, we will delve into the multifaceted responsibilities of Research Integrity trainers and explore the future prospects for this cadre of guardians of integrity. Will institutions and publishers increasingly demand their presence? What are the expectations of a Research Integrity trainer? These questions, among others, will be thoroughly examined and discussed.
Developed collaboratively with a focus on research compliance, science editing, and publishing, this course is informed by the latest versions of cross-continental Codes of Conduct in Research Integrity endorsed by governments. Furthermore, the methodology employed is rooted in The Embassy of Good Science project, with co-instructors who are certified trainers.
Our primary objective is to underscore the critical role of research integrity training in shaping robust and trustworthy open science frameworks, particularly in our ever-evolving digital environment. Employing a virtue ethics-based approach, the course will feature interactive sessions aimed at enhancing participants’ understanding of research integrity for future trainers.
While the Dilemma Game app remains an integral part of Day 1 to explore real-world scenarios, it is important to note that it is only one aspect of the training. Other facets of Research Integrity training will be covered, including annual reports on research integrity, intellectual property, conflict of interest, and their implications for the use of AI in science production. Additionally, participants will be guided in crafting strategies and action plans for future Research Integrity trainers, tailored to their specific professional environments.
This course is designed to cater to a diverse audience, including researchers, librarians, publishers, faculty/scholars, policymakers, and research management administrators. Upon completion, participants will have the opportunity to receive a certificate issued by The Embassy of Good Science.
E05
Title: PREreview Open Reviewers Workshop: A Hands-On Equity-focused Peer Review Training Program For Researchers of All Career Levels
Course chair: Daniela Saderi, PREreview, Executive Director, Co-Founder
Course abstract
*PREreview Open Reviewers Three-Part Workshop*
Peer review plays a pivotal role in determining which research projects receive funding, which findings get published, and ultimately, which knowledge is disseminated and utilized by the scientific community and the broader public. Despite its critical importance, reviewers often undergo minimal training for this crucial task. Furthermore, that training rarely focuses on mitigating the biases that are ingrained in the peer-review process. At PREreview, we work to change this.
Open Reviewers is a three-part, hands-on workshop designed for researchers at all career levels who are interested in engaging in socially-conscious and constructive manuscript peer review. Grounded in our values of equity, diversity, and inclusion, the workshop provides participants with the necessary skills and knowledge to conduct equitable peer review using materials from The Open Reviewers Toolkit.
Learning Objectives. Upon completion of the workshop participants will have gained:
– A general understanding of journal-organized and independent review processes
– A detailed understanding of how systems of oppression manifest in the manuscript review process
– Strategies to self-assess and mitigate bias in the context of manuscript review
– An in-depth understanding of and practical experience with peer reviewing a manuscript in a way that minimizes bias, striving for constructive, clear, and actionable feedback
– An opportunity to put learning into practice by collaboratively reviewing a preprint and publishing the resulting preprint review on PREreview.org
– Tips and strategies to adopt and adapt the content of the workshop to the needs of diverse communities
Contributors
Vanessa Fairhurst (co-course chair, co-instructor)
Grace Park (co-instructor)
E06
Title: Data Science for Beginners: Jumpstart into Data Exploration
Course chair: Kayode Oladapo, McPherson University, Seriki Sotayo, Ogun State, Nigeria, Acting Director, ICT-RMU
Course abstract
Are you curious about data science? Taking the first steps in this area is hard, but you don't have to do it alone. This course is for complete beginners getting started in Python data science, even if you've never written a single line of code!
This course is a supportive step that will give you the confidence to get started in this exciting area. It will cover the basics of programming in Python, as well as useful libraries and tools such as Jupyter notebooks, pandas, and Matplotlib.
It is aimed at two groups:
1. Complete beginners to coding/Python who want to get started in data science.
2. People with some experience in Python who want to get started in data science.
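As a taste of the pandas and Matplotlib exploration the course covers, here is a tiny sketch; the dataset is invented for illustration:

```python
import pandas as pd

# A tiny invented dataset standing in for whatever the course provides.
df = pd.DataFrame({
    "country": ["Nigeria", "Ghana", "Kenya", "Nigeria", "Kenya"],
    "year": [2022, 2022, 2022, 2023, 2023],
    "papers": [120, 45, 80, 150, 95],
})

# Basic exploration: shape, summary statistics, grouping.
print(df.shape)
print(df["papers"].describe())
totals = df.groupby("country")["papers"].sum()
print(totals)

# Plotting (requires Matplotlib):
# import matplotlib.pyplot as plt
# totals.plot(kind="bar"); plt.show()
```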
Contributors
Olukemi Jacobs, Lux Academy, Ogun State, Nigeria (co-instructor)
E07
Title: Intro to Translation for non-Translators: Encounter the World of Translation
Course chair: Jennifer Miller, Translate Science; FORCE11, Core Contributor, Board Member; Independent Scholar
Course abstract
Have you ever located or organized materials in multiple languages? Do you support researchers working outside of their primary language? Do you use software developed and documented in other languages or locales? Have you developed resources that are translated and localized by others? If so, you’ve encountered the world of translation! But some of these encounters might have left you a little flummoxed, or maybe even a little anxious. That’s totally normal! It can be confusing when we are faced with a language that we don’t know well, and it can be even more confusing to try to find the right tool or resource to help break down the language barrier. If your translation encounters have left you wanting to know a little more about translation, and particularly about tools and resources that can help, then this course is for you!
We’ll start by debunking a few of the myths and misperceptions associated with translation and gain a better understanding of what makes translation challenging – for both people and computers! Next, we’ll explore a range of free online tools and resources – from term banks and multilingual concept maps to automatic translation and generative AI tools – to identify different options for supporting ourselves and others in multilingual situations. Finally, we’ll work on improving our machine translation literacy by understanding the role (and limitations) of data in data-driven technologies and picking up some tips to become both savvy and responsible users of these tools.
Contributors
Jennifer Miller (Facilitator)
Lynne Bowker (Lead Instructor), Full Professor, Department of Languages, Laval University
Oliver Czulo (Co-Instructor), CEO, Translatology Institute GmbH
E08
Title: Using the ORCID, Sherpa Romeo, and Unpaywall APIs in R to Harvest Institutional Data
Course chair: Dani Kirsch
Course abstract
The objectives of this course are to obtain a set of ORCID iDs for people affiliated with your institution, harvest a list of DOIs for publications associated with these iDs, and gather open access information for the articles using Sherpa Romeo and Unpaywall.
Students will work with a set of pre-written scripts in R, customizing them for their institutions to access the APIs for ORCID, Sherpa Romeo, and Unpaywall, and bringing the results together into a manageable data file.
Some experience using R will be helpful, but it is not required. Although the basics of using R and understanding the code will be reviewed, the emphasis of the course will be on running the scripts and on gathering and interpreting the data. In other words, this course is focused not on learning R, but on obtaining a dataset of publications based on institutional affiliation, along with open access information on those publications. It is inspired by a course taught previously at FSCI, available at https://osf.io/vpgbt/. The course will conclude with a discussion of using this data to develop outreach methods that inform authors of their right to deposit author manuscripts.
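The course scripts are in R; purely as a language-neutral illustration of the kind of API call involved, here is a minimal Unpaywall lookup sketched in Python (the DOI and email in the example are placeholders):

```python
import json
import urllib.parse
import urllib.request

def unpaywall_url(doi, email):
    """Build the Unpaywall v2 lookup URL for a DOI (an email is required by the API)."""
    return ("https://api.unpaywall.org/v2/"
            + urllib.parse.quote(doi)
            + "?" + urllib.parse.urlencode({"email": email}))

def oa_status(doi, email):
    """Return (is_oa, best_oa_url) for a DOI. Requires network access."""
    with urllib.request.urlopen(unpaywall_url(doi, email)) as resp:
        record = json.load(resp)
    best = record.get("best_oa_location") or {}
    return record.get("is_oa"), best.get("url")

# Example (placeholder DOI and email):
# print(oa_status("10.1371/journal.pone.0000000", "you@example.org"))
```

Looping such a lookup over the DOIs harvested from ORCID yields the open access column of the final data file.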
L10
Title: FROGS: Facilitating Reproducible Open (Geo)Science
Course chair: Deborah Khider, University of Southern California Information Sciences Institute, Lead Scientist
Course abstract
Sharing research data, software, and workflows is essential for building a Findable, Accessible, Interoperable, and Reusable (FAIR) open science ecosystem. Over the last decade, funders and publishers have introduced open science policies emphasizing reproducibility, promoting frameworks that support the sharing of reproducible scientific products.
In this course, we will cover principles of open science in a manner that enhances reproducibility, including:
– The basics of documenting, sharing, and citing research products (data, software, computational provenance) in line with FAIR principles
– Using GitHub for code development and collaboration, including integration with Zenodo for citation
– Exploring containerization with tools like Docker and myBinder, and automating processes with GitHub Actions
– Publishing software in a software registry
The course will use the online platform LeapFROGS, which integrates Python and R. The platform includes resources and self-paced exercises to reinforce concepts. All materials on LeapFROGS are publicly available.
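As a taste of the GitHub Actions automation mentioned in the course description, here is a minimal workflow that reruns a project's tests on every push; the file path `.github/workflows/test.yml`, the Python version, and the `requirements.txt`/pytest setup are assumptions for illustration, not course materials:

```yaml
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4          # fetch the repository
      - uses: actions/setup-python@v5      # install a Python toolchain
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest                        # fail the build if any test fails
```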
Contributors
David Edge, Assistant Research Professor, Northern Arizona University (co-course chair)
Nick McKay, Associate Professor, Northern Arizona University (co-course chair)
Julien Emile-Geay, Professor, University of Southern California (co-course chair)
L11
Title: Assessing Institutional Research Data Using OpenAlex
Course chair: Teresa Schultz, University of Nevada, Reno, Scholarly Communications & Social Sciences Librarian
Course abstract
OpenAlex, an emerging open source of research data, offers an intriguing alternative to expensive paid options such as InCites and Dimensions. However, working with the data at a large and in-depth scale requires technical know-how that those who work in scholarly communications are not always trained in. This workshop will walk users through connecting to OpenAlex's API using an R package, and will provide ready-made code and detailed instructions on how to access different types of data through OpenAlex and then transform and analyze it to learn things like:
– Where an institution’s researchers are publishing
– Where their coauthors come from
– What their participation in open access looks like
– Who is paying to publish open access, with and without funding
The workshop will also discuss the advantages and disadvantages of using OpenAlex as a data source, especially in comparison to other tools, along with background on what research data from these sources can and cannot tell us. Participants are not expected to be familiar with R or RStudio, although they should not expect to be experts in R by the end. Participants will be expected to do some work before the workshop, and there will be some homework during the workshop as well.
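The workshop's code is in R; as a language-neutral sketch of the underlying API, here is a minimal Python example that pulls one page of an institution's works from OpenAlex and tallies their open-access status (the ROR ID shown in the usage comment is a placeholder):

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

def openalex_works_url(ror_id, per_page=25):
    """Works by authors affiliated with an institution, identified by ROR ID."""
    params = {"filter": f"institutions.ror:{ror_id}", "per-page": per_page}
    return "https://api.openalex.org/works?" + urllib.parse.urlencode(params)

def oa_breakdown(ror_id, per_page=25):
    """Count open-access statuses in one page of results. Requires network access."""
    with urllib.request.urlopen(openalex_works_url(ror_id, per_page)) as resp:
        works = json.load(resp)["results"]
    return Counter(w["open_access"]["oa_status"] for w in works)

# Example (placeholder ROR ID):
# print(oa_breakdown("https://ror.org/01an7q238"))
```

A real analysis would page through the full result set with OpenAlex's cursor pagination rather than stopping at one page.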
L12
Title: Beyond metrics: Overcoming challenges in responsible research evaluation for open science practices
Course chair: Ricardo Hartley Belmar, Remolino Consultores, Independent Researcher
Course abstract
This course examines the multifaceted challenges of implementing responsible research evaluation within Open Scholarship. Participants will engage with technical, conceptual, and strategic dimensions of evaluation, emphasizing actionable solutions to drive systemic change. Through the integration of technical insights, qualitative and quantitative criteria, and the strategic role of funders and repositories as sources, the course equips participants to critically assess and enhance research evaluation practices. Discussions will also cover Open Science policies and their alignment with these challenges.
Contributors
Ricardo Hartley: Metaresearcher and knowledge management consultant with experience in Open Science and scholarly communication (Chair)
Soledad Quiroz: Director of institutional relations and public affairs, Data Observatory Foundation and Consultant specializing in Open Science policy and governance at Remolino Consultores
Isabel Abedrapo: Expert in repository management and metadata quality at Remolino Consultores
Eunice Mercado: Scholar focused on interdisciplinary research evaluation and community engagement
Arturo Garduño: Research evaluation and data metrics specialist (to be confirmed)
L13
Title: The butterfly effect – how understanding the big picture research ecosystem will help open practice
Course chair: Danny Kingsley, Deakin University, Director Library Services (Information)
Course abstract
The concept of the butterfly effect – that the world is deeply interconnected, such that one small occurrence can influence a much larger complex system – can be directly applied to the research ecosystem. Everything is interconnected, interdependent and interrelated. This course is an attempt to articulate these connections and identify areas where change might be possible.
Many aspects of research operate in isolation from each other yet are part of an interdependent whole. Areas such as research culture, research assessment, open scholarship, research integrity, research support, research infrastructure and research impact can be managed by completely different agents within a research institution (if at all). The training we offer all members of the research endeavour does not currently take a holistic view, which makes effective decision-making and positive change deeply challenging.
Recent events in the USA have demonstrated how fragile research integrity and trust in science actually are. The principles of open research are more essential than ever in this environment.
The course consists of three 1.5-hour classes, each building on an hour of pre-reading/watching. The classes will combine direct instruction with discussion, small-group work, and whole-group activity. The goal for each lesson will be to collectively develop resources (digital, physical, and conceptual) to help articulate these concepts and issues to a broader audience.
Each class of the course will focus on a different area:
• How bad is it really? A dive into the accelerating increase in poor research practice, paper mills, fraud, retractions … (yes this is the depressing lesson!)
• How did we get to this? Looking back to see forward. What has been happening over the past 20 years in terms of changing research assessment, the ownership of research infrastructure, and the rise of the Open movement
• What’s working? A celebration of – and critical look at – initiatives that are shifting the dial. The common thread is the connection across different aspects of the ecosystem. A (possibly lofty) goal of the course is to collectively develop a visual representation of ‘the fundamental interconnectedness of things’
This course is aimed at all participants in the research endeavour – from researchers to research administrators, librarians, third space professionals – everyone. It is intended to be interactive and constructive where all participants can contribute to the process.
L14
Title: Breaking Barriers and Setting Standards: Advanced Assessment of Open Access Journals and Addressing Global Imbalances with DOAJ
Course chair: Dr. Muhammad Imtiaz Subhani, XploreOpen / DOAJ, Founder of XploreOpen; Ambassador & Editor of DOAJ
Course abstract
Open access is transforming scholarly publishing, yet challenges remain in ensuring quality, transparency, and inclusivity. This course, Breaking Barriers and Setting Standards: Advanced Assessment of Open Access Journals and Addressing Global Imbalances with DOAJ, offers an in-depth exploration of the tools and strategies necessary to elevate the standards of open access publishing. Participants will engage in hands-on evaluation of journals using DOAJ criteria, delving into complex aspects like licensing, metadata quality, and APC transparency. Through case studies and practical exercises, we’ll uncover common pitfalls in journal applications and learn how to avoid them.
In addition, the course addresses critical global disparities in scholarly publishing. We will examine the barriers faced by journals in low-resource regions, discuss DOAJ’s role in fostering equity, and explore capacity-building strategies to empower editors worldwide. Whether you’re an editor, librarian, researcher, or advocate, this interactive session equips you with actionable insights to advance open access and create a more inclusive and high-quality publishing ecosystem.
Contributors
The course has two instructors: Dr. Muhammad Imtiaz Subhani and Amber Osman, both co-founders of XploreOpen and Ambassadors and Editors of DOAJ.
L15
Title: AI-Powered Search in Libraries: A Crash Course on Understanding the Fundamentals for Library Professionals
Course chair: Aaron Tay, Singapore Management University, Head, Data Services, SMU Libraries
Contributor
Bella Ratmelia (bellar@smu.edu.sg), Research & Data Services, Senior Librarian – (co-instructor).
Course abstract
As generative AI technology becomes increasingly integrated into academic search, it is crucial for librarians to understand the fundamental technical aspects of information retrieval and large language models (LLMs). This course offers an in-depth exploration of key concepts such as Retrieval-Augmented Generation (RAG), semantic search techniques (e.g. dense-embedding bi-encoders, ColBERT) versus lexical search (e.g. TF-IDF, BM25), and LLM fundamentals. Designed specifically for information professionals and researchers who are interested in learning the basics, this course equips participants with the knowledge needed to understand, at a conceptual level, how these new tools work and what using them implies, so that they can evaluate the tools and provide guidance to users. No coding knowledge is required.
Note: The use of AI assistance for evidence synthesis (e.g. Systematic review) is not the main focus.
The course is structured into three comprehensive 3-hour sessions:
Session 1: Crash Course on LLMs and Prompt Engineering
This session introduces participants to the fundamentals of large language models (LLMs) and Retrieval-Augmented Generation (RAG). It also covers the basics of prompt engineering and its role in optimizing AI-powered search experiences.
Session 2: Semantic vs. Lexical Search and RAG
This session delves into the differences between semantic and lexical (keyword-based) search. Participants will explore several AI-powered search tools such as Undermind.ai, Scite.ai, and Elicit, and their varying approaches to information retrieval. This session will also revisit Retrieval-Augmented Generation (RAG) in more depth.
Session 3: Evaluation Metrics and Implications for Scholarly Communication
In the final session, participants will gain a deeper understanding of common evaluation metrics for information retrieval systems. The session concludes with a discussion on the implications of these technologies on various aspects of research workflow, including literature searching, scholarly communication, and open access.
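Although the course requires no coding, the lexical-search side of Session 2 can be made concrete with a toy BM25 scorer in plain Python; the documents and query below are invented, and real systems use tuned, indexed implementations:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against a query with the classic BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in tokenized:
        df.update(set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "large language models for search",
    "library cats and other animals",
    "semantic search with language models",
]
print(bm25_scores("language models", docs))
```

The middle document shares no terms with the query, so it scores zero: exact term overlap is all that lexical search sees, which is precisely the gap that semantic (embedding-based) search tries to close.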
By the end of this course, librarians will have the tools to critically assess AI search tools and support their institutions in adopting and using these technologies effectively.