Why Open Code Matters – and Why It’s Hard

This blog post was prepared by Editorial Research Associates Shein Ei Cho and Marcus Pawson.
PLOS Digital Health Academic Editors are established researchers and thought leaders in their fields. For this post, we reached out to Dr. MinJae Woo and Dr. Martin Frasch and posed the question, “Why are you sharing code for your field? What support do you need from your peers, community, and PLOS?”
MinJae Woo
MinJae Woo, Ph.D., is an Assistant Professor of Data Science in the Department of Public Health at Clemson University. His research bridges artificial intelligence, public health, and information systems, focusing on fairness, transparency, and responsible adoption of predictive models. He has led multi-institutional teams across academia and industry to develop explainable and equitable AI systems for healthcare and financial applications. Over the past eight years, all of his publications have been open access with most accompanied by publicly available code or data when applicable.
Why I share code
Sharing research code can significantly strengthen the scientific narrative. Well-written code not only reveals the logic behind a paper’s findings but also shows how the analytic reasoning was implemented. In many cases, code communicates the “how” more effectively than lengthy text descriptions or equations. For early-career researchers, published code can serve as a living portfolio that demonstrates analytical expertise, transparency, and reproducibility. It can also protect the integrity of your work: if an honest mistake is later discovered, shared code enables verification and correction in good faith rather than speculation about misconduct. In that sense, open code should be seen as both a scientific contribution and a personal accountability mechanism.
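As a small, hypothetical illustration of code communicating the “how” (the column names and criteria below are invented, and pandas is assumed), a cohort-selection step can state each analytic decision exactly once, in executable form:

```python
# Hypothetical cohort-selection step (invented column names and criteria):
# each analytic decision appears exactly once, in executable form.
import pandas as pd

def select_cohort(visits: pd.DataFrame) -> pd.DataFrame:
    """Apply the study's inclusion criteria to a table of patient visits."""
    adults = visits[visits["age"] >= 18]              # exclude minors
    complete = adults.dropna(subset=["outcome"])      # require an observed outcome
    return (
        complete.sort_values("visit_date")
        .drop_duplicates("patient_id", keep="first")  # keep earliest visit per patient
    )
```

A reader can verify, rerun, or dispute any single line here; a prose summary of the same criteria rarely supports that.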
Why I sometimes hesitate to share
Sharing code is not as simple as uploading a script. It is like inviting guests into your home: day to day, you live comfortably and keep things reasonably clean, but before guests arrive, you put extra effort into tidying, organizing, and making sure everything is in good order. Preparing research code for publication requires similar effort: reorganizing files, annotating logic, removing sensitive data paths, and documenting dependencies that were obvious only to your team. This process can take days or weeks.
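One concrete piece of that tidying is removing machine-specific, potentially sensitive paths before release. A minimal sketch in Python, assuming an environment variable named DATA_DIR (the variable and file names are illustrative):

```python
# Before sharing (hypothetical): a hard-coded path that leaks local
# directory structure and breaks on every other machine.
#   df = pd.read_csv("/home/alice/projects/study/private/cohort_v3.csv")

# After tidying: the location comes from configuration, and the expected
# layout can be documented in the README (e.g. "export DATA_DIR=...").
import os
from pathlib import Path

import pandas as pd

DATA_DIR = Path(os.environ.get("DATA_DIR", "data"))  # illustrative variable name
df = pd.read_csv(DATA_DIR / "cohort.csv")            # de-identified input expected here
```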
Once shared, another challenge begins: question after question. A few are insightful and lead to constructive dialogue, but many arise from misinterpretation or a lack of context. Responding to every query demands time that most researchers simply do not have, especially early-career faculty balancing teaching, research, and funding pressures. Yet ignoring questions may be perceived as evasive or even raise doubts about the study’s integrity. The cost of sharing is therefore not only technical but also social and reputational.
What support is needed
To foster a culture of code sharing, three forms of support are crucial.
- Structural support: Journals and institutions could provide templates, repositories, and standardized documentation guidelines tailored to different disciplines; a sketch of one such template follows this list. Researchers often hesitate not out of unwillingness but because they are uncertain about what qualifies as “good enough” for code sharing.
- Cultural support: Peer recognition should extend beyond publications. Shared code should be treated as a legitimate scholarly product that is citable, reviewed, and credited in professional evaluations such as promotion and tenure guidelines. Recognizing code contributions on par with publications would encourage sustained participation and reward transparency.
- Community support: Publishers can play a mediating role by hosting moderated discussion spaces where authors can clarify implementation details without being overwhelmed by redundant or accusatory questions. Constructive engagement must be facilitated, not merely expected.
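As one hypothetical form such structural support could take, the short script below checks a repository against a minimal “good enough” checklist; the required files are illustrative, not an official PLOS standard.

```python
# Hypothetical "good enough to share" checklist for a code repository.
# The required files are illustrative, not an official standard.
from pathlib import Path

REQUIRED_FILES = {
    "README.md": "what the code does and how to run it",
    "LICENSE": "terms under which others may reuse it",
    "requirements.txt": "pinned dependencies for reproducibility",
    "CITATION.cff": "how to cite the software itself",
}

def check_repo(repo: Path) -> bool:
    """Report which baseline artifacts exist; True only if all are present."""
    ok = True
    for name, purpose in REQUIRED_FILES.items():
        present = (repo / name).exists()
        print(f"[{'ok' if present else 'MISSING'}] {name}: {purpose}")
        ok = ok and present
    return ok

if __name__ == "__main__":
    check_repo(Path("."))
```

A shared, discipline-tuned checklist like this would turn “good enough” from a private worry into a public, meetable bar.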
Ultimately, I believe sharing code should not be treated as a moral obligation but as an evolving professional practice. Researchers share when they trust that the community will interpret their efforts in good faith and that their time investment will be respected. With the right infrastructure and culture, sharing code can indeed become the default practice, not because it is required, but because it is valued and appreciated.
Martin Frasch
Dr. Martin Frasch is an internationally recognized health researcher working across neuroscience, physiology, wearables, pregnancy health, AI/ML, and biomarker discovery. He has authored 116+ peer-reviewed publications and is a health tech entrepreneur.
From My Terminal to Our Community: Transforming Code Sharing into a Scientific Asset
As a researcher deeply embedded in computational work, the code I write—the scripts for data analysis, simulation, and figure generation—is not merely a tool; it is the fundamental method. This code, alongside the data it processes, constitutes the complete, verifiable blueprint of my research. This centrality is why open science is essential, yet the path from advocating for sharing to actually doing it is fraught with difficulty.
The Imperative and the Impediments
The ideal reason for sharing is reproducibility and transparency. A published paper without accessible code and data is a blueprint with an unreadable methods section. When we share our assets, we replace a “trust me” black box with an open, verifiable process, accelerating scientific progress and fostering trust, particularly critical in sensitive fields like biomedical AI, where published findings must be carefully safeguarded against bias.[1]
Despite these benefits, sharing rates remain suboptimal.[2] The barriers are practical, cultural, and systemic. Writing code for personal use is fast and messy; preparing it for others requires significant extra time for cleaning, commenting, and robust documentation. This effort often clashes with the academic ecosystem, which counts papers and grant funding but provides little to no formal academic credit for high-quality code and data repositories. This lack of incentive is the single greatest barrier. Until this incentive structure is fixed, individual guilt and good intentions will always lose to career pressures.[3]
Furthermore, fields like healthcare machine learning face real constraints, including HIPAA compliance and IRB protocols, which complicate or prohibit the free release of sensitive clinical data and associated pipelines. Perfectionism and competitive anxiety also play a role; researchers often delay sharing, fearing their work-in-progress might be judged amateurish or lead to them being scooped.
What We Need: Systemic Support
Making open code the norm requires a collective framework supported by all stakeholders.
From My Peers: We need to foster a culture that accepts and normalizes sharing assets that are “good enough” for replication, even if not perfectly polished. Reviewers should engage in constructive criticism rather than harsh judgment. Most importantly, we must cite the software and datasets we use, treating them as first-class research contributions.[2]
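Citing software and datasets starts with recording exactly which versions an analysis used. A minimal sketch, assuming a Python analysis environment (the package list is illustrative):

```python
# Record the exact versions of the packages an analysis used so they can
# be cited and listed in the methods section. Package names are illustrative.
from importlib.metadata import PackageNotFoundError, version

ANALYSIS_DEPENDENCIES = ["numpy", "pandas", "scikit-learn"]

for pkg in ANALYSIS_DEPENDENCIES:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```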
From the Community (Institutions and Funders): The primary need is credit. Tenure and promotion committees must be explicitly guided to value code and data as legitimate research outputs. This is where new frameworks are essential. The proposed QIC-Index—which scores research outputs based on their Quality, Impact, and Collaboration—provides a concrete mechanism to quantify the value of shared code and data, linking diligence to tangible recognition.[3] We also need dedicated support, including integrated “software carpentry” training and resources for Research Software Engineers (RSEs) to help labs maintain and publish their infrastructure.
From PLOS: Publishers can drive change by continuing to strengthen and clarify code- and data-sharing policies, moving from “encouraged” to “expected.” We need clear guidance on availability requirements before submission and tiered expectations that account for the different requirements of theoretical, computational, and clinical code. Implementing an optional track for linked peer review of code and data would add immense value. Finally, PLOS should integrate QIC-like metrics or use visibility initiatives to track and celebrate the downstream impact of shared code, recognizing these infrastructure builders in editorial decisions and via tools such as a “Data/Code Impact Statement” or QIC badge system.[3]
Sharing our computational methods is critical for future scientific rigor. We must transform this work from an unrewarded burden into a supported, credited, and fundamental part of the publishing process.