PLOS BLOGS Speaking of Medicine and Health

Lack of computational capacity for pathogen genomics in Africa

By guest contributors Eric Agboli, Molalegne Bitew and Monika Moir

African countries bear the highest burden of communicable diseases, coupled with the lowest relative gross domestic product per capita in the world. This vulnerability leaves the African continent ill-equipped to respond effectively to infectious disease outbreaks. Several factors contribute to the higher incidence of infectious diseases on the continent, such as reduced access to health care, high endemicity of certain infectious diseases, socioeconomic limitations, and high vulnerability to climate-dependent disease transmission. Studies have shown the need to improve basic knowledge within health institutions so that they can better respond to public health needs, with an emphasis on incorporating pathogen genomics into disease surveillance and outbreak response strategies. The Centre for Epidemic Response and Innovation (CERI) and the KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP) in South Africa are responding to such calls to action through the global Climate Amplified Diseases and Epidemics (CLIMADE) consortium by providing pathogen genomic sequencing and bioinformatics training for public health surveillance and disease response. These training initiatives aim to equip African scientists with the skills needed to perform in-house pathogen genomic surveillance, so that they can prevent and manage infectious disease outbreaks in their home countries and regions. Since 2020, these institutions have run 10 such training events and hosted over 510 fellows from 48 countries for training on pathogen genomics.

The training programs are designed to cover all wet-laboratory steps of pathogen genomic sequencing (nucleic acid extraction, amplification, library preparation, and sequencing), where a wet laboratory is one in which wet chemicals and biological materials are handled and analysed. They also cover the bioinformatics (dry-lab) components: base-calling, sequence assembly and editing, basic phylogenetic analysis, and reporting sequencing results for public health applications. In this post, we discuss one such training event, held in Cape Town, South Africa, in April 2023. Thirty-six participants from 16 countries attended this training, with 14 of those countries being low- and middle-income African nations.

The trainees consisted of Laboratory Technologists, Postgraduate students, Postdoctoral Research Fellows, Clinical Infectious Diseases Specialists, Public Health Officials, and Senior Scientists. Trainees represented public health institutions (48%) and universities or non-profit research centers (52%; Figure 1A). In terms of educational background, all trainees held at least a Bachelor’s degree, primarily in the Biological and Biomedical Sciences, while 41% had completed Master’s degrees and 15% had obtained PhD degrees (Figure 1B). Prior to attending this training, 14% had no previous training in genomics or bioinformatics, 43% had received formal training in these areas, 23% had self-studied, and 20% had a combination of self-study and formal training (Figure 1C). Before the training, trainees were asked to rate their skill levels in wet-laboratory protocols and computational bioinformatics (Figure 1D). We noticed a skew in existing skillsets, with the majority reporting expertise in wet-laboratory protocols. For example, 80% of attendees had performed nucleic acid extractions and 85% had performed polymerase chain reaction (PCR) methods, whereas as many as a third of trainees had no prior experience in dry-lab methods and only 15% had experience in advanced sequence analysis. From this initial assessment, the greatest gap in the trainees' skills appeared to be computational.

Figure 1: Professional profile of training attendees. A) Affiliated institution; B) Degree of highest qualification; C) Type of genomics training prior to attending the training event described here; D) Percentage of trainees that had previous experience with specific components of the wet- and dry-lab sections of the training.

Considering only the dry-lab component of the training, the program consisted of five hours of introductory computational skills, 11 hours covering genome assembly, quality control and analysis, five hours for basic phylogenetics, 1.5 hours for science communication for public health, and three hours for recapping and Q&A sessions. At the conclusion of the training, trainees provided qualitative feedback on their experiences and the perceived outcomes of the training. A great majority of the trainees described difficulty following the computationally intensive bioinformatics sessions. They described a need for more learning materials and learning time for computational sessions and more repetition of exercises in bioinformatics sessions, and they reported less confidence in their ability to independently perform the bioinformatics methods than the wet-lab protocols.
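To put the dry-lab schedule in perspective, the hours listed above can be tallied in a few lines of Python. This is only an illustrative sketch: the hours are taken from the program description, while the shortened labels and the function name `summarize` are assumptions introduced here and are not part of the training materials.

```python
# Dry-lab hours as listed in the program description.
# The labels are shortened for illustration (hypothetical names).
DRY_LAB_HOURS = {
    "introductory computational skills": 5.0,
    "genome assembly, quality control and analysis": 11.0,
    "basic phylogenetics": 5.0,
    "science communication for public health": 1.5,
    "recap and Q&A": 3.0,
}


def summarize(hours):
    """Return the total hours and each component's share of the total (in %)."""
    total = sum(hours.values())
    shares = {name: 100.0 * h / total for name, h in hours.items()}
    return total, shares


total_hours, shares = summarize(DRY_LAB_HOURS)
print(f"Total dry-lab time: {total_hours} hours")
for name, pct in shares.items():
    print(f"  {name}: {pct:.1f}%")
```

By this tally the dry-lab component adds up to 25.5 hours, which gives a sense of how little time is available for material that many trainees were meeting for the first time.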

We were reminded that even amongst highly educated and successful public health professionals, adequately trained bioinformaticians are scarce in Africa. We were also reminded that it takes substantial time, support and computational infrastructure to adequately upskill professionals in this field, which is particularly challenging in resource-limited settings such as low- and middle-income countries in Africa. The COVID-19 pandemic highlighted the Global North-South divide in genomic sequencing capacity and availability of resources, with low- and middle-income countries sequencing a smaller proportion of SARS-CoV-2 genomes with a slower turnaround time. However, sequencing capacity was vastly increased during the pandemic by active investments from the Africa Centres for Disease Control and Prevention (Africa CDC) via the Africa Pathogen Genomics Initiative. Genomic surveillance programs were crucial in the pandemic response: they allowed public health officials to track the evolution of the virus and enabled the pivotal identification of globally significant SARS-CoV-2 variants of concern in Africa.

We have witnessed impressive progress in the capacity to generate genomic data in Africa, but alongside that, we must address the growing need for bioinformatics skills to process and analyse these large biological datasets. We must continue to build capacity in science, with emphasis on computational, bioinformatics and data science skills, at all educational levels, both to further advance public health on the continent and to ensure that local African scientists have equitable opportunities to be field leaders. It is important to continue this fellowship program, along with growing other bioinformatics and data science learning opportunities, in order to secure public health skills in Africa for a healthy and prosperous continent and world.

About the authors:

Eric Agboli is a Lecturer at the University of Health and Allied Sciences, Ho, Ghana and a Postdoctoral research associate at the Bernhard Nocht Institute for Tropical Medicine, Hamburg, Germany.

Molalegne Bitew is an Associate Professor, Lead Researcher, and Director at the Health Biotechnology Directorate, Bio and Emerging Technology Institute of Ethiopia, Addis Ababa, Ethiopia.

Monika Moir is a Researcher at the Centre for Epidemic Response and Innovation, School for Data Science and Computational Thinking, Stellenbosch University, South Africa.

Disclaimer: Views expressed by contributors are solely those of individual contributors, and not necessarily those of PLOS.
