Two cutting-edge technologies currently used in the NIHR BioResource are RNA sequencing and Long Read Sequencing. Both technologies are applied to Rare Disease cohorts, but in two distinct projects: the Rare Diseases RNA Phenotyping project (RDRNA) and the new Long Read Whole Genome Sequencing project (LRWGS), respectively.
How has the Long Read Sequencing project progressed since its launch? Are there any key findings or milestones achieved so far?
The Long Read Sequencing project started in the summer of 2023. It is a programme that applies Oxford Nanopore technology (ONT) to sequence up to 22,000 samples from volunteers with Rare Diseases, Eating Disorders and the general population (participants for pre-symptomatic studies of neurodegeneration and dementia).
Since the launch, we have sequenced a quarter of the samples, finished the pilot phase for some areas of the project and scaled those areas up to the production phase, which is a key milestone for the project.
In the pilot, it was important to establish the baseline quality metrics for the projects so we could be sure that all the genetic variant types would be detected, enabling the overall analysis and the discovery of mechanisms behind the diseases included in the projects.
One final confirmation that the technology was working to the required level was to show that diagnostic results from some complicated cases, which had been solved during the previous short-read whole genome sequencing project, were also detected by the new Long Read Sequencing (LRS) technology.
We were very pleased to see the data from these confirmatory tests coming through, very clearly identifying the same variants and providing additional data to help confirm the diagnosis without requiring samples from family members.
We also needed to learn how to scale up the technology so we could generate a planned 384 sequences every week. A lot of careful laboratory work is required to prepare the DNA samples for sequencing.

What challenges have you encountered in scaling up the use of LRS technology, and how have they been addressed?
LRS is an exciting technology because it allows the sequencing of longer fragments of deoxyribonucleic acid (DNA) and therefore enables us to detect complex differences between genomes involving larger segments of DNA.
One of the challenges we have faced is gaining a better understanding of the technology, as this is the first time it has been used in the BioResource. Other complexities include installing, understanding and testing new equipment, and generating the standard operating procedures (SOPs) which the teams will use going forward.
For example, one step in the LRS SOP is to prepare a library from a DNA sample. A small number of libraries can be made manually, but when we scale up it becomes quite difficult to maintain quality across so many samples, multiple technicians and two lab sites. We have recently moved from manual library preparation to automation using a robot, which allows 96 sample libraries to be created at a time. The LRS project is a work in progress as we continue to learn and troubleshoot.
Another major challenge is that the samples have been collected and banked over many years, alongside new recruitment specifically for the Rare Disease projects. There are also different sample types, with either blood or saliva as the DNA source. Using saliva is trickier, as the extracted DNA can include DNA from bacteria or food, and therefore not all the sequences map to the human genome. Our goal is to generate more sequences to compensate and provide the good data coverage needed to draw definitive conclusions.
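The idea of sequencing more to compensate for non-human reads can be illustrated with some simple arithmetic. The Python sketch below is only an illustration, not the BioResource's actual method: the function name `required_total_bases`, the 30x target coverage, the approximate 3.1 Gb human genome size and the mapping fractions for blood and saliva are all assumptions chosen for the example.

```python
def required_total_bases(target_coverage: float,
                         human_fraction: float,
                         genome_size_bp: float = 3.1e9) -> float:
    """Bases to sequence so that the human-mapping share reaches the target coverage.

    target_coverage: desired average depth over the human genome (assumed value).
    human_fraction:  share of sequenced bases that map to the human genome (assumed value).
    genome_size_bp:  approximate human genome size in base pairs.
    """
    if not 0 < human_fraction <= 1:
        raise ValueError("human_fraction must be greater than 0 and at most 1")
    return target_coverage * genome_size_bp / human_fraction


# Hypothetical comparison: a blood sample versus a saliva sample
# in which a quarter of the sequenced bases are not human.
blood = required_total_bases(30, 0.99)
saliva = required_total_bases(30, 0.75)
print(f"blood: {blood / 1e9:.1f} Gb, saliva: {saliva / 1e9:.1f} Gb")
```

Under these assumed numbers, the lower the fraction of human reads, the more total sequence is needed to reach the same coverage of the human genome.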
How is the RNA Phenotyping Project progressing? Have you observed any early trends or discoveries that could impact our understanding of rare diseases?
For the RNA Phenotyping project, we started off with a goal of recruiting 1,000 Rare Disease patients across 9 Rare Disease BioResource projects to participate. To date, the number of Rare Disease BioResource projects has gone up to 17 and we are less than 200 participants away from our goal.
Once the initial blood samples have been processed and the cells and plasma have been banked, the omics assays need to be started. We have completed 50% of the RNA sequencing and 25% of the short read genome sequencing and proteomics, while the Long Read Sequencing is starting now that the pilot phase has been completed.
Bioinformaticians are beginning to analyse the large amounts of complex data arriving at the computing facility, but the power of the resource will improve as we complete these datasets with the full set of omics assays.
We are very excited to begin to generate some genetic diagnoses for patients which will directly impact their care. In our previous genome sequencing project, we were able to diagnose, on average, around 20% of participants. While there are likely to be disease-causing genes yet to be discovered, it is thought that most genetic diseases are due to changes in genes that we already know about. However, the challenge lies in deciding which genetic variants are disease-causing and which are benign. Using RNA sequencing means that we can work backwards, identifying genes which aren’t working properly, and then search the genome to find the changes that might be causing this.
If anything is found, these results will be fed back to the patient via their referring clinician. Having a genetic diagnosis can not only impact treatment and management options but can also inform reproductive choices and testing options for the wider family.

What new technologies or techniques have been integrated into the project to enhance RNA analysis? Are there any innovations that have particularly improved sample processing or data interpretation?
Long Read Sequencing was added to the RNA sequencing project when the funding became available. We hope this exciting new technology will help with complex structural variants, repetitive regions and compound variants in the same gene where knowing which chromosome each variant is on is important for diagnosis.
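To illustrate why long reads help with compound variants in the same gene, here is a toy Python sketch of read-based phasing. It is a simplified illustration under stated assumptions, not the project's analysis method: the read representation (a dict mapping a variant position to the allele seen on that read), the function name `phase_pair` and the example positions are all hypothetical. The idea is that a single long read spanning both variant sites shows whether the two variants sit on the same chromosome copy (cis) or on opposite copies (trans).

```python
from collections import Counter


def phase_pair(reads, pos_a, pos_b):
    """Classify two heterozygous variants as 'cis', 'trans' or 'undetermined'.

    reads: list of dicts mapping variant position -> 'ref' or 'alt'.
    Only reads that cover both positions are informative.
    """
    counts = Counter()
    for read in reads:
        if pos_a in read and pos_b in read:
            # Same allele state at both sites supports cis;
            # opposite allele states support trans.
            same = read[pos_a] == read[pos_b]
            counts["cis" if same else "trans"] += 1
    if counts["cis"] == counts["trans"]:
        return "undetermined"
    return "cis" if counts["cis"] > counts["trans"] else "trans"


# Hypothetical reads: three span both sites, one covers only the first.
example_reads = [
    {100: "alt", 5000: "ref"},
    {100: "ref", 5000: "alt"},
    {100: "alt", 5000: "ref"},
    {100: "alt"},
]
print(phase_pair(example_reads, 100, 5000))
```

In this made-up example every spanning read carries one variant but not the other, so the pair is classified as trans, meaning each copy of the gene would carry a different variant. Short reads that cover only one site at a time, like the last read above, cannot make this distinction on their own.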
Are there any collaborations with other research groups or industry partners you are able to share or talk about that have helped with the work for both LRS and RNA?
We collaborate with the King’s Genomics Lab at King’s College London for the LRS project. Oxford Nanopore are key for the LRS project as they have provided the technology and machines. For RNA, we work with Sanquin in the Netherlands who are providing expertise on proteomics (large-scale study of proteins). This work will let us determine if the genome variations observed are influencing the level of proteins expressed rather than just their structure.
A huge thank you to Lilian, Vicky, Jacinta, Kathy and Hannah from our Rare Diseases, RNA, LRS and Samples teams for taking the time to chat with us about their work, its importance, the technologies used and the continued progress.
Rare Disease Day 2025 unites over 100 countries to raise awareness for the 300 million people worldwide living with rare diseases. This annual event emphasises the need for equitable healthcare, improved access to treatments, and enhanced medical research for those affected.
In alignment with this mission, the BioResource is advancing rare disease research through its RNA Phenotyping and Long Read Sequencing projects, aiming to improve diagnosis and treatment options for patients.