Sharing digitized DNA sequences must balance scientific progress with fair use

Technician Matthew Smith loads a robotic DNA sample automation machine

With the creation of high-throughput DNA sequencing technology, researchers can now generate entire genome sequences for all biodiversity on Earth. Converting the biochemical nature of DNA into digital format allows this information to be shared rapidly and globally, which improves reproducibility in research, gives everyone access to the data, and lets other researchers ask new questions of the same data. Shareable, open-access digital sequence information enhances the global scientific enterprise. However, the free use of this digital sequence information without benefiting the nations where the biodiversity lives conflicts with the Nagoya Protocol on Access and Benefit Sharing. A balance must be struck between regulating the fair use of digital sequence information and enabling international scientific collaboration.

What is the Nagoya Protocol?

The Nagoya Protocol is a supplementary agreement to the Convention on Biological Diversity that governs the sharing of benefits arising from the use and study of genetic resources. Effective as of 2014, the Nagoya Protocol has been ratified by 109 countries and provides the standard for the use of genetic resources and genetic material. Genetic resources and materials are generally viewed as any material from an organism that contains DNA, RNA, or proteins.

Collecting a sample of tissue from an organism falls under the Nagoya Protocol, but there has been some debate about whether digital sequence data derived from tissue samples are also included. Because the Nagoya Protocol is designed to manage and regulate benefits from the utilization of genetic material, it could arguably also cover genome sequences derived from collected samples, although the text does not say so explicitly. With the advent of modern biotechnology and genome sequencing, a growing number of signatory states believe that the Nagoya Protocol needs an update to incorporate digital sequence information.

Sharing of genetic sequence data

The International Nucleotide Sequence Database Collaboration (INSDC) is a tripartite collaboration of sequence databases in the U.S.A., Europe, and Japan that has been committed to providing access to nucleotide sequences for more than 30 years. To date, more than three trillion annotated nucleotide bases are shared free of charge through the INSDC. These databases have allowed researchers worldwide to identify genetic mutations, the causes of numerous diseases, and ways to treat or prevent them, and have opened new insights into the origins and potential mitigation of various types of cancer. Pharmaceutical companies often rely on these databases to develop new and more effective medications and treatments. Making use of pre-existing, publicly available data saves precious, limited research funding. Sharing data enables a more robust scientific enterprise that ultimately benefits society globally.

Digital sequence information presents a problem because it allows the transnational movement of genetic sequences without moving organisms across borders. Digital sequences that are publicly available at the INSDC, even when derived from legally collected organisms, make it difficult to enforce the Nagoya Protocol.

The collection of samples is regulated by government permitting agencies, and the importation of samples is enforced by customs officials. While the unpermitted collection of biological samples and subsequent development of biotechnological innovations from them would be considered theft, using freely available digital sequence information derived from legally collected samples is not. Nothing prevents downloading and analyzing sequences derived from organisms. As technology advances, it may someday be possible to generate an organism from sequence data alone. Using these sequences and placing them into new synthetic organisms, as is increasingly common in the DIY Biology movement, could run counter to the Nagoya Protocol. The challenge is how to share the benefits arising from the free use of digital sequence information.

A complete ban on sharing data is not a solution

Ceasing to share data might address Nagoya-related issues with digital sequence information in the short term, but in the long term it would stifle scientific progress. The conflict is also a social justice issue because most biodiversity is found in developing nations. Protection of indigenous knowledge and resources is important, and the Nagoya Protocol is designed to protect genetic resources and create fair-use agreements. This changes a long history of exploiting biodiversity that has unequally benefited developed nations at the expense of developing nations.

Despite these corrective measures, careful thought is needed to balance the rich collaborations made possible by genome sequencing with the need to give benefits back to the local communities and nations where organisms originate. Collaboration and the sharing of digital sequence information increase the opportunity for the development of new drugs, pest control solutions, and the preservation of biodiversity. Furthermore, ceasing to share digital sequence information is likely impossible in our increasingly digital world.

Upcoming negotiations this November will focus on what to do with digital sequence information. Shutting down existing collaborative platforms or creating a fee-for-service agreement would stifle scientific progress, especially in smaller academic research laboratories. Rather than shutting down access to digital sequence data, blockchain encoding could be a way to track the sources of sequences that might ultimately be turned into profits. Alternatively, limiting sequence downloads to non-profit use could help prevent the inequitable sharing of digital sequences derived from biodiversity.

A careful balance must be struck when considering the future use of digital sequence information. Open use of genome sequences has countless benefits, but it would be wise to initiate non-profit usage agreements to help prevent the exploitation of freely available sequence information.