A coffee cup holds the world’s data: where is the magic of the DNA memory chip?

Author: Core Things

In the future, can DNA memory chips replace traditional storage hard drives?

We live in an era of data explosion, with global data volume growing exponentially. The market research firm IDC predicts that by 2025 global data volume will reach 175ZB, with a five-year compound growth rate of 8%. 1ZB equals 1 trillion GB, so storing 175ZB on portable hard drives of 1GB each would take at least 175 trillion drives.

Data storage is set to become a bottleneck in the development of the Internet. Looking to biology for a solution, researchers have turned to DNA. The largest human chromosome contains nearly 250 million base pairs. If data could be stored on every base pair, then in theory, according to MIT bioengineering professor Mark Bathe, a coffee cup full of DNA could store all of the world’s data.

At that density, storing 175ZB of data would pose no problem. This promising storage technology was written into the draft of China’s “14th Five-Year Plan” in March 2021, and a steady stream of research results and commercial moves throughout 2021 has drawn growing attention to DNA storage.

For example, on January 11, a Nature sub-journal published a paper in which Columbia University researchers translated “hello world” into a base sequence and wrote it into the DNA of E. coli; on May 26, Zhongke Carbon Yuan, incubated by the Shenzhen Institute of Advanced Technology of the Chinese Academy of Sciences, was established to advance the R&D and commercialization of DNA data storage; on November 12, Liu Hong’s team at Southeast University published a paper in Science Advances on writing the school motto “Stop with the best” into DNA; and on November 24, Microsoft announced the first nanoscale DNA storage writer.

It should be noted that, in the broad sense, DNA chips are tools for genomics and genetics research. The term refers to oligonucleotides synthesized in situ on a solid support, or to large numbers of pre-prepared DNA probes fixed onto the support surface in an ordered array by microprinting, which are then hybridized with a labeled sample. Because the support is often a computer chip, the device is called a DNA chip.

DNA chips come in many types, including those used to detect genes and chromosomes and those used for clinical diagnosis. The focus of this article is the type that borrows the structure of DNA to store data: the DNA storage chip.

01. Bases map to binary: a DNA strand as long as a human hand can store about 1 billion GB

From patterns carved on ancient stone walls, to the emergence of writing, to the production of books, long the most important information carrier, humanity did not actually produce much information. Since entering the information age, however, the information recorded by mankind in the past 50 years has far exceeded that recorded in the previous 2,000 years. We are now in the big data era of information explosion, in which all information on the Internet, from web pages and applications to security and satellite systems, is saved as data.

According to IDC, global data storage volume was 4.3ZB, 6.6ZB, and 8.6ZB in 2013, 2014, and 2015 respectively, with annual growth holding at about 40%. In 2016 it reached 16.1ZB, a growth rate of 87.21%. From 2017 to 2019 it was 21.6ZB, 33ZB, and 41ZB respectively, and in 2020 it reached 60ZB. As the field of big data continues to develop, storage methods keep changing to meet the demand for massive data storage.

▲IDC data on global data volume from 2015 to 2020, with a forecast for 2025

DNA is the carrier of genetic information. It holds the instructions needed to synthesize RNA and proteins and can encode all of an organism’s biological information. In the 1950s, researchers began to draw parallels between biological systems and man-made objects. A DNA molecule is built from four bases, while digital data is built from binary 0s and 1s. DNA stores genetic information, and data simply needs a medium to be stored on, which led the Soviet physicist Mikhail Samoilovich Neiman to ask: could data be stored by borrowing the structure of DNA?

Compared with traditional storage media, DNA storage has several significant advantages. The first is high storage density. A single DNA molecule can hold all the genetic information of a species. The largest human chromosome contains nearly 250 million base pairs, which means that a DNA strand about the length of a human hand could store 1EB of data (1EB is about 1.07 billion GB).

By comparison, hard disks store about 10^13 bits per cubic centimeter and flash memory about 10^16 bits, while DNA reaches about 10^19 bits. The second advantage is the stability of DNA as a storage molecule. In February 2021, a paper in the journal Nature reported that paleontologists had extracted mammoth genetic material dating back 1.2 million years from permafrost in northeastern Siberia and analyzed its DNA, setting a new age record for recovered DNA.

DNA is reported to retain data for at least a hundred years, whereas data on hard drives and tapes lasts only about 10 years. The third advantage is low storage and maintenance cost. Data stored as DNA is easy to maintain: unlike a traditional data center, it does not demand large amounts of manpower and money and only needs to be kept at low temperature. In terms of energy, storing 1GB of data on a hard disk consumes about 0.04W, while DNA storage consumes less than 10^-10 W.

02. Low-cost scaling: millions of DNA sequences on a chip

In the 1950s, scientists proposed creating man-made objects with characteristics similar to those of the biological micro-world, believing that such objects would have far broader capabilities. Less than ten years later, the Soviet physicist Mikhail Samoilovich Neiman independently proposed using DNA and RNA molecules to record, store, and retrieve information.

The practical use of DNA for data storage began in 1988, when the artist Joe Davis collaborated with Harvard University researchers to store an image in the DNA sequence of Escherichia coli. The image, an ancient Germanic rune representing life and the female earth, was encoded as a 5×7 matrix, with binary 1 representing the dark pixels and 0 representing the bright pixels. In subsequent studies, researchers proposed a variety of encoding methods for DNA storage.
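The pixel-to-bit step described above can be sketched in a few lines. This is only an illustration: the pattern below is a made-up 5-wide, 7-tall shape rather than the actual rune, and the function names are hypothetical.

```python
# Hypothetical 5-wide, 7-tall pattern (NOT the actual rune):
# '#' marks a dark pixel, '.' marks a bright pixel.
PATTERN = [
    "#...#",
    ".#.#.",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]


def bitmap_to_bits(rows):
    """Flatten the bitmap row by row: dark pixel -> '1', bright pixel -> '0'."""
    return "".join("1" if ch == "#" else "0" for row in rows for ch in row)


def bits_to_bitmap(bits, width=5):
    """Rebuild the rows from the flat bit string."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    return ["".join("#" if b == "1" else "." for b in row) for row in rows]


bits = bitmap_to_bits(PATTERN)
print(bits)       # 35 characters, one per pixel
print(len(bits))  # 35
```

A 5×7 image therefore needs only 35 bits, which is why such a small picture was a practical first target.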

In 2011, a research team encoded a 659KB book using a one-to-one correspondence in which adenine or cytosine represents binary 0 and guanine or thymine represents 1. When the researchers checked the stored data, however, they found 22 errors in the DNA, showing that the accuracy of this one-to-one encoding is limited.

DNA is made up of four bases, adenine (A), thymine (T), guanine (G), and cytosine (C), which combine into base pairs and form a double helix. The order of the bases, paired according to the principle of complementary base pairing, is what stores genetic information. These four bases also provide a natural alphabet for encoding data in a DNA memory chip.
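As a rough sketch of the one-bit-per-base mapping described above (A or C for 0, G or T for 1), the code below encodes a bit string and decodes it again. How the encoder picks between the two allowed bases is an assumption made here for illustration: it simply avoids repeating the previous base. The function names are hypothetical.

```python
import itertools

# Mapping from the text: 0 -> A or C, 1 -> G or T.
CHOICES = {"0": "AC", "1": "GT"}
BIT_OF_BASE = {"A": "0", "C": "0", "G": "1", "T": "1"}


def encode_bits(bits):
    """Encode one base per bit, never emitting the same base twice in a row.

    The tie-breaking rule (switch to the alternative base when the first
    choice would repeat) is an assumption for this sketch.
    """
    strand = []
    for bit in bits:
        first, second = CHOICES[bit]
        base = second if strand and strand[-1] == first else first
        strand.append(base)
    return "".join(strand)


def decode_bases(strand):
    """Map every base back to its bit value."""
    return "".join(BIT_OF_BASE[base] for base in strand)


def longest_run(strand):
    """Length of the longest run of identical consecutive bases."""
    return max(len(list(group)) for _, group in itertools.groupby(strand))


print(encode_bits("0000"))                # ACAC
print(encode_bits("0011"))                # ACGT
print(decode_bases(encode_bits("0110")))  # 0110
```

Because each bit has two candidate bases, one bit per base wastes half of DNA's theoretical capacity, but the spare choice can be spent on keeping the sequence easier to synthesize and read.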

▲Schematic diagram of DNA molecular structure

DNA storage involves four steps: encoding, storage, retrieval, and decoding. In a computer, data is represented as binary 0s and 1s. To store data in DNA, the 0s and 1s are first converted into the four bases A, C, T, and G, and a DNA molecule with exactly that base sequence is then synthesized. The synthesized DNA can be stored in vivo or in vitro.
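The encoding and decoding steps can be illustrated with a minimal sketch that packs two bits into each base. The specific table (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for this example, not a scheme attributed to any of the teams mentioned; synthesis and sequencing are of course not modeled.

```python
# Assumed mapping for this sketch: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_dna(data):
    """Encode bytes as bases, two bits per base, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand):
    """Decode a base string produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


strand = bytes_to_dna(b"hello world")
print(len(strand))           # 44 bases for 11 bytes
print(dna_to_bytes(strand))  # b'hello world'
```

Each byte becomes four bases, so this mapping reaches the theoretical maximum of two bits per base, at the price of having no protection against repeated bases or read errors.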

When decoding, a DNA sequencer reads out the base sequence, and decoding software converts it back into 0s and 1s to restore the data. In 2012, a Harvard University research team confirmed that DNA can serve as a storage medium comparable to hard drives and tapes. They encoded digital information in DNA, including a 53,400-byte HTML draft, 11 JPG images, and a JavaScript program, using a one-to-one bit-to-base mapping. This method, however, can produce long runs of the same base, which makes the sequencing process prone to errors.

This simple one-to-one encoding was improved upon in 2013. Researchers at the European Bioinformatics Institute (EBI) reported in a paper that they had stored, retrieved, and reproduced more than 5 million bits of data, with all DNA files reproduced at 99.99% to 100% accuracy. In the encoding process, the team added an error-correcting scheme and split the data into short overlapping oligonucleotides that could each be identified by sequence. Since then, teams at Columbia University, the University of Washington, and Imperial College have carried out a series of studies.
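To show why overlapping fragments help, here is a simplified sketch: a long base sequence is cut into fragments that overlap, each tagged with its start position, and the original is rebuilt by majority vote at every position. The fragment length, the step size, the use of a start offset as the index, and the voting rule are all illustrative assumptions, not the EBI team's actual scheme.

```python
from collections import Counter


def split_overlapping(strand, length=100, step=25):
    """Cut a strand into (start, fragment) pairs that overlap by length - step."""
    last_start = max(len(strand) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail is covered
    return [(start, strand[start:start + length]) for start in starts]


def reassemble(fragments, total_length):
    """Rebuild the strand by majority vote over all fragments covering a position."""
    votes = [Counter() for _ in range(total_length)]
    for start, fragment in fragments:
        for offset, base in enumerate(fragment):
            votes[start + offset][base] += 1
    return "".join(counter.most_common(1)[0][0] for counter in votes)


original = "ACGT" * 50  # 200 bases of toy data
fragments = split_overlapping(original)

# Simulate one read error in the third fragment (position 60 of the strand).
start, fragment = fragments[2]
fragments[2] = (start, fragment[:10] + "T" + fragment[11:])

print(reassemble(fragments, len(original)) == original)  # True
```

With these parameters, positions in the middle of the strand are covered by up to four fragments, so a single bad read there is outvoted. Positions near the ends are covered fewer times, which a real scheme would have to handle separately.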

To demonstrate the long-term stability of DNA-encoded data, researchers at ETH Zurich published a paper on February 4, 2015 in the journal Angewandte Chemie International Edition. They used Reed-Solomon error-correcting codes to add redundancy and a sol-gel process to encapsulate the DNA in silica glass spheres, which may be the earliest form of the DNA memory chip.

Since November 2021, several research teams have announced new progress on DNA memory chips, including teams from Southeast University in China, Microsoft Research, Northwestern University in Illinois, and the Georgia Institute of Technology. On November 12, Liu Hong’s team from the School of Biosciences and Medical Engineering and the State Key Laboratory of Bioelectronics at Southeast University successfully stored the school motto “Stop with the best” in a DNA sequence. The paper was published in Science Advances.

To miniaturize, integrate, and automate DNA storage, the research team optimized the synthesis and sequencing process. Using electrochemical single-electrode DNA synthesis and sequencing, they improved on the traditional phosphoramidite chemical synthesis method with electrochemical deprotection, sequenced the DNA molecules on the electrode surface based on a charge oscillation phenomenon, and successfully encoded and decoded the school motto.

▲Flow chart of the DNA data storage system of Liu Hong’s team, based on electrochemical DNA synthesis and sequencing (image source: Southeast University official website)

On November 24, Microsoft Research and the University of Washington’s Molecular Information Systems Laboratory (MISL) published a breakthrough paper on DNA storage in Science Advances. The research team announced the first nanoscale DNA storage writer, a DNA chip. The molecular controller and DNA writer on the chip are fitted with a PCIe interface and can build four synthetic DNA strands at a time, each containing 100 bases. According to Microsoft Research, longer DNA strands are prone to errors, but this will improve as the hardware develops.

This experiment demonstrated that DNA storage can be scaled up. On November 29, 2021, the Center for Synthetic Biology at Northwestern University in Illinois proposed a new method of recording information into DNA and published it in the journal “Genomics Research (Technology Networks)”. For the encoding step, they tried to exploit the capabilities of DNA itself to create a new data storage solution.

During the experiment, they used a new enzymatic system to synthesize DNA, recording rapidly changing environmental signals directly into the DNA sequence. Keith EJ Tyo, a professor of engineering at Northwestern University, said that by directly controlling the enzymes that synthesize DNA, it is possible to express and continuously store information in advance.

Other work aims to cut the cost of DNA data storage while scaling it up. On December 1, Nicholas Guise, a senior research scientist at the Georgia Tech Research Institute (GTRI), told the British Broadcasting Corporation (BBC): “The functional density on our new chip is approximately 100 times higher than that of current commercial equipment.” The chip his team designed grows DNA strands in an ultra-dense format at very low cost to achieve large-scale storage capacity.

The microchip carries 10 groups of “microwells” hundreds of nanometers deep, in which DNA strands grow in parallel, eventually accumulating millions of DNA sequences on the chip. Compared with the traditional process for manufacturing synthetic DNA, this method uses localized electrochemical activation and costs less.

▲Experimental encoding and decoding process of the Georgia Institute of Technology (GTRI) research group (image source: illustration from the paper)

03. It costs $7,000 to synthesize 2MB and $2,000 to read it

Ongoing research suggests that DNA storage will become an epoch-making storage method. Yet since the idea was proposed in the 1950s, substantive progress has been slow. Microsoft Research, an early entrant in DNA data storage, began its research in 2015 and made its breakthrough only in 2019, when it demonstrated a fully automated system for encoding and decoding data in DNA. DNA memory chips offer high density and long-term storage, but the technology is not yet widely used in computing. For now it is mainly suited to content that is rarely accessed but must be preserved.

There are several likely reasons why DNA memory chips cannot yet be commercialized. First, writing and reading DNA-stored data is expensive. An experiment at Columbia University in 2017 showed that it cost US$7,000 to synthesize 2MB of DNA data and US$2,000 to read it. Although this is far below the 2013 cost of US$12,400 per MB, storing a 1GB movie as DNA would still cost about US$3.58 million to encode and another US$1.02 million to read.

Second, decoding DNA-stored data requires large equipment. At present, decoding still relies on sequencers to read the DNA molecules. Most mass-produced sequencers on the market are used in small laboratories, clinical applications, and other scenarios with tight turnaround requirements, which is far removed from everyday use.
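The 1GB movie figures follow from scaling the Columbia experiment's per-2MB costs linearly, which can be checked with a short calculation. Linear scaling and 1GB = 1024MB are the assumptions here; the function name is hypothetical.

```python
MB_PER_GB = 1024  # assumed binary convention


def extrapolate_cost(sample_cost_usd, sample_mb, target_mb):
    """Scale a measured cost linearly from the sample size to the target size."""
    return sample_cost_usd * target_mb / sample_mb


write_cost = extrapolate_cost(7000, 2, MB_PER_GB)
read_cost = extrapolate_cost(2000, 2, MB_PER_GB)

print(f"write: ${write_cost:,.0f}")  # write: $3,584,000
print(f"read:  ${read_cost:,.0f}")   # read:  $1,024,000
```

Rounded, these are the US$3.58 million and US$1.02 million quoted for a 1GB movie.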

▲The iSeq 100 sequencer from Illumina, a sequencing service provider (image source: Illumina official website)

In addition, DNA storage is slow to read and write. In early December 2021, research at the Georgia Institute of Technology raised DNA write speed to 20GB per day, whereas current solid-state drives read and write at about 500MB per second. IDC’s “Data Age 2025” report projects that the world’s annual data will reach 175ZB in 2025, equivalent to 491EB generated every day. Even if DNA memory chips are dense enough, their real-time read and write speed cannot meet current data storage needs. DNA memory chips are an ideal medium for future large-capacity storage, but most current research is at the proof-of-concept stage, and practical hardware will take a long time to arrive.
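The gap between these speeds is easier to see when everything is put in the same unit. The sketch below uses only the figures quoted above and assumes binary unit steps (1024) and a 365-day year, which is what reproduces the 491EB per day figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the text.
dna_gb_per_day = 20
ssd_mb_per_second = 500
world_zb_per_year = 175

# Convert to common units (assuming 1024-based steps and a 365-day year).
ssd_gb_per_day = ssd_mb_per_second * SECONDS_PER_DAY / 1024
world_eb_per_day = world_zb_per_year * 1024 / 365

print(ssd_gb_per_day)                   # 42187.5
print(ssd_gb_per_day / dna_gb_per_day)  # 2109.375
print(round(world_eb_per_day))          # 491
```

On these numbers a single solid-state drive writes roughly two thousand times more data per day than the DNA writer, before even considering the hundreds of exabytes the world produces daily.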

04. Conclusion: The key to commercialization of DNA storage is to achieve low cost and high density

High storage density, high stability, and easy maintenance give the DNA memory chip the potential to become a next-generation storage device. However, commercialization faces many obstacles, such as high cost, restrictive storage conditions, and slow real-time reading, all of which indicate that it has a long way to go before becoming a mainstream storage device.

We live in the digital age, and smartphones, tablets, PCs, and wearable devices generate vast amounts of information every day. This makes it urgent to find storage devices with higher performance and lower cost.

The half-life of DNA is 521 years. Under cold or otherwise suitable conditions, DNA can persist for hundreds of thousands or even millions of years. If DNA storage is truly commercialized, our data archives may one day survive like “fossils”.
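As a rough illustration of what a 521-year half-life means, the sketch below applies the standard exponential decay formula, fraction = 0.5^(t / half-life). Treating the half-life as a fixed constant is a simplifying assumption. As the paragraph above notes, cold storage conditions extend survival considerably, so the half-life in practice depends on the environment.

```python
def fraction_remaining(years, half_life_years=521):
    """Fraction still intact after `years`, assuming simple exponential decay."""
    return 0.5 ** (years / half_life_years)


print(fraction_remaining(521))   # 0.5
print(fraction_remaining(1042))  # 0.25
print(fraction_remaining(100))   # a little under 0.9
```

Under this simple model most of the material is still intact after a century, which fits the earlier statement that DNA can retain data for at least a hundred years.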
