A DNA data storage project, involving French biotech DNA Script, has been awarded a contract worth up to €20.7M from the US Intelligence Advanced Research Projects Activity agency to develop a prototype instrument able to store and retrieve 1 terabyte of information in 24 hours.
The Molecular Encoding Consortium, led by Robert Nicol at the Broad Institute and also involving DNA Script and Harvard University, aims to address the ‘data capacity gap’ in collaboration with Illumina.
DNA Script co-founder and CEO Sylvain Gariel told me: “Massive amounts of digital data are generated every day, and emerging technologies such as autonomous cars and artificial intelligence will further increase the need for data storage at unprecedented scales. It is expected that the anticipated growth in data storage requirements cannot be addressed by current resource-intensive technologies.”
The Molecular Encoding Consortium is therefore working on a nucleic acid-based system that can store this information with a substantially smaller carbon footprint and lower power and cost requirements.
At present, digital information is encoded as a string of zeros and ones. We know that DNA is a long chain of A, T, C and G molecules and that digital files can be translated from one code to another. These files could therefore be encoded in a molecule of DNA.
“This is attractive for two main reasons,” said Gariel. “First, nucleic acids are very dense: the information contained in a million-terabyte-scale data center, encoded in DNA, could fit into a shoe box. Second, when stored in appropriate conditions, DNA can be preserved over hundreds of years without using energy.”
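To make the translation from zeros and ones to A, T, C and G concrete, here is a minimal sketch in Python. It assumes a naive mapping of two bits per base, chosen purely for illustration; the article does not describe the consortium's actual encoding scheme, and the function names are hypothetical. Practical schemes typically add error correction and avoid sequences that are hard to synthesize or read, which this sketch ignores.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
# This is NOT the consortium's encoding, which the article does not describe.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Translate raw bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Translate a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_to_dna(b"Hi")
print(strand)                   # CAGACGGC
print(decode_from_dna(strand))  # b'Hi'
```

Under this mapping each byte becomes four bases, so the two-byte message above turns into an eight-base strand and decodes back to the original bytes.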
The project will involve the use of DNA Script’s novel enzymatic approach to DNA synthesis, which is much faster than current chemical-based methods. DNA Script and other companies are working to speed up and improve the accuracy of DNA synthesis, a factor that many agree will strongly influence the roll-out of DNA data storage technology.
In the US, a team led by Donhee Ham at Harvard University will develop semiconductor chips on which the DNA strands for the project will be manufactured. The program will run in two phases of two years each. The first phase will aim to achieve 10 gigabytes of storage per day, while the second will involve scaling up the process to 1 terabyte per day.
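To give a rough sense of what these daily targets mean as sustained rates, the arithmetic can be sketched as follows. The figures rest on assumptions that are not from the project: decimal units (1 gigabyte = 10^9 bytes, 1 terabyte = 10^12 bytes) and an idealized two bits per base with no redundancy, so real systems would need to write more bases than shown here.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: 2 bits per base with no redundancy, i.e. 4 bases per byte.
BASES_PER_BYTE = 4


def required_rates(bytes_per_day: int) -> tuple[float, float]:
    """Return (bytes per second, bases per second) for a daily target."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return bytes_per_second, bytes_per_second * BASES_PER_BYTE


# Assumption: decimal units for gigabyte and terabyte.
phase_one = required_rates(10 * 10**9)  # 10 gigabytes per day
phase_two = required_rates(10**12)      # 1 terabyte per day

for name, (byte_rate, base_rate) in [("Phase 1", phase_one), ("Phase 2", phase_two)]:
    print(f"{name}: {byte_rate:,.0f} bytes/s, {base_rate:,.0f} bases/s")
```

Under these assumptions, the first-phase target works out to roughly 116 kilobytes per second and the second-phase target to roughly 11.6 megabytes per second, or about 46 million bases per second.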
Gariel believes that the greatest application for DNA data storage will be in archiving, also known as cold data storage, much like Amazon’s Glacier service.
“I don’t think DNA data storage will replace disk drives or RAM anytime soon, because even with amazing writing and reading speeds, it will be nearly impossible to compete with the speed of technologies currently in use. However, it could represent a great way to back up our data centers in a sustainable way and store the massive amounts of data that do not need immediate access,” he said.