Scientists at Harvard University have shown that 643 KB of data can be written to a single DNA molecule, which means that one cubic millimeter of DNA can store 5.5 petabits, or about 70 billion books, and four grams of DNA could hold all the information created by mankind in a year, about 1.8 zettabytes (1.8 billion TB).
Deoxyribonucleic acid (DNA), which is essential for the functioning of all living organisms, has a large storage capacity. Building on this, scientists from Harvard Medical School have succeeded in encoding the contents of a book in sequences of the four genetic bases: adenine (A), cytosine (C), thymine (T) and guanine (G).
Using a special DNA chip, the team encoded the electronic version of a scientific book, 5.27 megabits in size. Then, using the same chip, the researchers read the recorded data back, and not a single byte was lost or corrupted. The book that had the honor of being encoded in DNA is directly related to the field of genetics. It’s called “Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves”, and its author is George Church, a professor of genetics at Harvard and a member of the university's Wyss Institute.
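The two sizes quoted for the book, 643 KB and 5.27 Mb, describe the same amount of data if "Mb" is read as decimal megabits and "KB" as 1024-byte kilobytes. The short sketch below checks that arithmetic, along with the zettabyte-to-terabyte conversion. The function names are illustrative, and the choice of decimal versus binary prefixes is an assumption rather than something the article states.

```python
# Unit arithmetic behind the figures quoted in the article.
# Assumption: "Mb" means decimal megabits and "KB" means 1024-byte kilobytes.
BITS_PER_BYTE = 8


def megabits_to_kb(megabits: float) -> float:
    """Convert decimal megabits (10**6 bits) to 1024-byte kilobytes."""
    return megabits * 1_000_000 / BITS_PER_BYTE / 1024


def zettabytes_to_terabytes(zettabytes: float) -> float:
    """Convert decimal zettabytes (10**21 bytes) to decimal terabytes (10**12 bytes)."""
    return zettabytes * 10**21 / 10**12


book_kb = megabits_to_kb(5.27)
yearly_tb = zettabytes_to_terabytes(1.8)

print(f"5.27 megabits = {book_kb:.0f} KB")
print(f"1.8 zettabytes = {yearly_tb / 1e9:.1f} billion TB")
```

Under these assumptions the book comes out at roughly 643 KB, and 1.8 zettabytes at 1.8 billion terabytes.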
“A device the size of a thumb could store as much information as there is on the entire web,” said the head of the project, George Church.
To read the data stored in DNA, you simply sequence it – just as if you were sequencing the human genome – and convert each of the TGAC bases back into binary. To aid with sequencing, each strand of DNA has a 19-bit address block at the start so a whole vat of DNA can be sequenced out of order, and then sorted into usable data using the addresses.
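The addressing idea can be sketched in a few lines of Python. This is a toy model rather than the researchers' actual pipeline: the 19-bit address comes from the article, but the 96-bit payload per strand, the mapping of A or C to 0 and G or T to 1, and all function names are assumptions made for illustration.

```python
import random

ADDRESS_BITS = 19   # from the article: 19-bit address block per strand
PAYLOAD_BITS = 96   # assumption: number of data bits carried by each strand
ZERO_BASES = "AC"   # assumption: either of these bases encodes a 0 bit
ONE_BASES = "GT"    # assumption: either of these bases encodes a 1 bit


def bits_to_bases(bits: str, rng: random.Random) -> str:
    """Write one base per bit, picking randomly between the two allowed bases."""
    return "".join(rng.choice(ONE_BASES if b == "1" else ZERO_BASES) for b in bits)


def bases_to_bits(strand: str) -> str:
    """Convert each base back into the bit it stands for."""
    return "".join("1" if base in ONE_BASES else "0" for base in strand)


def encode(data: bytes, rng: random.Random) -> list[str]:
    """Split data into addressed strands: 19-bit index followed by the payload."""
    bits = "".join(f"{byte:08b}" for byte in data)
    strands = []
    for index, start in enumerate(range(0, len(bits), PAYLOAD_BITS)):
        if index >= 2**ADDRESS_BITS:
            raise ValueError("data does not fit in the 19-bit address space")
        payload = bits[start:start + PAYLOAD_BITS].ljust(PAYLOAD_BITS, "0")
        address = f"{index:0{ADDRESS_BITS}b}"
        strands.append(bits_to_bases(address + payload, rng))
    return strands


def decode(strands: list[str], length: int) -> bytes:
    """Read strands in any order, sort them by address, and rebuild the bytes."""
    blocks = {}
    for strand in strands:
        bits = bases_to_bits(strand)
        blocks[int(bits[:ADDRESS_BITS], 2)] = bits[ADDRESS_BITS:]
    bits = "".join(blocks[i] for i in sorted(blocks))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))


rng = random.Random(0)
message = b"DNA can store a whole library of books."
pool = encode(message, rng)
rng.shuffle(pool)                      # simulate sequencing the vat out of order
restored = decode(pool, len(message))
print(restored == message)
```

Because every strand carries its own index, the shuffled pool decodes to the original message no matter what order the strands are read in. A real system would also need error detection and redundancy, which this sketch leaves out.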
Scientists have been eyeing up DNA as a potential storage medium for a long time, for three very good reasons: It’s incredibly dense (you can store one bit per base, and a base is only a few atoms large); it’s volumetric rather than planar (unlike a hard disk); and it’s incredibly stable – where other bleeding-edge storage media need to be kept in sub-zero vacuums, DNA can survive for hundreds of thousands of years in a box in your garage.
“It shows that the vast increase in capacity to synthesize and sequence DNA can be applied to store significant amounts of data,” said pioneering synthetic biologist Drew Endy at Stanford University. “If you wanted to have your library encoded in DNA, you could probably do that now.”
The book encoded by the Harvard scientists contains 53,426 words, 11 illustrations and a JavaScript computer program.
Using such a data carrier in the conventional sense, as computer memory, would require a significant amount of work first. Current technology for recording information by synthesizing DNA will not provide sufficient write speeds within the next decade, and laboratory methods for DNA sequencing will not deliver high-speed reading in the short term.
But looking forward, scientists foresee a world where biological storage would allow us to record anything and everything without reservation. DNA could become the storage system of the future: thanks to its three-dimensional structure it can hold a large amount of data and, unlike physical devices such as CDs or USB sticks, it offers great longevity.
A recent study by Pew Internet/Elon University found that the continued evolution of big data, the collection and analysis of massive sets of information, could lead to big changes in business, such as real-time forecasting of events, the development of inferential software, and the creation of algorithms for advanced correlations.