Humanity creates huge amounts of data every day: billions of emails and social media updates, new websites, documents, and images, plus scientific and commercial big data amounting to petabytes of storage and beyond. It is well recognised that nucleic acids, the RNA and DNA that encode the proteins needed to build living things, are seemingly quite efficient at storing information. Taking inspiration from this realm, a team from India writes in the International Journal of Nano and Biomaterials how extended nucleic acid memory (NAM) might be the future of data storage technology.
By comparison, a computer hard disk has an information storage capacity of about 10^13 bits per cubic centimetre, which is about 1.25 terabytes. NAM has the potential to store a million times that amount in the same volume: 1,250,000 terabytes, or 1,250 petabytes, or 1.25 exabytes. If we consider the information held by the “big four” of the internet (Google, Amazon, Microsoft, and Facebook), the sum of all their data could be stored in a single cubic centimetre of NAM.
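The unit conversions in this comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic: the input figures are the ones quoted in the text, decimal (SI) prefixes are assumed, and the constant names are illustrative.

```python
# Figures quoted in the text; decimal (SI) prefixes are an assumption.
HDD_BITS_PER_CM3 = 10**13   # hard disk: bits per cubic centimetre
NAM_FACTOR = 10**6          # NAM is said to hold a million times more
BITS_PER_BYTE = 8

TERA, PETA, EXA = 10**12, 10**15, 10**18

# Convert the hard disk figure from bits to bytes, then scale up for NAM.
hdd_bytes = HDD_BITS_PER_CM3 // BITS_PER_BYTE
nam_bytes = hdd_bytes * NAM_FACTOR

hdd_terabytes = hdd_bytes / TERA
nam_terabytes = nam_bytes / TERA
nam_petabytes = nam_bytes / PETA
nam_exabytes = nam_bytes / EXA

print(f"Hard disk: {hdd_terabytes} TB per cubic centimetre")
print(f"NAM: {nam_terabytes:,.0f} TB = {nam_petabytes:,.0f} PB = {nam_exabytes} EB")
```

With binary prefixes (tebibytes and so on) the results would come out slightly smaller, so the quoted values are best read as round decimal figures.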
Saptarshi Biswas of the Department of Computer Science and Engineering at the Meghnad Saha Institute of Technology in Kolkata, India, and colleagues Subhrapratim Nath, Jamuna Kanta Sing, and Subir Kumar Sarkar of Jadavpur University have now developed a new encoding approach that leads them to talk of extended NAM. Their method efficiently maps binary data onto a hybrid system that uses non-standard genetic nucleotides in addition to the familiar G, A, T, and C (guanine, adenine, thymine, and cytosine) of DNA to achieve a higher data capacity. The natural pairing of the G, A, T, and C bases in DNA is what gives us the double helix and allows information to be encoded for the production of proteins, whether in a fungus, a bacterium, a rose, or a human being.
The team has added two new pairs of non-standard nucleotides, giving the additional pairings Ds-Px (a thienylimidazopyridine and a nitropropynylpyrrole) and Im-Na (an imidazopyrimidine and a naphthyridine). These are very stable units that complement the A-T and C-G pairings of a natural nucleic acid. They are also highly selective in such a molecule, specifically DNA. This could potentially take the hypothetical storage capacity of that single cubic centimetre of NAM well beyond the 1.25 exabyte value mentioned above. Indeed, the team writes that extended NAM would have a capacity of more than 630 exabytes per gram of DNA which, assuming DNA has a density of 1.7 grams per cubic centimetre, is more than 1,000 exabytes per cubic centimetre of extended NAM (630 × 1.7 ≈ 1,071). That is more than 850 times the total information held by the big four of the internet today.
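To see why a larger nucleotide alphabet raises capacity, consider a minimal sketch of a fixed-width binary-to-nucleotide mapping. This is not the team's encoding scheme, which the article does not describe. It assumes that each of the eight letters (A, T, C, G, Ds, Px, Im, Na) can be used as an independent symbol, and the symbol order and the function names `encode` and `decode` are purely illustrative.

```python
# Hypothetical alphabets: the four natural bases, and an assumed
# eight-letter set that adds the Ds, Px, Im, and Na nucleotides.
STANDARD = ["A", "T", "C", "G"]
EXTENDED = STANDARD + ["Ds", "Px", "Im", "Na"]


def bits_per_symbol(alphabet):
    """Bits carried by one symbol; the alphabet size must be a power of two."""
    size = len(alphabet)
    assert size > 1 and size & (size - 1) == 0, "alphabet size must be a power of two"
    return size.bit_length() - 1


def encode(data, alphabet):
    """Map bytes to a list of nucleotide symbols, zero-padding the final group."""
    width = bits_per_symbol(alphabet)
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % width)
    return [alphabet[int(bits[i:i + width], 2)] for i in range(0, len(bits), width)]


def decode(symbols, alphabet, n_bytes):
    """Invert encode(); n_bytes says how many bytes to recover (drops padding)."""
    width = bits_per_symbol(alphabet)
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    bits = "".join(f"{index[symbol]:0{width}b}" for symbol in symbols)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


message = b"NAM"
standard_strand = encode(message, STANDARD)
extended_strand = encode(message, EXTENDED)
print(len(standard_strand), "symbols with four letters")
print(len(extended_strand), "symbols with eight letters")
```

Under these assumptions each natural base carries two bits and each symbol of the eight-letter alphabet carries three, so the same three bytes need 12 positions in the first case and 8 in the second. A practical scheme would also have to deal with synthesis and sequencing constraints and with error correction, none of which is modelled here.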
Biswas, S., Nath, S., Sing, J.K. and Sarkar, S.K. (2020) ‘Extended nucleic acid memory as the future of data storage technology’, Int. J. Nano and Biomaterials, Vol. 9, Nos. 1/2, pp.2–17.