As technology advances and we incorporate digital activities into our daily routines more frequently, we require an ever-increasing amount of storage space to host the data we collect. Storage is evolving at a reasonable pace, but conventional technologies have their limits, and researchers are constantly looking for more powerful alternatives. One such alternative is storing enormous amounts of data on DNA. Now researchers have pioneered a new technique to store data on, and read data back from, DNA molecules.
Last year, Harvard scientists managed to stuff 5.5 petabits (around 700 terabytes) of data onto a single gram of DNA. As we previously explained, the method used to store the data is conceptually similar to how data is stored on a standard storage device: strands of DNA that each held 96 bits of binary data were synthesized, and the data could then be read back using a standard DNA sequencing process.
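To make the analogy with a standard storage device concrete, here is a minimal Python sketch of one naive bits-to-bases mapping (two bits per base). The actual Harvard encoding differed in its details, so the mapping and function names below are illustrative assumptions, not the published scheme.

# Illustrative only: a naive 2-bits-per-base mapping, NOT the exact
# encoding used in the Harvard experiment.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(bits: str) -> str:
    """Map a bit string (of even length) to a DNA strand."""
    assert len(bits) % 2 == 0, "pad to an even number of bits first"
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> str:
    """Recover the bit string from a sequenced strand."""
    return "".join(BASE_TO_BITS[base] for base in strand)

# A 96-bit payload fits on a 48-base strand under this mapping.
payload = "0110" * 24            # 96 bits
strand = encode(payload)         # 48 bases
assert decode(strand) == payload

Note how a run of identical bit pairs ("0000…") would produce a run of identical bases under this naive mapping, which is exactly the error-prone pattern described below.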
However, there are a couple of hurdles in the way of advancing writing to and reading from DNA. First, writing and reading errors are common, and they tend to occur where the same letter repeats several times in a row on a strand. The other prominent issue is that, currently, scientists can only create short strands of DNA, limiting the overall space with which to work.
The new method for storing data on and reading it from DNA, created by the Bioinformatics Institute (BI), consists of breaking the data up into many small fragments, 117 letters each, that overlap one another and run in both directions. Along with that specific arrangement, the coded data requires indexing information to dictate where each fragment fits into the overall data. The new technique also required a new coding method that reduces the possibility of repeated letters.
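The article doesn’t spell out the coding method, but one well-known way to guarantee that no letter repeats is a rotating code in which each symbol selects one of the three bases that differ from the base before it. The Python sketch below is a hedged illustration of that idea; the published scheme also converts bytes to base-3 digits (trits) via a Huffman-style code first, which is omitted here.

BASES = "ACGT"

def trits_to_dna(trits, prev="A"):
    """Encode a sequence of trits (0/1/2) so that no base ever repeats."""
    out = []
    for t in trits:
        # The three candidates are the alphabet minus the previous base.
        candidates = [b for b in BASES if b != prev]
        prev = candidates[t]
        out.append(prev)
    return "".join(out)

def dna_to_trits(strand, prev="A"):
    """Invert the rotating code."""
    trits = []
    for base in strand:
        candidates = [b for b in BASES if b != prev]
        trits.append(candidates.index(base))
        prev = base
    return trits

trits = [0, 2, 1, 1, 0, 2]
strand = trits_to_dna(trits)
assert dna_to_trits(strand) == trits
# No repeated letters, by construction:
assert all(a != b for a, b in zip(strand, strand[1:]))

Because every symbol is drawn from the three bases that differ from its predecessor, repeated-letter runs simply cannot appear in the output, regardless of the input data.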
In order to test the new technique, California-based Agilent Technologies offered to store data on the strands of DNA. BI sent the Agilent team various files encoded using the error-reducing method described above: a .txt file of all of Shakespeare’s sonnets, a 26-second clip of Martin Luther King Jr.’s “I Have a Dream” speech, a .jpeg of the Bioinformatics Institute, a .pdf of Watson and Crick’s paper detailing the structure of DNA, and a file explaining the encoding process itself.
Agilent downloaded those files from the internet and wrote the information onto hundreds of thousands of strands of DNA, producing something the size of a rather small speck of dust. Agilent then sent the encoded, dust-like sample back to BI, where researchers managed to sequence it and reconstruct the files without error.
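Reconstructing the files without error depends on the indexing and overlap described above. Below is a hedged Python sketch of that idea: cut a payload into indexed, overlapping fragments, shuffle them (neither synthesis nor sequencing preserves order), then sort by index and stitch the pieces back together. The fragment and step sizes are illustrative assumptions, not the values used in the experiment.

import random

STEP, LENGTH = 25, 100   # each fragment starts 25 symbols after the last

def fragment(data: str):
    """Cut data into overlapping fragments, each tagged with its index."""
    return [(i, data[pos:pos + LENGTH])
            for i, pos in enumerate(range(0, len(data), STEP))]

def reconstruct(fragments):
    """Sort fragments by index and stitch them back into the payload."""
    chunks = sorted(fragments)            # order by index
    out = chunks[0][1]
    for i, chunk in chunks[1:]:
        out = out[:i * STEP] + chunk      # later fragments overwrite the overlap
    return out

data = "".join(random.choice("ACGT") for _ in range(500))
frags = fragment(data)
random.shuffle(frags)                     # arrival order is arbitrary
assert reconstruct(frags) == data

In the real experiment the overlap means every stretch of data is carried by several fragments, so a read error in one fragment can be caught by comparing it against its neighbours; this sketch only shows the indexing and stitching.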
BI researcher Nick Goldman notes that the coding technique results in a storage medium that can last for ten thousand years or more, and that can be read by anyone with access to a machine that can sequence DNA and what is essentially the cipher needed to reverse the encoding.
Obviously, DNA USB drives aren’t right around the corner, as various practical issues have to be overcome first, such as, you know, not needing two different research labs and all of the appropriate equipment just to encode and reconstruct the data. However, considering DNA will most likely never become outdated, and it has already been shown to store massive amounts of data, we can only hope significant advances in the field come quickly enough for us to see a DNA drive in our lifetime.
Research paper: Towards practical, high-capacity, low-maintenance information storage in synthesized DNA
source: www.extremetech.com/computing/146600-new-technique-stores-terabytes-data-on-dna-with-100-accuracy