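【Speaker and bio】 Dina Zielinski
Bioinformatician Dina Zielinski brings biological data to life, from decoding cancer mutations to encoding data in DNA.
【Talk title】 How we can store digital data in DNA
【Transcript】
Chinese translation by Xuying Wu, reviewed by SHUYING XU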
00:12
I could fit all movies ever made inside of this tube. If you can't see it, that's kind of the point.
00:20
(Laughter)
00:21
Before we understand how this is possible, it's important to understand the value of this feat. All of our thoughts and actions these days, through photos and videos -- even our fitness activities -- are stored as digital data. Aside from running out of space on our phones, we rarely think about our digital footprint. But humanity has collectively generated more data in the last few years than all of preceding human history.
00:51
Big data has become a big problem. Digital storage is really expensive, and none of these devices that we have really stand the test of time. There's this nonprofit website called the Internet Archive. In addition to free books and movies, you can access web pages as far back as 1996. Now, this is very tempting, but I decided to go back and look at the TED website's very humble beginnings. As you can see, it's changed quite a bit in the last 30 years. So this led me to the first-ever TED, back in 1984, and it just so happened to be a Sony executive explaining how a compact disk works.
01:37
(Laughter)
01:38
Now, it's really incredible to be able to go back in time and access this moment. It's also really fascinating that after 30 years, after that first TED, we're still talking about digital storage.
01:54
Now, if we look back another 30 years, IBM released the first-ever hard drive back in 1956. Here it is being loaded for shipping in front of a small audience. It held the equivalent of one MP3 song and weighed over one ton. At 10,000 dollars a megabyte, I don't think anyone in this room would be interested in buying this thing, except maybe as a collector's item. But it's the best we could do at the time.
02:26
We've come such a long way in data storage. Devices have evolved dramatically. But all media eventually wear out or become obsolete. If someone handed you a floppy drive today to back up your presentation, you'd probably look at them kind of strange, maybe laugh, but you'd have no way to use the damn thing. These devices can no longer meet our storage needs, although some of them can be repurposed. All technology eventually dies or is lost, along with our data, all of our memories. There's this illusion that the storage problem has been solved, but really, we all just externalize it. We don't worry about storing our emails and our photos. They're just in the cloud.
03:15
But behind the scenes, storage is problematic. After all, the cloud is just a lot of hard drives. Now, most digital data, we could argue, is not really critical. Surely, we could just delete it. But how can we really know what's important today? We've learned so much about human history from drawings and writings in caves, from stone tablets. We've deciphered languages from the Rosetta Stone. You know, we'll never really have the whole story, though. Our data is our story, even more so today. We won't have our record recorded on stone tablets. But we don't have to choose what is important now. There's a way to store it all. It turns out that there's a solution that's been around for a few billion years, and it's actually in this tube.
04:12
DNA is nature's oldest storage device. After all, it contains all the information necessary to build and maintain a human being. But what makes DNA so great? Well, let's take our own genome as an example. If we were to print out all three billion A's, T's, C's and G's on a standard font, standard format, and then we were to stack all of those papers, it would be about 130 meters high, somewhere between the Statue of Liberty and the Washington Monument. Now, if we converted all those A's, T's, C's and G's to digital data, to zeroes and ones, it would total a few gigs. And that's in each cell of our body. We have more than 30 trillion cells. You get the idea: DNA can store a ton of information in a minuscule space.
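As a rough check on the "few gigs" figure, the arithmetic can be sketched as follows. The talk does not say how each letter is represented, so the two bit widths below are assumptions: 2 bits per base (the minimum for a four-letter alphabet) and 8 bits per base (one plain-text character per letter). The function name `genome_bytes` is illustrative only.

```python
# Back-of-the-envelope size of one copy of the human genome as digital data.
BASES = 3_000_000_000  # "three billion A's, T's, C's and G's", as quoted in the talk


def genome_bytes(bases: int, bits_per_base: int) -> float:
    """Bytes needed to store `bases` nucleotides at the given bit width."""
    return bases * bits_per_base / 8


packed = genome_bytes(BASES, 2)   # assumption: 4 letters packed into 2 bits each
as_text = genome_bytes(BASES, 8)  # assumption: one ASCII character per letter

print(f"2 bits per base: {packed / 1e9:.2f} GB")   # 2 bits per base: 0.75 GB
print(f"8 bits per base: {as_text / 1e9:.2f} GB")  # 8 bits per base: 3.00 GB
```

Either way the result is on the order of a gigabyte to a few gigabytes per cell, which is the scale the speaker is pointing at.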
05:07
DNA is also very durable, and it doesn't even require electricity to store it. We know this because scientists have recovered DNA from ancient humans that lived hundreds of thousands of years ago. One of those is Ötzi the Iceman. Turns out, he's Austrian.
05:24
(Laughter)
05:25
He was found high, well-preserved, in the mountains between Italy and Austria, and it turns out that he has living genetic relatives here in Austria today. So one of you could be a cousin of Ötzi.
05:36
(Laughter)
05:38
The point is that we have a better chance of recovering information from an ancient human than we do from an old phone. It's also much less likely that we'll lose the ability to read DNA than any single man-made device. Every single new storage format requires a new way to read it. We'll always be able to read DNA. If we can no longer sequence, we have bigger problems than worrying about data storage.
06:05
Storing data on DNA is not new. Nature's been doing it for several billion years. In fact, every living thing is a DNA storage device. But how do we store data on DNA? This is Photo 51. It's the first-ever photo of DNA, taken about 60 years ago. This is around the time that that same hard drive was released by IBM. So really, our understanding of digital storage and of DNA have coevolved. We first learned to sequence, or read DNA, and very soon after, how to write it, or synthesize it. This is much like how we learn a new language. And now we have the ability to read, write and copy DNA. We do it in the lab all the time. So anything, really anything, that can be stored as zeroes and ones can be stored in DNA.
07:02
To store something digitally, like this photo, we convert it to bits, or binary digits. Each pixel in a black-and-white photo is simply a zero or a one. And we can write DNA much like an inkjet printer can print letters on a page. We just have to convert our data, all of those zeroes and ones, to A's, T's, C's and G's, and then we send this to a synthesis company. So we write it, we can store it, and when we want to recover our data, we just sequence it.
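The conversion from zeroes and ones to A's, T's, C's and G's can be illustrated with a minimal sketch. The talk does not give the actual mapping, so the table below (two bits per letter, 00 to A, 01 to C, 10 to G, 11 to T) is an assumption, as are the function names `encode` and `decode`. A practical scheme would likely also need to avoid sequences that are hard to synthesize or sequence, which this sketch ignores.

```python
# Hypothetical 2-bits-per-base mapping; the talk does not specify one.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of nucleotide letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Reverse of encode: read the letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"TED")
print(strand)          # CCCACACCCACA
print(decode(strand))  # b'TED'
```

In this picture, "writing" corresponds to sending a string like `strand` to be synthesized, and "reading" corresponds to sequencing it and running the reverse mapping.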
07:32
Now, the fun part of all of this is deciding what files to include. We're serious scientists, so we had to include a manuscript for good posterity. We also included a $50 Amazon gift card -- don't get too excited, it's already been spent, someone decoded it -- as well as an operating system, one of the first movies ever made and a Pioneer plaque. Some of you might have seen this. It has a depiction of a typical -- apparently -- male and female, and our approximate location in the Solar System, in case the Pioneer spacecraft ever encounters extraterrestrials.
08:06
So once we decided what sort of files we want to encode, we package up the data, convert those zeroes and ones to A's, T's, C's and G's, and then we just send this file off to a synthesis company. And this is what we got back. Our files were in this tube. All we had to do was sequence it. This all sounds pretty straightforward, but the difference between a really cool, fun idea and something we can actually use is overcoming these practical challenges.
08:35
Now, while DNA is more robust than any man-made device, it's not perfect. It does have some weaknesses. We recover our message by sequencing the DNA, and every time data is retrieved, we lose the DNA. That's just part of the sequencing process. We don't want to run out of data, but luckily, there's a way to copy the DNA that's even cheaper and easier than synthesizing it. We actually tested a way to make 200 trillion copies of our files, and we recovered all the data without error. So sequencing also introduces errors into our DNA, into the A's, T's, C's and G's. Nature has a way to deal with this in our cells. But our data is stored in synthetic DNA in a tube, so we had to find our own way to overcome this problem. We decided to use an algorithm that was used to stream videos. When you're streaming a video, you're essentially trying to recover the original video, the original file. When we're trying to recover our original files, we're simply sequencing. But really, both of these processes are about recovering enough zeroes and ones to put our data back together. And so, because of our coding strategy, we were able to package up all of our data in a way that allowed us to make millions and trillions of copies and still always recover all of our files back.
10:04
This is the movie we encoded. It's one of the first movies ever made, and now the first to be copied more than 200 trillion times on DNA.
10:14
Soon after our work was published, we participated in an "Ask Me Anything" on the website reddit. If you're a fellow nerd, you're very familiar with this website. Most questions were thoughtful. Some were comical. For example, one user wanted to know when we would have a literal thumb drive. Now, the thing is, our DNA already stores everything needed to make us who we are. It's a lot safer to store data on DNA in synthetic DNA in a tube.
10:46
Writing and reading data from DNA is obviously a lot more time-consuming than just saving all your files on a hard drive -- for now. So initially, we should focus on long-term storage. Most data are ephemeral. It's really hard to grasp what's important today, or what will be important for future generations. But the point is, we don't have to decide today. There's this great program by UNESCO called the "Memory of the World" program. It's been created to preserve historical materials that are considered of value to all of humanity. Items are nominated to be added to the collection, including that film that we encoded. While a wonderful way to preserve human heritage, it doesn't have to be a choice. Instead of asking the current generation -- us -- what might be important in the future, we could store everything in DNA.
11:47
Storage is not just about how many bytes but how well we can actually store the data and recover it. There's always been this tension between how much data we can generate and how much we can recover and how much we can store. Every advance in writing data has required a new way to read it. We can no longer read old media. How many of you even have a disk drive in your laptop, never mind a floppy drive? This will never be the case with DNA. As long as we're around, DNA is around, and we'll find a way to sequence it.
12:23
Archiving the world around us is part of human nature. This is the progress we've made in digital storage in 60 years, at a time when we were only beginning to understand DNA. Yet, we've made similar progress in half that time with DNA sequencers, and as long as we're around, DNA will never be obsolete.
12:46
Thank you.
12:47
(Applause)