The Expanding (Digital) Universe: Visualizing How BIG a Zettabyte Really Is

A lot of recent news articles (Google News currently shows 1,060) cite the annual EMC-IDC Digital Universe studies of the massive growth of the digital universe through 2020. If you have not read the study, it indicates that the digital universe is now doubling every two years and will grow 55-fold (earlier editions said 44-fold, then 50-fold), from 0.8 Zettabytes (ZB) of data in 2009 to 44 Zettabytes in 2020 (earlier editions said 35, then 40). (Every year IDC has revised the growth curve upward by several Zettabytes.)

Usually these articles show a diagram such as this:


This type of diagram is great at showing how much 44-fold growth is. However, it does not convey how big a Zettabyte really is, or how much data we will be swimming (or drowning) in by 2020.

A Zettabyte (ZB) is really, really big in terms of today’s information systems. It is not a capacity that people encounter every day. It’s not even in Microsoft Office’s spell-checker: Word “recommended” that I meant to type “Petabyte” instead 😉

The Raw Definition: How big is a Zettabyte?

A computer scientist will tell you that 1 Zettabyte is 2^70 bytes. That does not sound very big to a person who does not usually think in exponential or scientific notation, especially given that a one-Terabyte (1 TB) solid state drive has the capacity to store 2^40 bytes.

Wikipedia describes a ZB (in decimal math) as one sextillion bytes. While this sounds large, it is hard to visualize. It is easier to visualize 1 ZB (and 44 ZB) in relation to things we use every day.
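To make the two definitions concrete, here is a minimal Python sketch comparing the binary (2^70) and decimal (one sextillion) readings of a Zettabyte, and counting how many one-Terabyte drives fit in one. The variable names are illustrative only.

```python
# Binary vs. decimal definitions of a Zettabyte (names are illustrative).
ZB_BINARY = 2**70      # the computer-science definition used in this article
ZB_DECIMAL = 10**21    # one sextillion bytes, the decimal definition
TB_BINARY = 2**40      # capacity of a one-Terabyte drive in binary units

# How many 1 TB drives does it take to hold 1 ZB (binary units)?
drives_per_zb = ZB_BINARY // TB_BINARY

print(f"1 ZB (binary)  = {ZB_BINARY:,} bytes")
print(f"1 ZB (decimal) = {ZB_DECIMAL:,} bytes")
print(f"1 TB drives per ZB = {drives_per_zb:,}")
print(f"binary / decimal = {ZB_BINARY / ZB_DECIMAL:.3f}")
```

The exponents 70 and 40 look close, but the difference of 30 is a factor of 2^30, or a little over a billion drives.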

Visualizing Zettabytes in Units of Smartphones

The most popular new smartphones today have 32 Gigabytes (GB), or 32 x 2^30 bytes, of capacity. To get 1 ZB you would have to fill 34,359,738,368 (34.4 billion) smartphones to capacity. If you put 34.4 billion Samsung S5s end-to-end (length-wise) you would circle the Earth 121.8 times:

Image: the dot represents Earth to scale against the line of smartphones

You can actually circumnavigate Jupiter almost 11 times—but that is not obvious to visualize.
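The smartphone arithmetic above can be reproduced with a short Python sketch. The phone length (142 mm for a Samsung Galaxy S5) and the Earth's equatorial circumference (about 40,075 km) are assumptions supplied here, not figures stated in the article, so the lap count only approximately matches the 121.8 quoted above.

```python
# Sketch of the 1 ZB smartphone arithmetic; physical constants are assumptions.
ZB = 2**70                           # bytes in 1 ZB (binary)
PHONE_BYTES = 32 * 2**30             # a 32 GB smartphone
PHONE_LENGTH_M = 0.142               # assumed: Galaxy S5 length, 142 mm
EARTH_CIRCUMFERENCE_M = 40_075_000   # assumed: equatorial circumference

phones = ZB // PHONE_BYTES                       # phones needed to hold 1 ZB
line_length_m = phones * PHONE_LENGTH_M          # length of the end-to-end line
earth_laps = line_length_m / EARTH_CIRCUMFERENCE_M

print(f"phones needed: {phones:,}")
print(f"line length: {line_length_m / 1000:,.0f} km")
print(f"laps around Earth: {earth_laps:.1f}")
```

The phone count is exact because it is a ratio of two powers of two (2^70 / 2^35 = 2^35); only the lap count depends on the assumed physical dimensions.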

The number of bytes in 44 Zettabytes is too large for Microsoft Excel to compute correctly. (The number you will get is so large that Excel will cut off seven digits of accuracy; read that as a potential rounding error of up to one million bytes.) Assuming that Moore’s Law will allow us to double the capacity of smartphones three times between now and 2020, it would take 188,978,561,024 (nearly 189 billion) smartphones to store 44 ZB of data. Placing these end-to-end would circumnavigate the world nearly 670 times.
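As a rough check of the 2020 figures, the sketch below repeats the calculation with Python integers, which are arbitrary-precision and so do not lose digits the way a spreadsheet cell can. The phone length and Earth circumference are assumptions (142 mm and about 40,075 km), and three doublings of a 32 GB phone are taken to mean 256 GB.

```python
# Sketch of the 44 ZB calculation; physical constants are assumptions.
ZB = 2**70
PHONE_2020_BYTES = 32 * 2**30 * 2**3   # 32 GB doubled three times = 256 GB
PHONE_LENGTH_M = 0.142                 # assumed: 142 mm per phone
EARTH_CIRCUMFERENCE_M = 40_075_000     # assumed: equatorial circumference

total_bytes = 44 * ZB                  # exact: Python ints do not round
phones_2020 = total_bytes // PHONE_2020_BYTES
earth_laps = phones_2020 * PHONE_LENGTH_M / EARTH_CIRCUMFERENCE_M

print(f"bytes in 44 ZB: {total_bytes:,}")
print(f"phones needed:  {phones_2020:,}")
print(f"laps around Earth: {earth_laps:.1f}")
```

With these assumptions the phone count comes out at about 189 billion and the line of phones wraps the Earth a little under 670 times.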

This is too hard to visualize, so let’s look at it another way. You could tile the entire City of New York two times over (and the Bronx and Manhattan three times over) with smartphones filled to capacity with data to store 44 ZB. That’s a big Data Center!

Number of smartphones (with 2020 tech) you would need to store 44 ZB

This number also represents 25 smartphones per person for the entire population of the planet. Imagine the challenge of managing data spread out across that many smartphones.
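The per-person figure follows from dividing the phone count by the world population. The sketch below assumes a round 2020 population of 7.6 billion, which is not a number given in the article.

```python
# Sketch of the phones-per-person estimate; the population is an assumption.
phones_2020 = (44 * 2**70) // (256 * 2**30)   # 256 GB phones needed for 44 ZB
WORLD_POPULATION_2020 = 7_600_000_000         # assumed round figure

phones_per_person = phones_2020 / WORLD_POPULATION_2020
print(f"phones per person: {phones_per_person:.1f}")
```

Any population estimate between roughly 7.3 and 7.9 billion gives a result that rounds to about 25.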


2020 Challenge: Completely re-invent how we process data (or grow our brains 30x!)

On Friday, Russell Garland of the WSJ wrote about the “Data Tsunami” that is coming due to increased volumes of data being generated by everything from the Facebook Social Graph to the next Interest Graph and genomics (just to name the most obvious growth drivers). “Tsunami” is probably too small a word (unless you are talking about Jupiter-scale growth). Take a look at these interesting numbers:

  • The average human brain can take in and remember about one byte per second (two gigabytes over an average life time, including sleep)[1]
  • The amount of data storage in the world in 2000 was roughly 300,000 terabytes, about 0.03 “brains’ worth” of storage for every person on Earth[1,2]
  • This amount grew to approximately 1,200,000,000 terabytes (1.2 Zettabytes) by 2010, about 90 “brains’ worth” of storage for every person on Earth.[2,3] No wonder we feel so overloaded with data!
  • By 2020, this will get even more outlandish. We will have 36,000,000,000 terabytes of data—about 2,400 “brains’ worth” of storage for every person on Earth.[2,3]
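The “brains’ worth” figures in the list above can be approximated with the following Python sketch. The world population for each year is an assumption made here (6.1, 6.9 and 7.6 billion), decimal units are used (1 TB = 1,000 GB), and the 2010 storage figure is taken as 1.2 billion terabytes (1.2 ZB), which is the value consistent with about 90 brains’ worth per person.

```python
# Sketch of the "brains' worth" estimate; populations are assumptions.
BRAIN_GB = 2  # the article's figure: about two gigabytes per lifetime

# (year, world storage in terabytes, assumed world population)
scenarios = [
    (2000, 300_000, 6.1e9),
    (2010, 1_200_000_000, 6.9e9),
    (2020, 36_000_000_000, 7.6e9),
]

def brains_per_person(storage_tb, population, brain_gb=BRAIN_GB):
    """Storage per person, expressed in multiples of one brain's capacity."""
    storage_gb = storage_tb * 1_000          # decimal terabytes to gigabytes
    return storage_gb / population / brain_gb

results = {year: brains_per_person(tb, pop) for year, tb, pop in scenarios}
for year, brains in results.items():
    print(f"{year}: about {brains:,.2f} brains' worth per person")
```

The results land close to the rounded figures quoted in the list; the exact values shift with the population estimate chosen.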

Managing storage of this volume of data will be an interesting challenge for companies like EMC, IBM and Oracle (one aided greatly by Moore’s Law). However, being able to understand it will require complete reinvention of how we process, explore and analyze data.

These new technologies will be as advanced when compared to today’s data warehousing and reporting technologies as the spreadsheet was when compared to manual ledgers. They will use non-linear rule engines and artificial intelligence to find trends and determine which data are most important. They will use new data visualization techniques, leveraging everything from 3D to augmented reality (AR) technology to enable human-scale brains to explore results and conduct analyses. This, in turn, will drive new physical interfaces from the desktop to mobile to even wearables.

It should be a very interesting ride!

