The inescapable truth about big data, the thing you must plan for, is that it just keeps getting bigger. As transactions, electronic records, and images flow in by the millions, terabytes grow into petabytes, which swell into exabytes. Next come zettabytes and, beyond those, yottabytes.
A yottabyte is a billion petabytes. Most calculators can’t even display a number of that size, yet the federal government’s most ambitious research efforts are already moving in that direction. In April, the White House announced a new scientific program, called the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, to “map” the human brain. Francis Collins, the director of the National Institutes of Health, said the project, which was launched with $100 million in initial funding, could eventually entail yottabytes of data.
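For readers who want to check the scale for themselves, here is a minimal sketch of the unit arithmetic. It assumes the decimal (power-of-ten) definitions of the storage prefixes, consistent with the article's description of a terabyte as a trillion bytes; the `UNITS` table and the `ratio` helper are illustrative names, not part of any product mentioned here.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: powers of ten, not the binary (power-of-two) variants.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def ratio(larger: str, smaller: str) -> int:
    """Return how many `smaller` units fit in one `larger` unit."""
    return UNITS[larger] // UNITS[smaller]


if __name__ == "__main__":
    # One yottabyte measured in petabytes.
    print(f"{ratio('yottabyte', 'petabyte'):,}")  # prints 1,000,000,000
```

Python integers have arbitrary precision, so the 25-digit byte count of a yottabyte is handled exactly, with none of the display limits of a pocket calculator.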
And earlier this year, the US Department of Defense solicited bids for up to 4 exabytes of storage, to be used for image files generated by satellites and drones. That’s right—4 exabytes! The contract award has been put on hold temporarily as the Pentagon weighs its options, but the request for proposals is a sign of where things are heading.
Businesses also are racing to capitalize on the vast amounts of data they’re generating from internal operations, customer interactions, and many other sources, data that, when analyzed, provides actionable insights. An important first step in scoping out these big data projects is to calculate how much data you’ve got, then multiply by a thousand.
If you think I’m exaggerating, I’m not. It’s easy to underestimate just how much data is really pouring into your company. Businesses are collecting more data, new types of data, and bulkier data, and it’s coming from new and unforeseen sources. Before you know it, your company’s all-encompassing data store isn’t just two or three times what it had been; it’s a hundred times more, then a thousand.
Not that long ago, the benchmark for databases was a terabyte, or a trillion bytes. Say you had a 1 terabyte database and it doubled in size every year—a robust growth rate, but not unheard of these days. That system would exceed a petabyte (a thousand terabytes) in 10 years.
And many businesses are accumulating data even faster. For example, data is doubling every six months at Novation, a healthcare supply contracting company, according to Alex Latham, the company’s vice president of e-business and systems development. Novation has deployed Oracle Exadata Database Machine and Oracle’s Sun ZFS Storage appliance products to scale linearly (in other words, without any slowdown in performance) as data volumes keep growing. Latham explains the business strategy behind Novation’s tech investment in a short video interview.
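The doubling arithmetic in the two examples above can be checked with a short sketch. It assumes decimal units (1 terabyte = 10^12 bytes) and clean, uninterrupted doubling; the function names are hypothetical and exist only for illustration.

```python
TB = 10**12  # one terabyte in bytes (decimal definition, an assumption)
PB = 10**15  # one petabyte in bytes


def doublings_to_exceed(start_bytes: int, target_bytes: int) -> int:
    """Count the doublings needed for start_bytes to strictly exceed target_bytes."""
    size = start_bytes
    doublings = 0
    while size <= target_bytes:
        size *= 2
        doublings += 1
    return doublings


def years_to_exceed(start_bytes: int, target_bytes: int, years_per_doubling: float) -> float:
    """Convert the doubling count into elapsed years for a given doubling period."""
    return doublings_to_exceed(start_bytes, target_bytes) * years_per_doubling


if __name__ == "__main__":
    # Doubling once a year: a 1 TB database passes 1 PB after 10 doublings.
    print(years_to_exceed(TB, PB, 1.0))
    # Doubling every six months, the rate cited for Novation.
    print(years_to_exceed(TB, PB, 0.5))
```

Ten doublings turn 1 terabyte into 1,024 terabytes, which is why the yearly case crosses the petabyte mark in year 10. At a six-month doubling rate, the same ten doublings take only five years.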
Terabytes are still the norm in most places, but a growing number of data-intensive businesses and government agencies are pushing into the petabyte realm. In the latest survey of the Independent Oracle Users Group, 5 percent of respondents said their organizations were managing 1 to 10 petabytes of data, and 6 percent had more than 10 petabytes. The full results appear in the survey report, titled “Big Data, Big Challenges, Big Opportunities.”
These burgeoning databases are forcing CIOs to rethink their IT infrastructures. Turkcell, the leading mobile communications and technology company in Turkey, has also turned to Oracle Exadata Database Machine, which combines advanced compression, flash memory, and other performance-boosting features, to condense 1.2 petabytes of data into 100 terabytes, a 12-to-1 reduction, for speedier analysis and reporting.
Envisioning a Yottabyte
Some of these big data projects involve public-private partnerships, making best practices of utmost importance as petabytes of information are stored and shared. On the new federal brain-mapping initiative, the National Institutes of Health is collaborating with other government agencies, businesses, foundations, and neuroscience researchers, including the Allen Institute, the Howard Hughes Medical Institute, the Kavli Foundation, and the Salk Institute for Biological Studies.
Space exploration and national intelligence are other government missions soon to generate yottabytes of data. The National Security Agency’s new 1-million-square-foot data center in Utah will reportedly be capable of storing a yottabyte.
That brings up a fascinating question: Just how much storage media and real-world physical space are necessary to house so much data that a trillion bytes are considered teensy-weensy? By one estimate, a zettabyte (that’s 10 to the twenty-first power) of data is the equivalent of all of the grains of sand on all of Earth’s beaches.
Of course, IT pros in business and government manage data centers, not beachfront, so the real question is how can they possibly cram so much raw information into their data centers, and do so when budget pressures are forcing them to find ways to consolidate, not expand, those facilities?
The answer is to optimize big data systems to do more with less—actually much, much more with far less. I mentioned earlier that mobile communications company Turkcell is churning out analysis and reports nearly 10 times faster than before. What I didn’t say was that, in the process, the company also shrank its floor space requirements by 90 percent and energy consumption by 80 percent through its investment in Oracle Exadata Database Machine, which is tuned for these workloads.
Businesses will find that there are a growing number of IT platforms designed for petabyte and even exabyte workloads. A case in point is Oracle’s StorageTek SL8500 modular library system, the world’s first exabyte storage system. And if one isn’t enough, 32 of those systems can be connected to create 33.8 exabytes of storage managed through a single interface.
So, as your organization generates, collects, and manages terabytes upon terabytes of data, and pursues an analytics strategy to take advantage of all of that pent-up business value, don’t underestimate how quickly it adds up. Think about all of the grains of sand on all of Earth’s beaches, and remember: The goal is to build sand castles, not get buried by the sand.