This quote refers to Amazon’s S3 (Simple Storage Service), which “holds more than 566 billion data objects and processes more than 370,000 requests per second for them at peak times”, according to the Storage Newsletter article. The 566 billion figure is up 116% from the fourth quarter of last year! Of course, all of this data does not belong to one customer. Amazon S3 stores objects for thousands of customers around the world, all using it to harness the power of the cloud and big data.
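To put those numbers in perspective, here is a quick back-of-the-envelope sketch. It uses only the figures quoted above; the variable names are mine, and the per-day number assumes the peak rate held for a full 24 hours, which it almost certainly does not, so treat it as an upper bound.

```python
# Figures quoted from the Storage Newsletter article
objects_now = 566_000_000_000      # objects stored in S3
growth_pct = 116                   # percent increase since Q4 of the previous year
peak_requests_per_sec = 370_000    # requests per second at peak times

# "Up 116%" means: now = baseline * (1 + 1.16)
baseline = objects_now / (1 + growth_pct / 100)

# Upper bound only: assumes the peak rate were sustained for a whole day
requests_per_day_at_peak = peak_requests_per_sec * 24 * 60 * 60

print(f"Implied Q4 baseline: about {baseline / 1e9:.0f} billion objects")
print(f"Requests per day at sustained peak: {requests_per_day_at_peak / 1e9:.1f} billion")
```

In other words, the object count has more than doubled since the fourth quarter of last year.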
I love the quote by Storage Newsletter that clearly shows the demand for capacity in cloud storage, but don’t automatically assume the bigger the drive, the better for cloud storage and/or big data implementations. From a storage perspective, the complexity of Amazon S3 and cloud computing in general is no different than that of an on-premise data center. Data centers, whether they “live” on-premise or in the cloud, are physical locations running thousands of servers, petabytes of storage, hundreds of applications, and billions of data objects. Each use case will differ from the next not only in terms of capacity, but also application and, most importantly, workload. What are customers demanding of their data and thus their servers and storage? How quickly do they need the data? How often? And for what period of time? Answering such questions ultimately dictates the data requirements and thus the storage.
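As a rough illustration of how those three questions can drive a storage decision, here is a minimal sketch that turns a workload profile into a hot, warm, or cold label. The `Workload` class, the `suggest_tier` function, and every threshold in it are hypothetical choices made up for this example, not an industry rule.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    max_latency_ms: float   # how quickly is the data needed?
    reads_per_day: float    # how often is it accessed?
    retention_days: int     # for what period of time is it kept?


def suggest_tier(w: Workload) -> str:
    """Return 'hot', 'warm', or 'cold'. All thresholds are illustrative assumptions."""
    # Latency-sensitive and frequently read: treat as hot data
    if w.max_latency_ms <= 5 and w.reads_per_day >= 1000:
        return "hot"
    # Rarely read but kept for a year or more: treat as cold data
    if w.reads_per_day < 1 and w.retention_days >= 365:
        return "cold"
    # Everything in between
    return "warm"


print(suggest_tier(Workload(max_latency_ms=2, reads_per_day=50_000, retention_days=30)))
print(suggest_tier(Workload(max_latency_ms=50, reads_per_day=20, retention_days=90)))
print(suggest_tier(Workload(max_latency_ms=5000, reads_per_day=0.01, retention_days=2555)))
```

Two workloads with the same capacity requirement can land on very different tiers, which is why capacity alone is a poor guide to choosing a drive.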
Of course, the answers may differ from one customer to the next and from one application to the next. I wish we could say for this customer type use drive A, or for this application type use drive B. That may have been true 10 years ago, when we had, for example, Seagate Cheetah 10K SCSI drives for the enterprise. Oh – gotta build a server or storage array – you need a SCSI drive. It basically came down to interface and performance. Man, have times changed.
Today, when choosing an enterprise drive or mix of enterprise drives (tiered storage architecture), customers have dozens of variations to choose from: technology (HDD, SSD), interface (SAS, SATA, FC), form factor (2.5″, 3.5″), capacity, performance (7200, 10K, 15K RPM), encryption, power consumption, etc.
But why have they changed…
I talked about customer differences and application differences, but what has driven complexity into storage more than anything else is data. Data is getting more and more complex. Just look at what customers are bombarded with every day.
- Data Growth
- Data Lifecycle
- Hot Data
- Warm Data
- Cold Data
- Data Protection
- Data Security
- Data Compliance
- Data Deduplication
- Data Tiers
- Data Analytics
- Big Data
- What others can you come up with?
As Jan Brady might say, “data, data, data.”
You could argue that services like Amazon S3 attempt to take a lot of the complexity out of data storage for many customers. By offloading to a cloud provider like Amazon, customers need not worry about data tiers, hot data, warm data, cold data, data protection, data security, data compliance, etc. and instead can focus on data analytics and what Big Data can deliver to their business in terms of knowledge and thus innovation.
It sounds like that is what they are doing… 566 billion data objects and 370,000 requests per second at peak times.