How big is S3?

tl;dr: somewhere between 12 and 40 exabytes.

VP of S3 and Glacier Mai-Lan Tomsen Bukovec in the AWS re:Invent 2018 keynote:

S3 and Glacier manage exabytes of storage—tens of trillions of objects—and we do that across many millions of drives in our data centers. Just to give you a sense of that scale: in a single region S3 will manage peaks of 60 terabits per second in a single day.

Quora’s answer

Sergey Kandaurov, Flexify.IO Founder & CEO, answering on Quora: 2.5–5 EB of user data, based on Bukovec’s re:Invent talk.

However, I think Kandaurov may have been underestimating: the quoted range includes Glacier, where people send their data in bulk. First, though, I want to focus on the object counts.

Extrapolating from object counts

At the end of 2010, S3 held 262 billion objects. By June 2012 it had reached 1 trillion, and by April 2013 it was at 2 trillion objects. That growth came despite the introduction of object lifecycle policies that can automatically delete objects after a specified period.

The growth curve through those data points is exponential, and following it out puts S3 at 520 trillion objects by the end of 2018. Given that Bukovec didn’t say hundreds of trillions of objects, I’m guessing actual growth in object counts has slowed from that exponential rush. A polynomial curve fit suggests about 7 trillion objects, but that’s smaller than what Bukovec claimed. However, if we naively double the object count each year from 2013, we get to 64 trillion objects in 2018—a clean fit for the “tens of trillions of objects” claim.
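
As a quick sanity check, here is a minimal Python sketch of that extrapolation. The milestone figures are the published ones; the annual doubling after 2013 is my assumption:

    # Published milestones: ~262 billion objects (end of 2010),
    # 1 trillion (June 2012), 2 trillion (April 2013).
    objects = 2e12  # start from the April 2013 figure

    # My assumption: the object count doubles every year after 2013.
    for year in range(2014, 2019):
        objects *= 2

    print(f"{objects / 1e12:.0f} trillion objects by {year}")  # 64 trillion by 2018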

If we assume 64 trillion objects, the next big question is the average object size. Eric Hammond reported an average of about 70KB for his objects in S3 in a 2012 blog post. Chris Ferris claimed 7MB. Neither explained what their objects were.

My personal experience with an app handling significant volumes of user-generated content was that 32% of objects were smaller than 128KB, 42% were between 128KB and 1MB, and 4% were larger. That suggests a pretty strong clustering right around 128KB, which is conveniently the minimum billable object size for S3 Infrequent Access (though there are also a number of technical reasons S3IA would not be optimized for small objects). If we allow that objects such as bodycam video, log files, and the like will be larger (though less numerous) than the material I’ve been working with, let’s push the average up to 192KB. The result is:

  • 64 trillion × 192 kilobytes = 12.288 exabytes
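
The same arithmetic as a quick sketch (both inputs are my estimates, and the units are decimal, so 1 exabyte = 10^18 bytes):

    objects = 64e12        # 64 trillion objects, from the doubling extrapolation
    avg_size = 192e3       # assumed 192KB average object size
    total_bytes = objects * avg_size
    print(f"{total_bytes / 1e18:.3f} exabytes")  # 12.288 exabytes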

That’s easily within the “exabytes of storage” Bukovec described, but if we work from the count of hard drives, we might get an even larger number.

Extrapolating from drive counts

“Many millions of drives” is terribly ambiguous, but if we assume it means somewhere between five and 15 million, and if the average drive size is 4TB (it’s tempting to say 2TB, but given S3’s growth rates I’d expect they’re now installing 8TB drives by the warehouse full), then the numbers work out like this:

  • 4 terabytes × 5 million = 20 exabytes
  • 4 terabytes × 15 million = 60 exabytes
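
A quick sketch of that raw-capacity range, using the same guesses (5–15 million drives averaging 4TB each):

    drive_size_bytes = 4e12            # assumed 4TB average drive
    for drives in (5e6, 15e6):         # "many millions" read as 5-15 million
        raw_exabytes = drives * drive_size_bytes / 1e18
        print(f"{drives / 1e6:.0f} million drives -> {raw_exabytes:.0f} exabytes raw")
    # 5 million drives  -> 20 exabytes raw
    # 15 million drives -> 60 exabytes raw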

However, erasure coding the objects across multiple drives/hosts in multiple AZs could require 1.3–2x as much disk storage as the object itself. A multiple of 1.8x, for example, would allow data retrieval from any 10 of 18 storage hosts. Assuming a standard model of six hosts per AZ across three AZs, that level of erasure coding could maintain durability and availability even after losing every host in one AZ plus two additional hosts in other AZs.

But because each AZ in AWS is actually made up of multiple facilities, and each facility is an independent failure domain for events that threaten durability, I’m actually guessing they depend on retrieval from any 12 of 18 storage hosts (three hosts per facility, two facilities per AZ, across three AZs = 18 hosts), which requires 1.5x as much storage as the object itself.
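
The storage overhead of an erasure-coding layout that can rebuild an object from any k of its n shards is simply n/k. Here is a minimal sketch of the two layouts above; both are my reading of plausible designs, not a published S3 architecture:

    def overhead(k, n):
        """Storage multiple for a layout that rebuilds from any k of n shards."""
        return n / k

    # 10-of-18: six hosts per AZ across three AZs; tolerates losing any 8 hosts.
    print(overhead(10, 18))  # 1.8
    # 12-of-18: three hosts per facility, two facilities per AZ, three AZs;
    # tolerates losing any 6 hosts.
    print(overhead(12, 18))  # 1.5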

That overhead turns 20–60EB of raw disk into 13–40EB of objects. The lower end of that range matches my estimate by object count, but the upper end suggests my estimate of the average object size might be too conservative. Or perhaps it suggests that 15 million is too many for “many.”
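
Putting the two together, dividing the raw-capacity range by the assumed 1.5x overhead:

    erasure_overhead = 1.5               # assumed 12-of-18 layout
    for raw_exabytes in (20, 60):        # raw range from the drive-count guesses
        usable = raw_exabytes / erasure_overhead
        print(f"{raw_exabytes} EB raw -> {usable:.0f} EB of objects")
    # 20 EB raw -> 13 EB of objects
    # 60 EB raw -> 40 EB of objects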

Extra: Tomsen Bukovec covered AWS storage options (but not size) in more detail in another re:Invent talk.