We’ve received a lot of positive feedback since announcing Magic Pocket, our in-house multi-exabyte storage system. We’re going to follow that announcement with a series of technical blog posts that offer a look behind the scenes at interesting aspects of the system, including our protection mechanisms, operational tooling, and innovations on the boundary between hardware and software. But first, we’ll need some context: in this post, we’ll give a high level architectural overview of Magic Pocket and the criteria it was designed to meet.
There is nothing more important to Dropbox than the safety of our user data. When we set out to build Magic Pocket, our in-house multi-exabyte storage system, durability was the requirement that underscored all aspects of the design and implementation. In this post we’ll discuss the mechanisms we use to ensure that Magic Pocket constantly maintains its extremely high level of durability.
Years ago, we called Dropbox a “Magic Pocket” because it was designed to keep all your files in one convenient place. Dropbox has evolved from that simple beginning to become one of the most powerful and ubiquitous collaboration platforms in the world. And when our scale required building our own dedicated storage infrastructure, we named the project “Magic Pocket.” Two and a half years later, we’re excited to announce that we’re now storing and serving over 90% of our users’ data on our custom-built infrastructure.
Dropbox has hundreds of millions of registered users, and we’re always hard at work to ensure our customers have a speedy, reliable experience, wherever they are. Today, I am excited to announce an expansion to our global infrastructure that will deliver faster transfer speeds and improved performance for our customers around the world.
To give all of our users fast, reliable network performance, we’ve launched new Points of Presence (PoPs) across Europe, Asia, and parts of the US. We’ve coupled these PoPs with an open-peering policy, and as a result have seen consistent speed improvements.
Large-scale networks are complex, dynamic systems with many parts, managed by many different teams. Each team has tools they use to monitor their part of the system, but they measure very different things. Before we built our own infrastructure, Magic Pocket, we didn’t have a global view of our production network, and we didn’t have a way to look at the interactions between different parts in real time. Most of the logs from our production network have semi-structured or unstructured data formats, which makes it very difficult to track a large amount of log data in real-time.