Since launching Magic Pocket last year, we’ve been storing and serving more than 90 percent of our users’ data on our own custom-built infrastructure, which has helped us to be more efficient and improved performance for our users globally.
But with about 75 percent of our users located outside of the United States, moving onto our own custom-built data center was just the first step in realizing these benefits. As our data centers grew, the rest of our network also expanded to serve our users — more than 500 million around the globe — at light-speed with a consistent level of reliability,
More than a billion files are saved to Dropbox every day, and we need to run many asynchronous jobs in response to these events to power various Dropbox features. Examples of these asynchronous jobs include indexing a file to enable search over its contents, generating previews of files to be displayed when the files are viewed on the Dropbox website, and delivering notifications of file changes to third-party apps using the Dropbox developer API. This is where Cape comes in — it’s a framework that enables real-time asynchronous processing of billions of events a day,
A SaaS company like Dropbox needs to update our systems constantly, at all levels of the stack. When it comes time to tune some piece of infrastructure, roll out a new feature, or set up an A/B test, it’s important that we can make changes and have them hit production fast.
Making a change to our code and then “simply” pushing it is not an option: completing a push to our web servers can take hours, and shipping a new mobile or desktop platform release takes even longer. In any case, a full code deployment can be dangerous because it could introduce new bugs: what we really want is a way to put some configurable “knobs” into our products,
Dropbox has hundreds of millions of registered users, and we’re always hard at work to ensure our customers have a speedy, reliable experience, wherever they are. Today, I am excited to announce an expansion to our global infrastructure that will deliver faster transfer speeds and improved performance for our customers around the world.
To give all of our users fast, reliable network performance, we’ve launched new Points of Presence (PoPs) across Europe, Asia, and parts of the US. We’ve coupled these PoPs with an open-peering policy, and as a result have seen consistent speed improvements.
Large-scale networks are complex, dynamic systems with many parts, managed by many different teams. Each team has tools they use to monitor their part of the system, but they measure very different things. Before we built our own infrastructure, Magic Pocket, we didn’t have a global view of our production network, and we didn’t have a way to look at the interactions between different parts in real time. Most of the logs from our production network have semi-structured or unstructured data formats, which makes it very difficult to track a large amount of log data in real-time.
For Firefly, Dropbox’s full-text search engine, speed has always been a priority. (For more background on Firefly, check out our blog post). When our team saw search latency deteriorate from 250 ms to 1000 ms (95th percentile), we knew what to do—we measured, we analyzed, we fixed.
In order to create a good user experience for Firefly, we strive to keep our query latency under 250 ms (at 95th percentile). We noticed that our latency had deteriorated quite a bit since we started adding users to the system.