Introducing Cape

More than a billion files are saved to Dropbox every day, and we need to run many asynchronous jobs in response to these events to power various Dropbox features. Examples of these asynchronous jobs include indexing a file to enable search over its contents, generating previews of files to be displayed when the files are viewed on the Dropbox website, and delivering notifications of file changes to third-party apps using the Dropbox developer API. This is where Cape comes in — it’s a framework that enables real-time asynchronous processing of billions of events a day,

Read more

Introducing Stormcrow

A SaaS company like Dropbox needs to update our systems constantly, at all levels of the stack. When it comes time to tune some piece of infrastructure, roll out a new feature, or set up an A/B test, it’s important that we can make changes and have them hit production fast.

Making a change to our code and then “simply” pushing it is not an option: completing a push to our web servers can take hours, and shipping a new mobile or desktop platform release takes even longer. In any case, a full code deployment can be dangerous because it could introduce new bugs: what we really want is a way to put some configurable “knobs” into our products,

Read more

Infrastructure Update: Pushing the edges of our global performance

Dropbox has hundreds of millions of registered users, and we’re always hard at work to ensure our customers have a speedy, reliable experience, wherever they are. Today, I am excited to announce an expansion to our global infrastructure that will deliver faster transfer speeds and improved performance for our customers around the world.

To give all of our users fast, reliable network performance, we’ve launched new Points of Presence (PoPs) across Europe, Asia, and parts of the US. We’ve coupled these PoPs with an open-peering policy, and as a result have seen consistent speed improvements.

Read more

NetFlash: Tracking Dropbox network traffic in real-time with Elasticsearch

Large-scale networks are complex, dynamic systems with many parts, managed by many different teams. Each team has tools they use to monitor their part of the system, but they measure very different things. Before we built our own infrastructure, Magic Pocket, we didn’t have a global view of our production network, and we didn’t have a way to look at the interactions between different parts in real time. Most of the logs from our production network have semi-structured or unstructured data formats, which makes it very difficult to track a large amount of log data in real-time.

Read more

Improving the performance of full-text search

For Firefly, Dropbox’s full-text search engine, speed has always been a priority. (For more background on Firefly, check out our blog post). When our team saw search latency deteriorate from 250 ms to 1000 ms (95th percentile), we knew what to do—we measured, we analyzed, we fixed.

Problem

In order to create a good user experience for Firefly, we strive to keep our query latency under 250 ms (at 95th percentile). We noticed that our latency had deteriorated quite a bit since we started adding users to the system.

Read more