Our users love Dropbox for many reasons, sync performance being chief among them. We’re going to look at a recent performance improvement called Streaming Sync which can improve sync latency by up to 2x.
Prior to Streaming Sync, file synchronization was partitioned into two phases: upload and download. The entire file must be uploaded to our servers and committed to our databases before any other clients could learn of its existence. Streaming sync allows file contents to “stream” through our servers between your clients.
The Dropbox File System
First we’ll discuss the way Dropbox stores and syncs files.
Dropbox owes a large share of its success to Python, a language that enabled us to iterate and develop quickly. However, as our infrastructure matures to support our ever growing user base, we started exploring ways to scale our systems in a more efficient manner. About a year ago, we decided to migrate our performance-critical backends from Python to Go to leverage better concurrency support and faster execution speed. This was a massive effort–around 200,000 lines of Go code–undertaken by a small team of engineers. At this point, we have successfully moved major parts of our infrastructure to Go.
One recurring theme that hindered our development progress was the lack of robust libraries needed for building large systems.
When we began the journey of building a mobile app for Dropbox a few years ago, we started simple — our Android and iOS apps allowed our users to view their files on the go, and cache them for offline access. As smartphones became more popular, we realized we could provide another great service on mobile: automatic backup of all the photos and videos taken on these devices, so they’d be safe forever in Dropbox.
Last Wednesday, on April 9, we took another giant leap forward with the introduction of Carousel. Carousel is a single home for all your photos and videos,
Hello everyone, I’m very excited to announce Pyston, a new open-source implementation of Python, currently under development at Dropbox. The goal of the project is to produce a high-performance Python implementation that can push Python into domains dominated by traditional systems languages like C++.
Here at Dropbox, we love Python and try to use it for as much as we can. As we scale and the problems we tackle grow, though, we’re starting to find that hitting our performance targets can sometimes become prohibitively difficult when staying on Python. Sometimes, it can be less work to do a rewrite in another language.
Every day millions of people upload videos to Dropbox. Besides wanting their memories safe forever, they also want to be able to watch them at any time and on any device. The playout experience should feel instant, despite the fact that the content is actually stored remotely. Low latency playback of content poses interesting technical challenges because of the three main factors below.
1. Codec diversity
Most end users are familiar with extensions like .mp4, .avi, .flv, but not everybody is familiar with the fact that the file extension does not necessarily match the internal encoding of the content.
Dropbox brings your photos, videos, documents, and other files to any platform: mobile, web, desktop, or API. Over time, through automatic camera uploads on iOS and Android, you might save thousands of photos, and this presents a performance challenge: photo thumbnails need to be accessible on all devices, instantly.
We pre-generate thumbnails at various resolutions for the different devices at upload time, to reduce the cost of scaling photos at rendering time. But when users are quickly scrolling through many photos, we need to request a large number of thumbnails. Since most platforms have limitations on the number of concurrent requests,