In the last post, Stephen explored the asynchronous client-server communication model we use in Carousel to provide a fast user experience, where interactions aren’t blocked on network calls. But network latency was not our only hurdle. In order to make the app feel fast and responsive, we also needed to minimize user-visible disk latency. One of Carousel’s primary goals was to make all of the user’s photos and videos accessible in one continuous view. We have users with over 100,000 photos in Dropbox, and we needed to build a metadata-loading pipeline that could accommodate them. This meant building a metadata-loading system that can show photos on-screen within a second of opening the app and provides smooth scrolling even as the user navigates back in time through their entire history of photos.
Our users love Dropbox for many reasons, sync performance being chief among them. We’re going to look at a recent performance improvement called Streaming Sync which can improve sync latency by up to 2x.
Prior to Streaming Sync, file synchronization was partitioned into two phases: upload and download. The entire file must be uploaded to our servers and committed to our databases before any other clients could learn of its existence. Streaming sync allows file contents to “stream” through our servers between your clients.
The Dropbox File System
First we’ll discuss the way Dropbox stores and syncs files.
Dropbox owes a large share of its success to Python, a language that enabled us to iterate and develop quickly. However, as our infrastructure matures to support our ever growing user base, we started exploring ways to scale our systems in a more efficient manner. About a year ago, we decided to migrate our performance-critical backends from Python to Go to leverage better concurrency support and faster execution speed. This was a massive effort–around 200,000 lines of Go code–undertaken by a small team of engineers. At this point, we have successfully moved major parts of our infrastructure to Go.
One recurring theme that hindered our development progress was the lack of robust libraries needed for building large systems.
When we began the journey of building a mobile app for Dropbox a few years ago, we started simple — our Android and iOS apps allowed our users to view their files on the go, and cache them for offline access. As smartphones became more popular, we realized we could provide another great service on mobile: automatic backup of all the photos and videos taken on these devices, so they’d be safe forever in Dropbox.
Last Wednesday, on April 9, we took another giant leap forward with the introduction of Carousel. Carousel is a single home for all your photos and videos,
Hello everyone, I’m very excited to announce Pyston, a new open-source implementation of Python, currently under development at Dropbox. The goal of the project is to produce a high-performance Python implementation that can push Python into domains dominated by traditional systems languages like C++.
Here at Dropbox, we love Python and try to use it for as much as we can. As we scale and the problems we tackle grow, though, we’re starting to find that hitting our performance targets can sometimes become prohibitively difficult when staying on Python. Sometimes, it can be less work to do a rewrite in another language.
Every day millions of people upload videos to Dropbox. Besides wanting their memories safe forever, they also want to be able to watch them at any time and on any device. The playout experience should feel instant, despite the fact that the content is actually stored remotely. Low latency playback of content poses interesting technical challenges because of the three main factors below.
1. Codec diversity
Most end users are familiar with extensions like .mp4, .avi, .flv, but not everybody is familiar with the fact that the file extension does not necessarily match the internal encoding of the content.