Like many companies, Dropbox uses scribe to aggregate log data into our analytics pipeline. After a recent scribe pipeline outage, we decided to rewrite scribe with the goals of reducing operational overhead and data loss, and adding enhancements that the original scribe lacks. This blog post describes some of the design choices we made for the rewrite.
This section describes the scribe pipeline as it is set up at Dropbox (we suspect most companies deploy and use scribe in a similar fashion). Feel free to skip this section if you are already familiar with scribe.
Protecting the privacy and security of our users’ information is a top priority for us at Dropbox. In addition to hiring world class experts, we believe it’s important to get all the help we can from the security research community, too. That’s why we’re excited to announce that starting today, we’ll be recognizing security researchers for their effort through a bug bounty program with HackerOne.
Bug bounties (or vulnerability rewards programs) are used by many leading companies to improve the security of their products. These programs provide an incentive for researchers to responsibly disclose software bugs.
With hundreds of billions of files, Dropbox has become one of the world’s largest stores of private documents, and it’s still growing strong! As users add more and more files, it becomes harder for them to stay organized. Eventually, search replaces browsing as the primary way users find their content. In other words, the more content our users store with us, the more important it is for them to have a powerful search tool available. With this motivation in mind, we set out to deploy instant, full-text search for Dropbox.
Like any serious practitioners of large distributed systems our first order of business was clear: come up with a name for the project!
Dropbox is an active customer of Amazon Web Services, currently operating one of the largest global deployments on S3, running tens of thousands of EC2 instances, and making heavy use of other services like SQS and Route 53. Pushing hundreds of gigabits per second through EC2 and S3 is an everyday occurrence for us, and we routinely conduct massively parallel operations across our more than one trillion objects in S3.
However, that’s just half the story. We also have large physical datacenters split between two geographical regions, running tens of thousands of servers responsible for storing and serving the metadata for every file in Dropbox.
Making Carousel highly responsive was a critical part of providing an awesome user experience. Carousel wouldn’t be as usable or effective if the app stuttered or frequently caused users to wait while content loaded. In our last post, Drew discussed how we optimized our metadata loading pipeline to respond to data model changes quickly, while still providing fast lookups at UI bind time. With photo metadata in memory, our next challenge was drawing images to the screen. Rendering tens of thousands of images at 60 frames per second was especially demanding in the mobile environments of iOS and Android.
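One common building block for keeping image-heavy scrolling smooth (not necessarily the approach Carousel used) is a bounded in-memory LRU cache of decoded images, so recently viewed photos redraw without hitting the decoder or disk again. Here is a minimal, platform-agnostic sketch of the eviction logic on top of Java's `LinkedHashMap`; the class name and capacity are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU cache for decoded images, keyed e.g. by file path.
// Real apps typically also account for bitmap byte size, not entry count.
class ImageCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    ImageCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration makes this an LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once capacity is exceeded.
        return size() > maxEntries;
    }
}
```

On Android this role is played by utilities like `LruCache`; the sketch above just makes the eviction policy explicit.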
In last week’s post, Kat described how we redesigned our new user experience from the ground up to make it a delight for users to get started on Dropbox from our mobile apps. In this post, I’ll go into more detail about everything we did to make the mobile-to-desktop transition simple for users. To recap the previous post, here’s a summary of the flow:
- The desktop connect flow allows users to log in to the website with only their phone, without typing in credentials, and initiates a download.
- The “meta-installer” downloads quickly because of its small file size.