Meet Bandaid, the Dropbox service proxy

With this post we begin a series of articles about our Service Oriented Architecture components at Dropbox, and the approaches we took in designing them. Bandaid, our service proxy, is one of these components. Follow along as we discuss Bandaid’s internal design and the approaches we chose for the implementation.

Bandaid started as a reverse proxy that compensated for inefficiencies in our server-side services. Later we developed it into a service proxy that accelerated adoption of Service Oriented Architecture at Dropbox.

A reverse proxy is a device or service that forwards requests from multiple clients to servers (i.e.

Read more

Security at scale: the Dropbox approach

The Dropbox Security Team is responsible for securing over 500 petabytes of data belonging to over half a billion registered users across hundreds of thousands of businesses. Securing data at this scale requires a security team that is not only well-resourced, but also one that can keep ahead of the expansion of our platform. We focus on scaling our own leverage, so each new security person we add multiplies the impact of our team.

Over the course of this year—and beyond—we’ll go into more detail on how Dropbox approaches security and some of the projects we’ve tackled. Protecting Dropbox requires serious investments in security.

Read more

Optimizing web servers for high throughput and low latency

This is an expanded version of my talk at NginxConf 2017 on September 6, 2017. As an SRE on the Dropbox Traffic Team, I’m responsible for our Edge network: its reliability, performance, and efficiency. The Dropbox edge network is an nginx-based proxy tier designed to handle both latency-sensitive metadata transactions and high-throughput data transfers. In a system that handles tens of gigabits per second while simultaneously processing tens of thousands of latency-sensitive transactions, there are efficiency and performance optimizations throughout the proxy stack: from drivers and interrupts, through TCP/IP and the kernel, to library and application-level tunings.

Read more

Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning

In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bidirectional Long Short-Term Memory networks (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. We will also dive deep into what it took to make our OCR pipeline production-ready at Dropbox scale.

In previous posts we have described how Dropbox’s mobile document scanner works. The document scanner makes it possible to use your mobile phone to take photos and

Read more

How Dropbox securely stores your passwords

It’s universally acknowledged that it’s a bad idea to store plain-text passwords. If a database containing plain-text passwords is compromised, user accounts are in immediate danger. For this reason, as early as 1976, the industry standardized on storing passwords using secure, one-way hashing mechanisms (starting with Unix Crypt). Unfortunately, while this prevents passwords from being read directly after a compromise, all hashing mechanisms necessarily allow attackers to brute-force the hash offline by going through lists of possible passwords, hashing each one, and comparing the result. In this context, secure hashing functions like SHA have a critical flaw for password hashing: they are designed to be fast.
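
To make the offline attack concrete, here is a minimal Python sketch under stated assumptions. The function names (`fast_hash`, `slow_hash`, `brute_force`) and the sample passwords are hypothetical and exist only for illustration. PBKDF2 with a per-user salt is used as a generic stand-in for a deliberately slow hashing function; the excerpt does not say which scheme Dropbox uses, so this should not be read as that scheme.

```python
import hashlib
import os
from typing import Iterable, Optional


def fast_hash(password: str) -> str:
    # A single unsalted SHA-256 pass: cheap for the server, equally cheap for an attacker.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    # The offline attack described above: hash each guess and compare.
    for guess in candidates:
        if fast_hash(guess) == target_hash:
            return guess
    return None


def slow_hash(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    # Illustrative stand-in for a slow function (an assumption, not the post's scheme).
    # Each guess now costs `iterations` HMAC rounds, and the salt forces the
    # attacker to redo the work separately for every user.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


if __name__ == "__main__":
    leaked = fast_hash("hunter2")  # pretend this came from a compromised database
    wordlist = ["password", "123456", "hunter2"]
    print("recovered:", brute_force(leaked, wordlist))

    salt = os.urandom(16)  # random per-user salt, stored alongside the hash
    print("slow hash:", slow_hash("hunter2", salt).hex())
```

The attacker's loop is the same in both cases; what changes is the cost per guess. With the fast hash, each candidate costs one SHA-256 call, while the slow variant multiplies that cost by the iteration count.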

Read more