Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning

In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bi-directional Long Short Term Memory (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. In addition, we will also dive deep into what it took to actually make our OCR pipeline production-ready at Dropbox scale.

In previous posts we have described how Dropbox’s mobile document scanner works. The document scanner makes it possible to use your mobile phone to take photos and

Read more

Deploying Brotli for static content

Introduction

Most representations of data contain a lot of redundancy, which provides an opportunity for greater communication efficiency by compressing the content. Compression is either built-in into the data format — like in the case of images, fonts, and videos — or provided by the transportation medium, e.g. the HTTP protocol has the Accept-Encoding / Content-Encoding header pair that allows clients and servers to agree on a preferred compression method. In practice though, most servers today only support gzip.

In this blog post, we are going to share our experiences with rolling out Brotli encoding for static content used by dropbox.com,

Read more

Preventing cross-site attacks using same-site cookies

Our comms team told us we need an image; our legal team told us it needed to be freely licensed. Credit: Carsten Schertzer (Creative Commons Attribution 2.0)

 

Dropbox employs traditional cross-site attack defenses, but we also employ same-site cookies as a defense in depth on newer browsers. In this post, we describe how we rolled out same-site cookie based defenses on Dropbox, and offer some guidelines on how you can do the same on your website.

Background

Recently, the IETF released a new RFC introducing same-site cookies.

Read more

DropboxMacUpdate: Making automatic updates on macOS safer and more reliable

Keeping users on the latest version of the Dropbox desktop app is critical. It allows our developers to rapidly innovate, showcase new features to our users, maintain compatibility with server endpoints, and mitigate risk of incompatibilities that may creep in with platform/OS changes.

Our auto-update system, as originally designed, was written as a feature of the desktop client. Basically, as part of regular file syncing, the server can send down an entry in the metadata that says, “Please update to version X with checksum Y.” The client would then download the file, verify the checksum, open the payload,

Read more

Introducing Stormcrow

A SaaS company like Dropbox needs to update our systems constantly, at all levels of the stack. When it comes time to tune some piece of infrastructure, roll out a new feature, or set up an A/B test, it’s important that we can make changes and have them hit production fast.

Making a change to our code and then “simply” pushing it is not an option: completing a push to our web servers can take hours, and shipping a new mobile or desktop platform release takes even longer. In any case, a full code deployment can be dangerous because it could introduce new bugs: what we really want is a way to put some configurable “knobs” into our products,

Read more