In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bi-directional Long Short Term Memory (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. In addition, we will also dive deep into what it took to actually make our OCR pipeline production-ready at Dropbox scale.
In previous posts we have described how Dropbox’s mobile document scanner works. The document scanner makes it possible to use your mobile phone to take photos and “
Most representations of data contain a lot of redundancy, which provides an opportunity for greater communication efficiency by compressing the content. Compression is either built-in into the data format — like in the case of images, fonts, and videos — or provided by the transportation medium, e.g. the HTTP protocol has the
Content-Encoding header pair that allows clients and servers to agree on a preferred compression method. In practice though, most servers today only support
In this blog post, we are going to share our experiences with rolling out Brotli encoding for static content used by dropbox.com,
In our previous blog posts on Dropbox’s document scanner (Part 1, Part 2 and Part 3), we focused on the algorithms that powered the scanner and on the optimizations that made them speedy. However, speed is not the only thing that matters in a mobile environment: what about memory? Bounding both peak memory usage and memory spikes is important, since the operating system may terminate the app outright when under memory pressure. In this blog post, we will discuss some tweaks we made to lower the memory usage of our iOS document scanner.
Imagine you’re an engineer working on a new product feature that is going to have a high impact on the end user, like the Dropbox Badge. You want to get quick validation on the functionality and utility of the feature. Each individual change you make might be relatively simple, like a tweak to the CSS changing the size of a font, or more substantial, like enabling the Badge on a new file type. You could set up user studies, but these are relatively expensive and slow, and are a statistically small sample size. Ideally,
Our comms team told us we need an image; our legal team told us it needed to be freely licensed. Credit: Carsten Schertzer (Creative Commons Attribution 2.0)
Dropbox employs traditional cross-site attack defenses, but we also employ same-site cookies as a defense in depth on newer browsers. In this post, we describe how we rolled out same-site cookie based defenses on Dropbox, and offer some guidelines on how you can do the same on your website.
Recently, the IETF released a new RFC introducing same-site cookies.
Keeping users on the latest version of the Dropbox desktop app is critical. It allows our developers to rapidly innovate, showcase new features to our users, maintain compatibility with server endpoints, and mitigate risk of incompatibilities that may creep in with platform/OS changes.
Our auto-update system, as originally designed, was written as a feature of the desktop client. Basically, as part of regular file syncing, the server can send down an entry in the metadata that says, “Please update to version X with checksum Y.” The client would then download the file, verify the checksum, open the payload,