In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bi-directional Long Short Term Memory (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. In addition, we will also dive deep into what it took to actually make our OCR pipeline production-ready at Dropbox scale.
Most representations of data contain a lot of redundancy, which provides an opportunity for greater communication efficiency by compressing the content. Compression is either built-in into the data format — like in the case of images, fonts, and videos — or provided by the transportation medium, e.g. the HTTP protocol has the
Content-Encoding header pair that allows clients and servers to agree on a preferred compression method. In practice though, most servers today only support
Dropbox employs traditional cross-site attack defenses, but we also employ same-site cookies as a defense in depth on newer browsers. In this post, we describe how we rolled out same-site cookie based defenses on Dropbox, and offer some guidelines on how you can do the same on your website.
Recently, the IETF released a new RFC introducing same-site cookies.
Keeping users on the latest version of the Dropbox desktop app is critical. It allows our developers to rapidly innovate, showcase new features to our users, maintain compatibility with server endpoints, and mitigate risk of incompatibilities that may creep in with platform/OS changes.
Our auto-update system, as originally designed, was written as a feature of the desktop client. Basically, as part of regular file syncing, the server can send down an entry in the metadata that says, “Please update to version X with checksum Y.” The client would then download the file, verify the checksum, open the payload,
A SaaS company like Dropbox needs to update our systems constantly, at all levels of the stack. When it comes time to tune some piece of infrastructure, roll out a new feature, or set up an A/B test, it’s important that we can make changes and have them hit production fast.
Making a change to our code and then “simply” pushing it is not an option: completing a push to our web servers can take hours, and shipping a new mobile or desktop platform release takes even longer. In any case, a full code deployment can be dangerous because it could introduce new bugs: what we really want is a way to put some configurable “knobs” into our products,