In our previous blog posts (Part 1, Part 2), we presented an overview of various parts of Dropbox’s document scanner, which helps users digitize their physical documents by automatically detecting them from photos and enhancing them. In this post, we will delve into the problem of maintaining a real-time frame rate in the document scanner even in the presence of camera movement, and share some lessons learned.
Document scanning as augmented reality
Dropbox’s document scanner shows an overlay of the detected document over the incoming image stream from the camera.
Dropbox’s document scanner lets users capture a photo of a document with their phone and convert it into a clean, rectangular PDF. It works even if the input is rotated, slightly crumpled, or partially in shadow—but how?
In our previous blog post, we explained how we detect the boundaries of the document. In this post, we cover the next parts of the pipeline: rectifying the document (turning it from a general quadrilateral to a rectangle) and enhancing it to make it evenly illuminated with high contrast. In a traditional flatbed scanner,
A few weeks ago, Dropbox launched a set of new productivity tools including document scanning on iOS. This new feature allows users to scan documents with their smartphone camera and store those scans directly in their Dropbox. The feature automatically detects the document in the frame, extracts it from the background, fits it to a rectangular shape, removes shadows and adjusts the contrast, and finally saves it to a PDF file. For Dropbox Business users, we also run Optical Character Recognition (OCR) to recognize the text in the document for search and copy-pasting.
Making Carousel highly responsive was a critical part of providing an awesome user experience. Carousel wouldn’t be as usable or effective if the app stuttered or frequently caused users to wait while content loaded. In our last post, Drew discussed how we optimized our metadata loading pipeline to respond to data model changes quickly, while still providing fast lookups at UI bind time. With photo metadata in memory, our next challenge was drawing images to the screen. Dealing with tens of thousands of images while rendering at 60 frames per second was a challenge, especially in the mobile environments of iOS and Android.
In last week’s post, Kat described how we redesigned our new user experience from the ground up to make it a delight for users to get started on Dropbox from our mobile apps. In this post, I’ll go into more detail about everything we did to make the mobile-to-desktop transition simple for users. To recap the previous post, here’s a summary of the flow:
- The desktop connect flow allows users to log in to the website with only their phone, without typing in credentials, and initiates a download.
- The “meta-installer” downloads quickly because of its small file size.
At Dropbox, we treat growth as an integral part of the product experience. We look at major holes in user experience that slow growth, and we try to be creative in addressing the big picture, rather than trying to “growth hack.” We look for solutions that enable users to experience the full value of Dropbox.
Dropbox has always been about accessing your stuff anywhere. Back when Dropbox launched six years ago, that meant installing Dropbox on your desktop and then accessing photos and docs on the web or on a smartphone, for some. Our smartphone apps were a way of helping users who already had Dropbox on their desktop view docs on the go,