In our previous blog posts on Dropbox’s document scanner (Part 1, Part 2 and Part 3), we focused on the algorithms that powered the scanner and on the optimizations that made them speedy. However, speed is not the only thing that matters in a mobile environment: what about memory? Bounding both peak memory usage and memory spikes is important, since the operating system may terminate the app outright when under memory pressure. In this blog post, we will discuss some tweaks we made to lower the memory usage of our iOS document scanner.
In our previous blog posts (Part 1, Part 2), we presented an overview of various parts of Dropbox’s document scanner, which helps users digitize their physical documents by automatically detecting them from photos and enhancing them. In this post, we will delve into the problem of maintaining a real-time frame rate in the document scanner even in the presence of camera movement, and share some lessons learned.
Document scanning as augmented reality
Dropbox’s document scanner shows an overlay of the detected document over the incoming image stream from the camera.
Dropbox’s document scanner lets users capture a photo of a document with their phone and convert it into a clean, rectangular PDF. It works even if the input is rotated, slightly crumpled, or partially in shadow—but how?
In our previous blog post, we explained how we detect the boundaries of the document. In this post, we cover the next parts of the pipeline: rectifying the document (turning it from a general quadrilateral to a rectangle) and enhancing it to make it evenly illuminated with high contrast. In a traditional flatbed scanner,
A few weeks ago, Dropbox launched a set of new productivity tools including document scanning on iOS. This new feature allows users to scan documents with their smartphone camera and store those scans directly in their Dropbox. The feature automatically detects the document in the frame, extracts it from the background, fits it to a rectangular shape, removes shadows and adjusts the contrast, and finally saves it to a PDF file. For Dropbox Business users, we also run Optical Character Recognition (OCR) to recognize the text in the document for search and copy-pasting.
Making Carousel highly responsive was a critical part of providing an awesome user experience. Carousel wouldn’t be as usable or effective if the app stuttered or frequently caused users to wait while content loaded. In our last post, Drew discussed how we optimized our metadata loading pipeline to respond to data model changes quickly, while still providing fast lookups at UI bind time. With photo metadata in memory, our next challenge was drawing images to the screen. Dealing with tens of thousands of images while rendering at 60 frames per second was a challenge, especially in the mobile environments of iOS and Android.
In last week’s post, Kat described how we redesigned our new user experience from the ground up to make it a delight for users to get started on Dropbox from our mobile apps. In this post, I’ll go into more detail about everything we did to make the mobile-to-desktop transition simple for users. To recap the previous post, here’s a summary of the flow:
- The desktop connect flow allows users to log in to the website with only their phone, without typing in credentials, and initiates a download.
- The “meta-installer” downloads quickly because of its small file size.