In our previous blog posts on Dropbox’s document scanner (Part 1, Part 2 and Part 3), we focused on the algorithms that powered the scanner and on the optimizations that made them speedy. However, speed is not the only thing that matters in a mobile environment: what about memory? Bounding both peak memory usage and memory spikes is important, since the operating system may terminate the app outright when under memory pressure. In this blog post, we will discuss some tweaks we made to lower the memory usage of our iOS document scanner.
In our previous blog posts (Part 1, Part 2), we presented an overview of various parts of Dropbox’s document scanner, which helps users digitize their physical documents by automatically detecting them from photos and enhancing them. In this post, we will delve into the problem of maintaining a real-time frame rate in the document scanner even in the presence of camera movement, and share some lessons learned.
Document scanning as augmented reality
Dropbox’s document scanner shows an overlay of the detected document over the incoming image stream from the camera.
Dropbox’s document scanner lets users capture a photo of a document with their phone and convert it into a clean, rectangular PDF. It works even if the input is rotated, slightly crumpled, or partially in shadow—but how?
In our previous blog post, we explained how we detect the boundaries of the document. In this post, we cover the next parts of the pipeline: rectifying the document (turning it from a general quadrilateral to a rectangle) and enhancing it to make it evenly illuminated with high contrast. In a traditional flatbed scanner,