In last week’s post, Kat described how we redesigned our new user experience from the ground up to make it a delight for users to get started on Dropbox from our mobile apps. In this post, I’ll go into more detail about everything we did to make the mobile-to-desktop transition simple for users. To recap the previous post, here’s a summary of the flow:
- The desktop connect flow allows users to log in to the website with only their phone, without typing in credentials, and initiates a download.
- The “meta-installer” downloads quickly because of its small file size. When launched, it sets up permissions for Dropbox, starts downloading the full installer, and completes installation automatically.
- When Dropbox is fully installed, the server, browser, and desktop client work together to validate the client and allow it to safely log the user in without requiring the user to enter their email and password again.
How Desktop Connect Works
The desktop connect flow begins on the phone. It instructs the user to go to dropbox.com/connect on their computer (we added a picture of a computer too, just to make things extra clear). Our goal in this flow is to get the user set up with Dropbox on their desktop, minimizing any risk of losing them along the way. Trying to go to the connect website from the phone’s browser is one possible failure mode. On our servers, we generate a QR code with a unique token and display it on the website. To make the flow more fun, we display the QR codes inside a Dropbox. Here’s what the background image looks like. We place identical copies of the QR codes (embedded as inline images with data urls) in each of the empty boxes.
When the user taps ‘Next’ on their phone, a camera appears with instructions to point it at the computer. The phone auto-focuses and captures the QR code. It knows to ignore any QR codes not generated by Dropbox. Once it successfully captures the QR code, it extracts the unique token and communicates it to the server, along with the user ID of the user currently signed in on the phone. The browser, meanwhile, is polling the server for new information about the unique token. When the server hears from the phone, it can tell the browser that the QR code’s unique token is now associated with a user. The browser then authenticates the users and logs them in, without requiring their username or password at all. To convey that this transition has occurred, the browser greets the user by name:
The browser immediately starts downloading the meta-installer. The full Dropbox installer is huge—it’s about 35MB. The Dropbox client that runs on Windows, Mac OS, and Linux is built using Python, so the installer includes the entire (slightly modified) Python runtime that must be installed on every machine in order for Dropbox to run successfully. With such long download times, users can get distracted easily and totally forget about Dropbox by the time the installer finishes downloading. To solve this problem, we created what we call the ‘meta-installer.’ The meta-installer is a very small executable that’s just a placeholder to grab the full installer from a CDN. This executable preps the machine for installation (it handles acquiring permissions to complete the install, for instance) while it fetches the payload from the server in parallel.
The meta-installer has another huge advantage. One big problem for us on the product/engineering teams was that we had almost no insight into what was going on for a particular user after they initiated the download. Did the user launch the installer? Did the user’s installation complete successfully, or did it fail at a particular spot? How did this behavior differ with speed of internet connections? Which download route was the most successful, and which wasn’t? One of the keys to solving this problem was to be able to generate some kind of token that helps us associate an instance of a download with the installation, embed it in the installer, and trace it through the funnel. The meta-installer gave us an opportunity to do this. Because it’s such a small download, we can serve it ourselves instead of through a CDN. We can choose to serve a unique binary for every request. We tag the meta-installer with an ID while the larger full installer is still a static resource that we can host on a CDN.
How The Meta-Installer and Auto-Signin Work
To generate the meta-installer, we start with a standard template installer binary that contains a 1K buffer (known as the tag buffer) somewhere in the binary. The exact location (and implementation) is platform-specific. For every installer request, we clone the template, generate a token, and then replace the empty tag buffer with this token.
There was another interesting challenge here, though. Modern OSes require that each binary is signed by the publisher. The template installer was signed using Dropbox’s key, but modifying it with the token would invalidate the signature. One option to fix this was to defer signing the installer until the point where we had embedded the token, but that posed some performance problems. Instead, we created a custom version of the signing tools which complied with the Authenticode spec (for Windows) while letting us safely modify content for each binary. Our custom tool allows us to create an unverified section of the binary in a way that is compliant with the Authenticode spec. We make the tag buffer an unverified section so that the tag buffer can change without having to re-sign the binary.
So finally when the full installer runs, it knows where to look for the token in the tag buffer. It can use the token to report progress or report any failures along the way. This enables us to get a holistic view into user behavior and issues in Dropbox’s install funnel.
At this point, the Dropbox desktop client is running on a user’s machine. However, there’s still a lot of room for failure. Users might be busy with other applications and forget to log in. Some might have forgotten their username or password. They may not feel like looking it up, changing it, or just typing it in.
To fix this, we leveraged our solution to the meta-installer tagging problem above to create a secure means of authenticating the user on the desktop client using their credentials from the web session. The meta-installer solution above enables us to embed arbitrary tokens in the 1k tag buffer, so we embedded a token affiliated with the user’s identity. When the user installs and runs the desktop client, the client can look at that buffer and use the token to log in as the user automatically.
Of course, we needed to address security aspects for this solution to be broadly deployed. For instance, user A shouldn’t be able to download an installer, send it to user B, and have user B’s computer linked to user A’s account. Similarly, user X should not be able to generate arbitrary links that can trick user Y into downloading an installer that links their machine to user X’s account. We needed to be conservative and follow smart heuristics to determine when not to automatically sign in a user and fall back to requiring an email and password. In order to solve these and some other security concerns, we leveraged the tag buffer, browser cookies, and a client-side nonce. The browser, desktop client, and server work together to verify the user’s identify. It works like this:
- The server generates a meta-installer that is tagged with two bits of information: a new, unique, one-time user token that identifies the logged in user and the browser that was used to download the installer. The token is associated with a user, can only ever be used to link a computer once, and is valid for a small time window.
- When the installer runs, it pings the server with the user token to check whether the token is still valid and unused, and sends over a client-generated nonce. The server now associates the client nonce and the user token with each other.
- After Dropbox is installed, it launches the same browser that was used to initiate the download (remember, we embedded this information in the meta-installer) and directs the user to a page on www.dropbox.com with the user token and the client nonce. This page has some security restrictions in place to make sure it wasn’t embedded in an iframe or linked to by another site. The page POSTs the user’s token and nonce to the server over SSL.
- The server cross checks the token and nonce provided with the user’s cookie. The token must be valid, unused, and must be linked to the same user as the one identified by the browser’s cookie. If everything looks good, the server instructs the Dropbox desktop client to log in as the user, and everything appears magical.
By leveraging the browser cookie, we ensure that the desktop client only logs in if the same user who downloaded it is logged in on the browser for that particular machine. You can’t force somebody else to log in to your Dropbox by sending them your binary. Browser tag validation and cross-checking it with the client nonce helps prevent attacks where the flow is interrupted and continued on another device. We also have a few other security and reporting measures implemented to ensure that things work smoothly. If any of our conditions isn’t met, we abort auto sign-in and ask the user to log in with an email and password. Additionally, for users who have enabled 2-factor authentication or use SSO to log in, we don’t do auto sign-in.
By the end of this flow, the user has a fully functioning, logged in, and ready-to-use desktop version of Dropbox up and running. They never had to type in their email or password. Even though there’s a lot going on behind the scenes, to the user everything appears so smooth that it’s expected. When we first tested this flow in user studies, almost nobody noticed that anything unusual had happened. That’s when we felt we had done our job well. To the user, it just works. The flow is now live for Android, iOS, Mac, and Windows.
If you’re excited about building product-driven technology, come join us!