The Dropbox Detection and Response Team (DART) detects and mitigates information security threats to our employees, infrastructure, and customer data. DART ingests security-relevant logs for building detection, threat hunting and responding to potential incidents. Our log volume is huge, averaging tens of terabytes a day.
The problem we’re solving
Apart from building detections to track suspicious behavior and triaging incidents, we also spend large chunks of our time triaging false positive alerts and building context around individual alerts. This was time not spent hunting for attackers. As a result, any way to automate or improve triage process efficiency was appealing.
At Dropbox, we are building smart features that use machine intelligence to help reduce people’s busywork. Since introducing content suggestions, which we described in our previous blog post, we have been improving the underlying infrastructure and machine learning algorithms that power content suggestions.
One new challenge we faced during this iteration of content suggestions was the disparate types of content we wanted to support. In Dropbox, we have various kinds of content—files, folders, Google Docs, Microsoft Office documents, and our own Dropbox Paper.
Layer-7 load balancing (LB) is a core building block to scale today’s web services. In an earlier post, we introduced Bandaid, the service proxy we built in house at Dropbox that load balances the vast majority of our user requests to backend services. More balanced load distribution among backend servers is desirable because it improves the performance and reliability of our services which benefits our users. Our Traffic/Runtime team recently explored leveraging real-time load information from backend servers to make better load balancing decisions in Bandaid. In this post, we will share the experiences and results, as well as discussing load balancing in general.
Dropbox is a big user of Python. It’s our most widely used language both for backend services and the desktop client app (we are also heavy users of Go, TypeScript, and Rust). At our scale—millions of lines of Python—the dynamic typing in Python made code needlessly hard to understand and started to seriously impact productivity. To mitigate this, we have been gradually migrating our code to static type checking using mypy, likely the most popular standalone type checker for Python. (Mypy is an open source project, and the core team is employed by Dropbox.)
Dropbox has been one of the first companies to adopt Python static type checking at this scale.
Open source is not just for software. The same benefits of rapid innovation and community validation apply to hardware specifications as well. That’s why I’m happy to write that the v1.0 of the RunBMC hardware spec has been contributed to Open Compute Project (OCP). Before I get into what BMCs (baseboard management controllers) are and why modern data centers are dependent on them, let’s zoom out to what companies operating at cloud scale have learned.
Cloud software companies like Dropbox have millions, and in some cases, billions of users. When these cloud companies started building out their own data centers,
Until very recently, Dropbox had a technical strategy on mobile of sharing code between iOS and Android via C++. The idea behind this strategy was simple—write the code once in C++ instead of twice in Java and Objective C. We adopted this C++ strategy back in 2013, when our mobile engineering team was relatively small and needed to support a fast growing mobile roadmap. We needed to find a way to leverage this small team to quickly ship lots of code on both Android and iOS.
We have now completely backed off from this strategy in favor of using each platforms’ native languages (primarily Swift and Kotlin,