Open source is not just for software. The same benefits of rapid innovation and community validation apply to hardware specifications as well. That’s why I’m happy to write that v1.0 of the RunBMC hardware spec has been contributed to the Open Compute Project (OCP). Before I get into what BMCs (baseboard management controllers) are and why modern data centers depend on them, let’s zoom out to what companies operating at cloud scale have learned.
Cloud software companies like Dropbox have millions, and in some cases billions, of users. When these cloud companies started building out their own data centers,
A year ago, we became the first major tech company to adopt high-density SMR (Shingled Magnetic Recording) technology for our storage drives. At the time, we faced a challenge: while SMR offers major cost savings over conventional PMR (Perpendicular Magnetic Recording) drives, SMR drives are slower to write to. We set out on a journey to reap the cost-saving benefit of SMR without giving up performance. One year later, here’s the story of how we achieved just that.
The Best Surprise Is No Surprise
When the first production machines started arriving in September,
At Dropbox, we run more than 35,000 builds and millions of automated tests every day. With so many tests, a few are bound to fail non-deterministically or “flake.” Some new code submissions are bound to break the build, which prevents developers from cutting a new release. At this scale, it’s critical we minimize the manual intervention necessary to temporarily disable flaky tests, revert build-breaking commits, and notify test owners of these issues. We built a system called Athena to manage build health and automatically keep the build green.
What we used to do
To ensure basic correctness,
Ever since we launched Magic Pocket, our in-house multi-exabyte storage system, we’ve been continuously looking for opportunities to improve efficiency, while maintaining our high standards for reliability. Last year, we pushed the limits of storage density by being the first major tech company to adopt SMR storage. In this post, we’ll discuss another advance in storage technology at Dropbox: a new cold storage tier that’s optimized for less frequently accessed data. This storage runs on the same SMR disks as our more active data, and through the same internal network.
The Lifetime of a file
The access characteristics of a file at Dropbox vary heavily over time.
Apache Kafka is a popular solution for distributed streaming and queuing for large amounts of data. It is widely adopted in the technology industry, and Dropbox is no exception. Kafka plays an important role in the data fabric of many of our critical distributed systems: data analytics, machine learning, monitoring, search, and stream processing (Cape), to name a few.
At Dropbox, Kafka clusters are managed by the Jetstream team, whose primary responsibility is to provide high-quality Kafka services. Understanding Kafka’s throughput limit in Dropbox infrastructure is crucial in making proper provisioning decisions for different use cases,
Dropbox needs its underlying network infrastructure to be reliable, high-performing, cost-effective, and truly scalable. In previous posts we described how the edge network was designed to improve user performance, and how the supporting multi-terabit backbone network spans continents to interconnect edge PoPs and multiple data centers.
In this post we describe how we evolved the Dropbox data center network from the legacy chassis-based four-post architecture to a scalable multi-tier, quad-plane fabric. We also successfully deployed our first fabric at our newest data center in California earlier this year!
Dropbox network physical footprint
We currently have a global network presence and multiple data centers in California,