Archives — Distributed Readings

Airbnb ↗

Scaling Airbnb’s identity graph with a unified knowledge graph infrastructure

Airbnb needed to scale its identity graph infrastructure to resolve user identities efficiently and understand relationships between entities across its platform.

databases distributed-systems

5 min

Google ↗

Build Long-running AI agents that pause, resume, and never lose context with ADK

Production-grade AI agents must maintain context and state across long-running enterprise workflows that span days or weeks, without losing information during idle periods or server restarts.

api-design distributed-systems

5 min

Cloudflare ↗

Our billing pipeline was suddenly slow. The culprit was a hidden bottleneck in ClickHouse

A partitioning change to a petabyte-scale ClickHouse cluster caused billing pipeline jobs to stall without obvious error signals in standard metrics.

databases observability

4 min

Meta ↗

Reel Friends: Building Social Discovery that Scales to Billions

Building a social discovery system that efficiently surfaces Reels watched and reacted to by friends while scaling to billions of users.

caching distributed-systems

5 min

Netflix ↗

Stop Answering the Same Question Twice: Interval-Aware Caching for Druid at Netflix Scale

At massive scale (10+ trillion rows, 15M events/second), repeated identical queries were consuming excessive resources and degrading latency.

caching databases

5 min

Airbnb ↗

Skipper: Building Airbnb’s embedded workflow engine

How to build a durable workflow execution engine that can recover from failures mid-process without losing state or duplicating work.

distributed-systems databases

5 min

AWS ↗

Real-time analytics: Oldcastle integrates Infor with Amazon Aurora and Amazon Quick Sight

Oldcastle needed to overcome the limitations of traditional ERP reporting to enable real-time analytics and dashboards for their Infor ERP system.

databases real-time-systems

5 min

Cloudflare ↗

Agents that remember: introducing Agent Memory

AI agents lack persistent memory mechanisms to retain context, learn from interactions, and improve decision-making over time.

storage-systems ml-systems

3 min

Cloudflare ↗

Deploy Postgres and MySQL databases with PlanetScale + Workers

Enabling serverless applications to connect to managed relational databases without managing infrastructure or dealing with connection pooling complexities.

databases api-design

3 min

LinkedIn ↗

Driving data enhancement & recruitment success with LinkedIn’s unified integrations

LinkedIn's recruitment platform needed richer data signals to improve candidate matching and recruiter success rates.

search databases

3 min

Airbnb ↗

Academic Publications & Airbnb Tech: 2025 Year in Review

Airbnb needed to advance its AI, data science, and machine learning capabilities across multiple domains (NLP, optimization, measurement science) to improve its travel and living platform. This required solutions to challenges in search ranking, recommendation, experimentation, and large-scale data processing.

ml-systems search

5 min

Airbnb ↗

From Static Rate Limiting to Adaptive Traffic Management in Airbnb’s Key-Value Store

Airbnb's multi-tenant key-value store (Mussel) used static rate limiting that couldn't adapt to varying traffic patterns and spikes, risking degraded performance and reliability for all tenants during surges.

rate-limiting distributed-systems

5 min

Netflix ↗

Automating RDS Postgres to Aurora Postgres Migration

Netflix's relational database ecosystem lacked standardization, with databases spread across RDS Postgres and other technologies, leading to inconsistent functionality, suboptimal performance, and higher total cost of ownership.

databases distributed-systems

5 min

Netflix ↗

Scaling Global Storytelling: Modernizing Localization Analytics at Netflix

Netflix's localization analytics infrastructure (tracking dubbing, subtitling, and translation across hundreds of languages and regions) could not keep pace with the rapidly growing scale of global content, making it difficult to derive timely insights for content localization decisions.

databases distributed-systems

5 min

Netflix ↗

The AI Evolution of Graph Search at Netflix

Netflix's Graph Search platform for federated enterprise data required users to write structured queries, limiting accessibility and ease of use despite the system being scalable and configurable.

search ml-systems

5 min