Browse past weeks of engineering reads.
Migrating business-critical load balancer configurations from on-premises hardware solutions to Google Cloud while preserving existing traffic manipulation logic.
Building a multi-tenant architecture that isolates tenants without requiring separate AWS accounts while maintaining stateful service deployments.
Browser Run needed higher usage limits, better performance, and improved reliability for its browser automation service while increasing development velocity.
Netflix needed to design a domain-independent traffic routing system for its ML model serving infrastructure that could serve personalized experiences at scale across multiple domains while maintaining high availability.
Cloudflare needed to build an internal AI engineering stack that could handle massive scale (20 million requests, 241 billion tokens) while dogfooding their own platform products.
Cloudflare needed to improve request handling performance across its global network to maintain competitive advantage over other CDNs.
How to efficiently run inference for extra-large language models on edge infrastructure while maintaining low latency and high throughput across distributed Cloudflare servers.
Developers needed a unified way to access multiple AI model providers without managing separate integrations and API contracts for each one.
Cloudflare Workflows needed to support higher concurrency and creation rate limits to enable durable background agents at scale.
Simplifying the deployment and scheduling of machine learning inference workloads across multiple instances and instance types on Amazon SageMaker HyperPod.
How to scale a global content delivery and DDoS mitigation network to handle massive throughput (500 Tbps) while maintaining capacity to protect against record-breaking attacks.
Magic Transit customers needed the ability to define and enforce custom DDoS mitigation logic for proprietary and non-standard UDP protocols without being limited to Cloudflare's pre-built detection rules.
CDN cache systems were designed for human traffic patterns but struggle with the distinct access patterns of AI bot traffic, which now represents over 10 billion requests per week and threatens cache efficiency.
Meta needed to scale their ads ranking models to LLM-scale complexity and size while maintaining inference latency requirements for real-time ad serving.
Salesforce's Cluster Autoscaler setup could not efficiently scale and manage node provisioning across its fleet of 1,000+ EKS clusters, likely because of slow scaling decisions, suboptimal bin-packing, and operational complexity at that scale.
Cloudflare's existing server fleet could not keep pace with rapidly growing global traffic demands, requiring a new generation of hardware with significantly higher compute and network throughput.