Browse past weeks of engineering reads.
Netflix's Ranker service had a video serendipity scoring feature (computing how different a title is from a user's watch history) consuming ~7.5% of total CPU per node, creating a significant performance bottleneck at their enormous scale.
Netflix needed scalable, deep machine-level understanding of every piece of content across an expanding catalog (including live events and podcasts) to power recommendations and discovery, but building separate models per content type and modality doesn't scale.
Netflix needed to spin up hundreds of containers in seconds to serve streaming traffic, but after modernizing their container runtime, they hit an unexpected performance bottleneck rooted in CPU architecture that impaired container scaling efficiency.
Netflix needed reliable orchestration for business-critical cloud operations across teams like Open Connect CDN and Live reliability, but faced operational challenges as Temporal adoption grew since 2021.