Determining whether security-focused LLMs can effectively identify vulnerabilities in live production infrastructure code at scale.
Cloudflare needed to scale code review across its engineering organization, maintaining code quality and security standards without overwhelming human reviewers.
Cloudflare needed to build an internal AI engineering stack that could handle massive scale (20 million requests, 241 billion tokens) while dogfooding its own platform products.
Providing a scalable, efficient search infrastructure that allows AI agents to dynamically create search instances and perform semantic queries across uploaded documents without managing underlying indexing complexity.
AI agents lack persistent memory mechanisms to retain context, learn from interactions, and improve decision-making over time.
AI agents needed a way to interact with browsers at scale while maintaining visibility and control over automated actions, requiring higher concurrency and real-time debugging capabilities.
How to efficiently run inference for extra-large language models on edge infrastructure while maintaining low latency and high throughput across distributed Cloudflare servers.
Developers needed a unified way to access multiple AI model providers without managing separate integrations and API contracts for each one.
Building a scalable platform for deploying AI agents at the edge that can think, act, and persist state across distributed Cloudflare infrastructure.
GPU memory bandwidth constraints were limiting LLM inference efficiency across Cloudflare's distributed edge network, requiring optimization to deliver faster and cheaper inference.
Detecting sophisticated client-side security threats, such as zero-day exploits, in real time across millions of requests while minimizing false positives.
How to safely execute untrusted AI-generated code with minimal latency and resource overhead.
Organizations struggle to discover and secure AI-powered applications across their infrastructure, especially shadow AI deployments that teams spin up without central oversight, creating security blind spots.
Running large AI models for agent workloads on edge infrastructure was cost-prohibitive and required significant inference stack optimization to serve models like Kimi K2.5 efficiently at scale.
AI agents hitting Cloudflare error pages received heavyweight HTML responses that consumed excessive tokens and required brittle parsing, making automated error handling inefficient and costly.