Dropbox

How low-bit inference enables efficient AI

Running AI inference at scale for products like Dropbox Dash is expensive, so compute and memory must be used efficiently to keep the product accessible to a broad user base.


Half-Quadratic Quantization of large machine learning models

Large machine learning models require significant memory and compute, making deployment and inference expensive and slow, especially in resource-constrained environments.
