Infrastructure at Scale

2004 - 2016 · UC Santa Cruz, UC Berkeley, UCLA

The projects in this chapter run at a scale most people cannot visualize: exabytes of storage, ten billion devices, the compute infrastructure for modern AI. They started as a dissertation, a summer project, and the frustration of PhD students tired of rewriting the same Python code. They ended up as global standards, hardware architectures, and the nervous system of the internet.


Ceph: The Dissertation That Funded an OSPO

The project: Ceph distributed storage system · Campus: UC Santa Cruz (Storage Systems Research Center) · Period: 2004 · Key figures: Sage Weil (PhD student), Scott A. Brandt (advisor)

Draft - fill in: Sage Weil’s dissertation at UCSC’s SSRC under Prof. Scott A. Brandt; the design problem (exabyte-scale distributed storage without a central lookup table, using the CRUSH algorithm to compute each object’s placement deterministically); the Linux kernel merge in 2010; Weil founding Inktank to commercialize it; Red Hat acquiring Inktank in 2014; the Linux Foundation launching the Ceph Foundation in 2018 for neutral governance; crucially: the proceeds from Inktank’s commercialization funding CROSS (Center for Research in Open Source Software) at UCSC, which became the first UC OSPO and the seed of the network this document represents.
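The idea of placement without a central lookup table can be illustrated with a much simpler stand-in. The sketch below uses rendezvous (highest-random-weight) hashing, not CRUSH itself: CRUSH also accounts for a hierarchical cluster map, device weights, and failure domains, none of which appear here. The function names `score` and `place` and the device labels are hypothetical and chosen only for illustration.

```python
import hashlib


def score(object_name: str, device: str) -> int:
    """Deterministic pseudo-random score for an (object, device) pair."""
    digest = hashlib.sha256(f"{object_name}|{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, devices: list[str], replicas: int = 3) -> list[str]:
    """Compute which devices hold an object; nothing is looked up in a table.

    Illustrative stand-in for CRUSH (rendezvous hashing), not Ceph's algorithm.
    """
    ranked = sorted(devices, key=lambda d: score(object_name, d), reverse=True)
    return ranked[:replicas]


# Hypothetical device labels for the example.
devices = [f"osd.{i}" for i in range(8)]

# Any client with the same device list computes the same answer independently.
first = place("volume-42/block-0007", devices)
second = place("volume-42/block-0007", devices)

# Removing a device that was not chosen leaves the placement unchanged.
unused = [d for d in devices if d not in first][0]
after_removal = place("volume-42/block-0007", [d for d in devices if d != unused])
```

Because every client derives the same answer from the object name and the device list alone, no directory server has to be consulted on each read or write, which is the property the design problem above calls for.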

The direct line to this document

The money from Ceph’s commercialization funded CROSS at UCSC. CROSS became the first UC OSPO. The UC OSPO Network grew from CROSS. This history was written by an organization that exists because one dissertation was released with an open license and commercialized responsibly.


RISC-V: The Summer Project in Ten Billion Devices

The project: RISC-V open instruction set architecture · Campus: UC Berkeley · Period: Summer 2010 · Key figures: Krste Asanovic, David Patterson, Andrew Waterman, Yunsup Lee

Draft - fill in: Asanovic and Patterson wanting a clean, open instruction set for their graduate students to build chips without paying ARM royalties; the three-month summer project; the decision to make it fully open and royalty-free; the formation of the RISC-V Foundation; the dramatic relocation to Switzerland in 2020 to stay neutral in US-China trade tensions (both countries were adopting RISC-V for strategic hardware independence); the current scale: 350+ member organizations, 70 countries, 10 billion+ devices shipped.

What it became

RISC-V is now a global strategic asset. Countries use it to build hardware without dependency on US or UK intellectual property. The EU has invested in RISC-V for semiconductor independence. China uses it extensively. The foundation moved to Switzerland specifically to be credibly neutral - no one country can claim it.


Named Data Networking: Rethinking the Internet

The project: Named Data Networking (NDN / NFD) · Campus: UCLA · Period: 2010 (as a Future Internet Architecture project) · Key figures: Lixia Zhang (Lead PI, Professor of Computer Science)

Draft - fill in: Lixia Zhang’s background - growing up in rural China, driving tractors during the Cultural Revolution, earning her PhD at MIT under David Clark, working at Xerox PARC, joining UCLA. Zhang was the only woman and the only student at the very first IETF meeting in 1986. She coined “middlebox,” designed RSVP (Resource Reservation Protocol), and is an inductee of the Internet Hall of Fame. NDN as a proposal to fundamentally rethink internet architecture: instead of routing by address (where data lives), route by name (what data is). The $14M NSF Future Internet Architecture project. The ongoing research and deployment work.
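Routing by name rather than by address can be sketched in a few lines. The toy forwarder below keeps a table of name prefixes and a cache of named data, answers a request from the cache when it can, and otherwise forwards it along the longest matching name prefix. It is a simplified illustration, not the NFD implementation: it omits the Pending Interest Table and everything related to signatures and security, and the class, method, and face names are hypothetical.

```python
class Forwarder:
    """Toy name-based forwarder (illustrative only, not NFD)."""

    def __init__(self) -> None:
        # Forwarding table: name prefix (as a tuple of components) -> outgoing face.
        self.fib: dict[tuple[str, ...], str] = {}
        # Content store: full data name -> cached bytes.
        self.content_store: dict[str, bytes] = {}

    @staticmethod
    def components(name: str) -> tuple[str, ...]:
        """Split '/a/b/c' into ('a', 'b', 'c')."""
        return tuple(part for part in name.split("/") if part)

    def add_route(self, prefix: str, face: str) -> None:
        self.fib[self.components(prefix)] = face

    def cache(self, name: str, data: bytes) -> None:
        self.content_store[name] = data

    def on_interest(self, name: str):
        """Handle a request for named data."""
        # 1. If the data is cached here, answer directly; its origin is irrelevant.
        if name in self.content_store:
            return ("data", self.content_store[name])
        # 2. Otherwise forward along the longest matching name prefix.
        parts = self.components(name)
        for length in range(len(parts), 0, -1):
            face = self.fib.get(parts[:length])
            if face is not None:
                return ("forward", face)
        # 3. No route for this name.
        return ("drop", None)


node = Forwarder()
node.add_route("/ucla", "face-1")
node.add_route("/ucla/cs/videos", "face-2")
node.cache("/ucla/cs/papers/ndn.pdf", b"cached bytes")

by_long_prefix = node.on_interest("/ucla/cs/videos/lecture1/seg0")
by_short_prefix = node.on_interest("/ucla/math/notes")
from_cache = node.on_interest("/ucla/cs/papers/ndn.pdf")
no_route = node.on_interest("/mit/index")
```

The request never mentions a host address. Any node that holds a copy of the named data can satisfy it, which is what distinguishes this model from address-based routing.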

Why this belongs here

NDN is not finished infrastructure - it is active research into what the internet could become. It belongs in this document because its lead PI is one of the most consequential figures in internet history, and because it represents the UC system’s ongoing contribution to internet architecture, not just applications running on top of it.


Ray: Two PhD Students and the OpenAI Stack

The project: Ray distributed computing framework · Campus: UC Berkeley RISELab · Period: 2016 · Key figures: Robert Nishihara, Philipp Moritz, Ion Stoica, Michael Jordan

Draft - fill in: The RISELab context (successor to AMPLab); Nishihara and Moritz as PhD students repeatedly rewriting Python code every time they needed to scale to more than one machine; Ray as a general-purpose distributed computing framework designed to make Python programs scale to clusters as easily as they run on a laptop; the adoption by OpenAI for their training and inference infrastructure; the Linux Foundation donation in 2025; the current scale of usage across AI and ML workloads.
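The programming model behind that goal is that an ordinary function call becomes a task that returns a future, so the same function can run serially or be fanned out to workers. The sketch below shows that pattern using only Python's standard-library `concurrent.futures` on a single machine. It is not Ray and does not distribute work across a cluster. In Ray itself the corresponding steps are decorating a function with `ray.remote` and collecting results with `ray.get`. The function names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Ordinary Python function; nothing about it is specific to parallelism."""
    return x * x


def run_serial(values: list[int]) -> list[int]:
    """Laptop version: a plain loop."""
    return [square(v) for v in values]


def run_as_tasks(values: list[int], workers: int = 4) -> list[int]:
    """Same function submitted as tasks; each submit returns a future.

    Stand-in for the task/future pattern using local threads, not Ray.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(square, v) for v in values]
        # Results are collected in submission order.
        return [f.result() for f in futures]


values = list(range(8))
serial = run_serial(values)
parallel = run_as_tasks(values)
```

The point of the pattern is that `square` is written once and does not change between the two versions. Only the way it is invoked differs.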

What it became

Ray is core infrastructure for modern AI compute. OpenAI uses it, and it is widely used for large AI training and serving workloads. The Linux Foundation took stewardship in 2025 - following the same playbook as Ceph, Spark, and RISC-V: build at a university, donate to a neutral foundation, commercialize the services.

Note: Status

Draft scaffold. Each section needs full narrative treatment - 800-1,200 words per project.