Facebook and HP Show Different Visions for Web-scale
We recently had a chance to take another look at two sharply contrasting views from HP and Facebook on how to build web-scale and cloud infrastructure, both announced at the recent OCP annual event in California.
From HP come its new CloudLine systems, the public face of its joint venture with Foxconn. Early details released by HP show a line of cost-optimized servers descended from a conventional engineering lineage and incorporating selected bits of OCP technology to reduce costs. Stripped of the announcement verbiage, these are minimalist rack servers designed to compete with white-box vendors such as Quanta, Supermicro and a host of others. Available in five models ranging from the minimally featured CL1100 up through larger nodes designed for high I/O, big data and compute-intensive workloads, these systems will allow large installations to add capacity at costs 10 to 25% lower than the equivalent capacity in HP’s standard ProLiant product line. While the strategic implications of HP having to share IP and market presence with Foxconn are still unclear, it is a measure of HP’s adaptability that it was willing to execute on this arrangement to protect against inroads from emerging competition in the most rapidly growing segment of the server market, and one where it has probably been under immense margin pressure.
The complementary Open Compute vision from Facebook, somewhat reminiscent of HP’s iconoclastic Moonshot server, is the new “Yosemite” SOC modular computer design. Yosemite is a modular chassis designed to accommodate a range of SOCs and complying with the OCP common slot specification, which will allow the mixing of different server SOCs in the same chassis[i]. Yosemite is an interesting example of moderate-density, elegant modular packaging, intended to maximize throughput per rack and per dollar for light to medium web workloads. With a TDP of 90W per module, a Yosemite module will easily accommodate one of the new Xeon D-1500 SOCs (Intel was listed as a joint contributor for Yosemite) along with memory and either SSD or rotating storage, while connectivity and power are shared among the four modules in a chassis.
For infrastructure architects, these two visions of dense and cost-optimized infrastructures clearly show a major bifurcation in systems thinking about building at scale. It’s not just about aggregate compute power, since all of the available options allow for packing more than 1000 cores per rack at different power and performance levels per core. It’s more about a holistic approach to infrastructure engineering, informed by accurate workload characterization. HP’s approach of conventional cost-reduced architecture mirrors the mainstream, largely two-socket (2S) OCP and ODM approach, which requires little deviation from conventional architectures extending from the server workload through the server itself and up through the storage and data center networking layers, and there is immense accumulated tribal knowledge in the industry about how to design such infrastructures. The alternative suggested by Yosemite and, paradoxically, by the entirely proprietary Moonshot is one that requires a more granular approach to cost and workload analysis, involving as it does a larger number of lower-powered server nodes, but potentially offering more flexibility in end-to-end design and optimization.
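To make the "more granular approach to cost and workload analysis" a little more concrete, here is a minimal Python sketch of the kind of per-rack arithmetic an architect might start from. Everything in it is an assumption for illustration: the `RackOption` class, the `cheapest_per_core` helper, and every number are hypothetical placeholders, not vendor data. The only figure taken from the discussion above is the 90W per-module power for the SOC case. A real analysis would also weigh per-core performance, memory, storage and networking against the actual workload.

```python
from dataclasses import dataclass


@dataclass
class RackOption:
    """One way of filling a rack. All field values used below are hypothetical."""
    name: str
    nodes_per_rack: int
    cores_per_node: int
    watts_per_node: float
    cost_per_node: float  # arbitrary currency units

    @property
    def cores_per_rack(self) -> int:
        return self.nodes_per_rack * self.cores_per_node

    @property
    def watts_per_core(self) -> float:
        return self.watts_per_node / self.cores_per_node

    @property
    def cost_per_core(self) -> float:
        return self.cost_per_node / self.cores_per_node


def cheapest_per_core(options, min_cores_per_rack=1000):
    """Return the lowest cost-per-core option that meets the core floor, or None."""
    eligible = [o for o in options if o.cores_per_rack >= min_cores_per_rack]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.cost_per_core)


# Placeholder inputs, NOT measured or published figures.
two_socket = RackOption("2S-rack-server", nodes_per_rack=40, cores_per_node=36,
                        watts_per_node=400.0, cost_per_node=6000.0)
soc_module = RackOption("soc-module", nodes_per_rack=192, cores_per_node=8,
                        watts_per_node=90.0, cost_per_node=1200.0)
options = [two_socket, soc_module]

for opt in options:
    print(f"{opt.name}: {opt.cores_per_rack} cores/rack, "
          f"{opt.watts_per_core:.2f} W/core, {opt.cost_per_core:.2f} cost/core")

best = cheapest_per_core(options)
print("Lowest cost per core above the floor:", best.name if best else "none")
```

With these made-up inputs both options clear 1000 cores per rack, which is the point made above: the interesting differences show up in the per-core power and cost columns, and in how well each node size matches the workload, rather than in raw core count.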
Architecture at scale is clearly a fruitful arena for further research, and I’d love to hear from anyone who is using or contemplating either OCP or Moonshot implementations at scale.
[i] My initial thoughts on mixing different architectures in a system were “not in a thousand years”, but as I look at the options for specialized accelerators – FPGA, DSP, GPU – the notion of configuring customized racks and modules for specific workloads becomes a lot more attractive. Depending on the available interconnects, the ability to compose at a rack level makes a lot of sense for large workloads.