Shesha Krishnapura, Intel Fellow and IT CTO, gave a talk on data center transformation using disaggregated server architecture at a Supermicro customer event I attended in Santa Clara. Below is the Intel D2 P3 data center in Silicon Valley, where Shesha’s team implemented its money-saving strategy. Overall, Intel’s 56 data centers at 23 sites consume 92 MW and house more than 280,645 Xeon servers with 2,128,200+ cores, more than 348 PB of digital storage and more than 499,300 network ports. The company’s largest data center consumes 31 MW in an area of 30,000 square feet.
Shesha said that four major functions drive IT data center requirements, which he abbreviated as DOME. These are Design Computing: silicon/chip design, HPC and grid computing; Office General Purpose: typical IT and internal customer services; Manufacturing Fab/ATM: manufacturing computing that supports fabrication and assembly; and Enterprise: enterprise applications supporting e-business.
Electronic Design Automation (EDA) is an important activity in designing advanced semiconductor devices at Intel. EDA workloads are compute-intensive and require many servers to complete complex simulations rapidly. Shortening the design cycle translates directly into go-to-market (GTM) competitive advantages for the company. With disaggregated servers, Intel has been able to reduce the cost of comparable EDA data center operations since 2006, as shown in the figure below. Its physical HPC operation was 60.5% cheaper than external cloud offerings.
Intel’s disaggregated server architecture incorporates processors, memory and storage. Much of the ongoing value comes from being able to replace modules in the servers with upgraded equipment without replacing the entire server. This reduces the cost of upgrades and makes them much faster than repopulating entire racks.
Supermicro, a manufacturer of servers and storage gear, provided the disaggregated servers that were at the heart of Intel’s data center savings. Its 3U MicroBlade system was an important element in Intel’s data center cost reduction. This system is shown below.
Intel deployed over 120,000 of these MicroBlade and SuperBlade disaggregated servers, all based on Intel Xeon processors. CPUs and memory are in separate modules, providing the disaggregation. The MicroBlade architecture enables independent upgrades of the compute modules without replacing the rest of the MicroBlade enclosure, including networking, storage, fans and power supplies, which are refreshed at a slower rate.
Using this disaggregated architecture, Intel data centers achieved a Power Usage Effectiveness (PUE) of 1.06, versus a PUE of 1.7 for a traditional data center. The MicroBlade systems support 14 hot-swappable server blades in 3U and 280 Intel Xeon processor-based servers in a 9-foot (60U) rack. The SuperBlade systems have 10 or 14 blades in a 6U enclosure. According to Supermicro, its disaggregated rack-scale design optimizes data center refresh cycles and delivers better overall data center performance at 45-65% lower CAPEX.
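To put the two PUE figures in perspective, here is a minimal sketch of the arithmetic. It relies on the standard definition of PUE as total facility power divided by IT equipment power. The 10 MW IT load is a hypothetical value chosen only for illustration and is not a figure from the talk.

```python
def facility_power_mw(it_load_mw: float, pue: float) -> float:
    """Total facility power implied by PUE = total facility power / IT power."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_mw * pue


def overhead_fraction(pue: float) -> float:
    """Non-IT power (cooling, power distribution, etc.) per unit of IT power."""
    return pue - 1.0


IT_LOAD_MW = 10.0  # hypothetical IT load, for illustration only

traditional = facility_power_mw(IT_LOAD_MW, 1.7)     # PUE quoted for a traditional data center
disaggregated = facility_power_mw(IT_LOAD_MW, 1.06)  # PUE quoted for Intel's data centers
saving = 1 - disaggregated / traditional

print(f"Traditional facility power: {traditional:.1f} MW")
print(f"PUE 1.06 facility power:    {disaggregated:.1f} MW")
print(f"Reduction in total power for the same IT load: {saving:.1%}")
```

Under these assumptions, overhead falls from about 70% of the IT load to about 6%, so the same IT load draws roughly 38% less total facility power.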
According to Supermicro, the MicroBlade enclosure is configured with a Chassis Management Module for unified management, integrated network switches to reduce within-rack cabling by up to 99%, and redundant 2000W Titanium Level certified digital power supplies for high energy efficiency (96%+). Up to 86% improvement in cooling fan power efficiency is achieved by sharing four cooling fans and integrated power modules across all 14 MicroBlade server blades.
The Supermicro MicroBlade ships with industry-standard IPMI 2.0 and Redfish API support, designed to lower management overhead in large-scale data centers. To maximize the utilization of physical space in the data center, the company uses 9-foot (60U) racks, each packed with 20 3U MicroBlade enclosures, for a server density of 280 Xeon processor-based servers per rack.
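The density figure follows directly from the numbers quoted above, as this short sketch shows. It assumes each blade counts as one server and that the whole rack height is filled with enclosures, with no rack units reserved for other equipment. The 42U comparison rack is a hypothetical added for contrast, not something from the talk.

```python
def servers_per_rack(rack_u: int, enclosure_u: int, blades_per_enclosure: int):
    """Return (enclosures, servers) for a rack filled entirely with enclosures."""
    enclosures = rack_u // enclosure_u  # whole enclosures that fit in the rack
    return enclosures, enclosures * blades_per_enclosure


ENCLOSURE_U = 3            # MicroBlade enclosure height quoted in the article
BLADES_PER_ENCLOSURE = 14  # hot-swappable blades per 3U enclosure

# 9-foot (60U) rack described in the article
enclosures_60u, servers_60u = servers_per_rack(60, ENCLOSURE_U, BLADES_PER_ENCLOSURE)

# Hypothetical 42U rack, for comparison only
enclosures_42u, servers_42u = servers_per_rack(42, ENCLOSURE_U, BLADES_PER_ENCLOSURE)

print(f"60U rack: {enclosures_60u} enclosures, {servers_60u} servers")
print(f"42U rack: {enclosures_42u} enclosures, {servers_42u} servers")
```

The 60U case reproduces the 20 enclosures and 280 servers per rack cited above.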
With 40% annual growth in compute, storage and networking, Intel needed new tools to control data center capital and operating costs. By using a disaggregated server architecture that allows separate upgrades to CPU, memory/storage and networking, the company was able to save on both upgrade and operating costs.
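A quick sketch of what a 40% annual growth rate implies when compounded. It assumes the rate stays constant from year to year, which is a simplification, and the five-year horizon is an arbitrary choice for illustration.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for demand to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def demand_multiple(annual_growth: float, years: int) -> float:
    """Demand after `years` of compounding, relative to today's level."""
    return (1 + annual_growth) ** years


GROWTH = 0.40  # 40% annual growth quoted in the article

print(f"Doubling time: {doubling_time_years(GROWTH):.2f} years")
print(f"Demand after 5 years: {demand_multiple(GROWTH, 5):.2f}x today")
```

At this rate demand doubles roughly every two years, which is why refresh cost per upgrade cycle matters so much.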