GIGABYTE and NVIDIA have long partnered to develop NVIDIA-Certified Systems for GPU computing use cases such as artificial intelligence (AI), high-performance computing (HPC), virtual desktop infrastructure (VDI), edge computing, 5G, render farms, professional graphics processing and more. To address this multitude of use cases, GIGABYTE offers the largest portfolio of GPU compute server solutions on the market, designed with modularity and configurability in mind.
The solutions come with optimized air cooling and are prepared for direct liquid cooling (DLC) and immersion cooling (in partnership with Asperitas, CoolIT, GRC, Submer and many others). The portfolio continues to expand as the latest computing technologies from leading CPU/GPU manufacturers enter the market, all aiming for the highest compute density, performance and energy efficiency.
Among the various certified systems, the following are of particular interest for this article: the G292-Z20, R282-Z96 and G492-ZD2, along with immersion cooling systems.
G292-Z20 – The densest GPU computing platform
Based on the AMD EPYC 7002 / 7003 CPU architecture, the G292-Z20 has a single CPU socket and relies on the high core count of AMD EPYC processors (up to 64 cores) to drive up to 8 NVIDIA GPU cards (PCIe form factor, dual-slot or single-slot). A unified memory space (a single NUMA domain) spanning the CPU, system memory, GPUs and network devices provides the highest computing performance with the lowest latency in data movement. Whether in a bare-metal configuration or under virtualization, the G292-Z20 can ensure an optimal distribution of computing resources.
The G292-Z20 comes with 8x PCIe Gen4 slots for NVIDIA GPUs, 1x CPU socket for AMD EPYC, 8x DDR4 3200MHz DIMM slots, 8x hot-swap drive bays (2 bays support NVMe PCIe Gen3 and 6 bays support SATA), and 2x PCIe Gen4 expansion slots for additional devices such as Fibre Channel HBAs, storage cards and NVIDIA SmartNICs, which accelerate data transfer between nodes and clusters and enable GPUDirect RDMA. These compact, GPU-centric computing solutions are of particular interest to HPC users working with artificial intelligence, molecular simulation, genomic sequencing, weather prediction and other use cases.
The G292-Z20 is also prepared for immersion cooling, a topic this article returns to at the end.
R282-Z96 – A versatile, general-purpose GPU computing platform
The R282-Z96 comes with dual CPU sockets for AMD EPYC 7002 / 7003 processors (up to 64 cores per socket), support for up to 3 NVIDIA GPU cards (PCIe form factor, dual-slot or single-slot), and extended options for PCIe add-in card configuration.
The 32 built-in DIMM slots provide up to 4TB of DDR4 ECC memory (or up to 8TB using 3DS LRDIMM modules). For local storage, the R282-Z96 has one M.2 storage slot and 12 hot-swap 3.5″/2.5″ SATA/SAS HDD/SSD drive bays. There is also an optional NVMe kit for integrating U.2 NVMe PCIe Gen4 drives.
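As a rough sanity check on the memory figures above, the module size each configuration implies can be derived by dividing total capacity by slot count. The sketch below assumes binary units (1TB = 1024GB) and identical modules in all 32 slots. The resulting module sizes follow from that arithmetic and are not stated in this article, and `per_dimm_gb` is a hypothetical helper name.

```python
# Back-of-the-envelope check of the R282-Z96 memory figures quoted above.
# Assumptions (not from the article): 1 TB = 1024 GB, and every slot
# holds a module of the same size.

DIMM_SLOTS = 32  # slot count given in the article


def per_dimm_gb(total_tb: float, slots: int = DIMM_SLOTS) -> float:
    """Return the module size in GB needed to reach total_tb across all slots."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    return total_tb * 1024 / slots


if __name__ == "__main__":
    for label, total_tb in (("standard DDR4 ECC", 4), ("3DS LRDIMM", 8)):
        size = per_dimm_gb(total_tb)
        print(f"{label}: {total_tb} TB over {DIMM_SLOTS} slots "
              f"implies {size:.0f} GB per module")
```

A partially populated system scales the same way: pass the number of populated slots as `slots` to see what total a given module size yields.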
Most importantly, the R282-Z96 system design provides a balanced NUMA architecture across the two CPU domains: system memory, local storage and PCIe slots are evenly distributed, which optimizes performance and reduces bottlenecks in demanding workloads.
The R282-Z96 is therefore an ideal solution for VDI and HPC. For example, two NVIDIA GPU cards such as the A16 and A40 can be used for low-, mid- and high-end virtual desktops and virtual applications. The NVIDIA A30 and A100 can be used for containerization in AI development and for molecular analysis, particle simulation, genomic sequencing, weather prediction and other HPC workloads that require balanced CPU-GPU computing resources.
G492-ZD2 – The most powerful GPU system with NVIDIA A100 SXM4 and NVLink
The G492-ZD2 is among GIGABYTE's best-selling models: the system is based on 8x NVIDIA A100 SXM4 GPUs and 2x AMD EPYC CPU sockets, and it can take up to 10x NVIDIA SmartNICs to speed up data transfer between nodes and clusters and to enable GPUDirect RDMA. Certified for RHEL and VMware, the G492-ZD2 is also suited to providing the maximum number of Multi-Instance GPU (MIG) sessions for AI developers who run workloads in various containerized environments and need custom algorithms, libraries and datasets to run in isolated user environments.
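To put a number on "maximum MIG sessions", the upper bound for one system can be estimated as GPUs multiplied by the per-GPU instance limit. The sketch below assumes a limit of 7 GPU instances per A100, a figure taken from NVIDIA's MIG documentation rather than from this article. The names are hypothetical, and the count actually achievable depends on the MIG profiles chosen.

```python
# Upper bound on MIG instances for one 8-GPU G492-ZD2.
# Assumption (from NVIDIA's MIG documentation, not this article): an A100
# can be partitioned into at most 7 GPU instances. Larger MIG profiles
# reduce the number of instances that fit on each GPU.

GPUS_PER_SYSTEM = 8    # from the article
MAX_MIG_PER_A100 = 7   # assumption, see above


def max_mig_instances(gpus: int = GPUS_PER_SYSTEM,
                      per_gpu: int = MAX_MIG_PER_A100) -> int:
    """Return the largest number of isolated MIG instances across all GPUs."""
    if gpus < 0 or per_gpu < 0:
        raise ValueError("counts must be non-negative")
    return gpus * per_gpu


if __name__ == "__main__":
    print(f"Up to {max_mig_instances()} MIG instances per system")
```

Under these assumptions, a single system could host up to 56 isolated sessions when every GPU is split into its smallest instances.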
The system uses a new cooling solution that dedicates a cooling chamber to the NVIDIA GPUs and to the SmartNICs installed in the PCIe expansion slots, ensuring the highest possible airflow for high-performance components. The system actually consists of two separate parts: a 3U GPU sled that sits on top of a 1U server housing the CPUs, system memory, storage bays and front PCIe slots. The 3U GPU sled can be swapped out easily during system maintenance, which matters given the complex on-board interconnects that link all GPU modules to the 1U server.
The choice of NVIDIA A100 SXM4 modules in the G492-ZD2 matters because the new NVIDIA Magnum IO GPUDirect technologies favor faster throughput while offloading work from the CPU to achieve performance gains. The G492-ZD2 supports NVIDIA GPUDirect RDMA for direct data exchange between GPUs and third-party devices such as NICs or storage adapters. It also supports GPUDirect Storage, which provides a direct data path from storage to GPU memory while offloading the CPU, resulting in higher bandwidth and lower latency.
State-of-the-art HPC cooling: Liquid-cooled and immersion-cooled servers
At GIGABYTE, we are seeing a drastic increase in demand for direct liquid cooling (DLC) and immersion cooling (mainly single-phase) compared to the pre-COVID era. The demand comes mainly from data center operators and cloud service providers (CSPs) who are concerned about the continuous increase in computing power and the resulting heat output of computing components, CPUs and GPUs in particular.
We assist data centers and customers with design analysis, power consumption, heat dissipation, space optimization, power usage effectiveness (PUE) and water usage effectiveness (WUE), among many other technical topics, at every step of solution design.
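For readers less familiar with the two efficiency metrics mentioned above, here is a minimal sketch of how they are commonly computed: PUE as total facility energy divided by IT equipment energy, and WUE as site water use in liters divided by IT equipment energy in kWh. The function names and sample figures are hypothetical and serve only as illustration. They are not GIGABYTE measurements.

```python
# Minimal sketch of the two data center efficiency metrics.
# PUE = total facility energy / IT equipment energy (dimensionless, >= 1.0)
# WUE = site water use in liters / IT equipment energy in kWh (L/kWh)
# All sample values below are hypothetical.


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness; 1.0 would mean zero cooling/power overhead."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def wue(site_water_liters: float, it_equipment_kwh: float) -> float:
    """Water usage effectiveness in liters per kWh of IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return site_water_liters / it_equipment_kwh


if __name__ == "__main__":
    # Hypothetical annual figures for one facility.
    it_kwh, facility_kwh, water_l = 1_000_000, 1_200_000, 1_800_000
    print(f"PUE: {pue(facility_kwh, it_kwh):.2f}")
    print(f"WUE: {wue(water_l, it_kwh):.2f} L/kWh")
```

Lower is better for both metrics, which is why cooling approaches that cut fan and chiller energy show up directly in PUE.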
Taking it a step further, GIGABYTE also offers installation and implementation services, working with data center infrastructure companies to ensure that clients get smooth project delivery and a fast turnaround to operational readiness. Most importantly, GIGABYTE strongly advises its customers to take advantage of its Proof-of-Concept (PoC) resources to validate each design solution and set of project parameters before deciding, as many environmental factors can alter expected performance and system stability. GIGABYTE has PoC units (both single-phase and two-phase immersion cooling) for testing and validating immersion-cooled servers. Server models come in 1U/2U/4U form factors and can be modified on demand to fit different use cases and workloads. GIGABYTE works with all the major liquid cooling and immersion cooling technology partners on the market so that customers can rely on the design compatibility of GIGABYTE's total solutions with their infrastructure.
Conclusion
Looking beyond today's HPC technology to Q4 2022, 2023 and later, GIGABYTE is poised to launch next-generation GPU computing solutions in partnership with NVIDIA. GIGABYTE will continue to address various use cases by adapting its system designs to real-world workflows and data center architectures.
Check out the current GIGABYTE – NVIDIA promotional campaign.