AI Server #4 - NVIDIA DGX H100 AI Appliance
The NVIDIA DGX H100 is a cutting-edge AI appliance designed to meet the demands of high-performance computing (HPC) and artificial intelligence (AI) workloads. Here’s an overview of its specifications, features, and applications.
What was the goal
The NVIDIA DGX H100 AI Appliance was developed to address the growing demands of large language models and other AI workloads. The DGX H100 is equipped with eight NVIDIA H100 Tensor Core GPUs, which deliver substantially higher performance than the previous generation.
Overall evaluation
The DGX H100 leverages the Hopper architecture, which is optimized for training and inference of AI models. It includes features such as the Transformer Engine for accelerating generative AI models and support for FP8 precision, allowing faster computation with reduced memory usage compared to higher-precision formats. This architecture is particularly suited to demanding applications in fields such as healthcare, finance, and scientific research.
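To make the FP8 saving concrete, here is a minimal back-of-envelope sketch in Python. The 70B parameter count is an illustrative assumption, not a DGX H100 figure; the point is only the relative weight-memory footprint per precision.

```python
# Back-of-envelope memory footprint for model weights at different precisions.
# The 70B parameter count is an illustrative assumption.
BYTES_PER_PARAM = {"FP32": 4, "FP16/BF16": 2, "FP8": 1}

params = 70e9  # hypothetical 70B-parameter model

for fmt, nbytes in BYTES_PER_PARAM.items():
    gib = params * nbytes / 1024**3
    print(f"{fmt:>10}: {gib:7.1f} GiB of weights")

# FP8 halves weight memory relative to FP16 and quarters it relative to FP32,
# which is the saving the Transformer Engine exploits during training/inference.
```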
Specification
| Component | Specification |
| --- | --- |
| Architecture | Hopper |
| GPU | 8x NVIDIA H100 Tensor Core GPUs |
| GPU memory | 640GB total (8x 80GB) |
| Performance | 32 petaFLOPS FP8 |
| NVIDIA NVSwitch | 4x |
| System power usage | 10.2kW max |
| CPU | Dual Intel® Xeon® Platinum 8480C processors; 112 cores total; 2.00 GHz (base), 3.80 GHz (max boost) |
| System memory | 2TB |
| Networking | 4x OSFP ports serving 8x single-port NVIDIA ConnectX-7 VPI (up to 400Gb/s InfiniBand/Ethernet); 2x dual-port QSFP112 NVIDIA ConnectX-7 VPI (up to 400Gb/s InfiniBand/Ethernet) |
| Management network | 10Gb/s onboard NIC with RJ45; 100Gb/s Ethernet NIC; host baseboard management controller (BMC) with RJ45 |
| Storage | OS: 2x 1.92TB NVMe M.2; internal: 8x 3.84TB NVMe U.2 |
| OS | Ubuntu Linux |
| Weight | 287.6lbs (130.45kg); 376lbs (170.45kg) packaged |
| Dimensions | 14.0in (356mm) H x 19.0in (482.2mm) W x 35.3in (897.1mm) L |
| Operating temperature | 5–30°C (41–86°F) |
Key takeaways
- High performance: The DGX H100 excels in benchmark tests, showing substantial gains over previous generations; for example, a 46.6x speedup over the original DGX-1. In specific tasks such as natural language processing with BERT, it achieved a 17% improvement in per-accelerator performance, highlighting its efficiency when processing large datasets.
- Networking: Equipped with NVIDIA ConnectX-7 adapters supporting up to 400Gb/s NDR InfiniBand, the DGX H100 supports high-speed data transfer and low-latency communication between nodes. This is crucial for distributed training across multiple systems, significantly speeding up model training (a minimal multi-node training setup is sketched after this list).
- Power consumption: Compared to earlier models such as the DGX A100, the H100 offers greater processing power and efficiency. While the A100 also features eight GPUs, the H100's architecture delivers better performance per watt, approximately 15 times more efficient than the original DGX-1.
- Memory size: The DGX H100 is equipped with eight NVIDIA H100 Tensor Core GPUs, providing 640GB of GPU memory in total (80GB of HBM3 per GPU), which is crucial for handling the large datasets and complex models typical of AI workloads. This high-bandwidth memory allows faster data access and improved performance in memory-intensive tasks (a rough model-sizing sketch follows this list).
- Memory bandwidth: Aggregate GPU memory bandwidth across the system reaches up to 24 TB/s. For inter-GPU communication, fourth-generation NVIDIA NVLink provides 900 GB/s of bidirectional bandwidth per GPU, and the four NVIDIA NVSwitches further enhance this fabric, allowing efficient data transfer and processing across the system.
- Storage capabilities: The storage configuration includes two 1.92TB NVMe M.2 SSDs for the operating system and eight 3.84TB NVMe U.2 SSDs for data caching. This provides roughly 30TB of internal data-cache capacity, enabling rapid access to the large datasets required for AI model training. The use of NVMe technology ensures read/write speeds significantly faster than traditional SATA or SAS drives.
- Power usage: The DGX H100 has a maximum power consumption of 10.2kW, a demand that requires careful planning for the data center environment where the appliance will be deployed. Input specifications: 200–240V AC input voltage, 50/60Hz, 16A. The system uses four separate power inputs, each connected to an internal power supply rated at 3,300W; this configuration distributes the load evenly across the system, improving reliability during intensive computational tasks. While maximum consumption can reach 10.2kW, typical operational draw ranges between 6.5kW and 8.5kW depending on workload intensity, so the system may draw significantly less power during lighter tasks (a quick sanity check of these figures is sketched below).
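As referenced in the networking takeaway above, here is a minimal sketch of a multi-node data-parallel job using PyTorch's NCCL backend, which rides on InfiniBand transports like the ConnectX-7 fabric in the DGX H100. The model, launch parameters, and cluster layout are placeholders, not a prescribed DGX configuration.

```python
# Minimal multi-node data-parallel setup with PyTorch's NCCL backend.
# Launch with torchrun, e.g.:
#   torchrun --nnodes=2 --nproc_per_node=8 this_script.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # NCCL uses InfiniBand when available
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda()   # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])  # gradients sync over NVLink/IB

    x = torch.randn(32, 1024, device="cuda")
    loss = model(x).sum()
    loss.backward()                              # all-reduce happens here
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```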
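And the model-sizing sketch referenced in the memory takeaway: a weights-only estimate of what fits in 640GB. The parameter counts are illustrative; activations, optimizer state, and KV cache are ignored, so real training capacity is considerably lower.

```python
# Rough check of which model sizes fit in the DGX H100's 640 GB of GPU memory.
# Weights-only estimate; illustrative parameter counts.
TOTAL_GPU_MEM_GB = 8 * 80  # 8x H100, 80 GB each

def weights_gb(params_billion: float, bytes_per_param: int) -> float:
    return params_billion * 1e9 * bytes_per_param / 1e9

for params_b in (7, 70, 175, 530):
    for fmt, nbytes in (("FP16", 2), ("FP8", 1)):
        need = weights_gb(params_b, nbytes)
        fits = "fits" if need <= TOTAL_GPU_MEM_GB else "does not fit"
        print(f"{params_b:>4}B @ {fmt}: {need:7.0f} GB -> {fits}")
```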
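Finally, the power sanity check mentioned in the power-usage takeaway: the four 3,300W supplies versus the 10.2kW maximum, plus a rough monthly energy estimate at a typical draw. The electricity price is an illustrative assumption.

```python
# Sanity-check the power figures and estimate typical monthly energy use.
PSUS = 4
PSU_RATING_W = 3300
MAX_DRAW_W = 10200
TYPICAL_DRAW_W = (6500 + 8500) / 2   # midpoint of the 6.5-8.5 kW range

total_supply_w = PSUS * PSU_RATING_W
print(f"Installed PSU capacity: {total_supply_w/1000:.1f} kW "
      f"(headroom over max draw: {(total_supply_w - MAX_DRAW_W)/1000:.1f} kW)")

hours_per_month = 24 * 30
kwh = TYPICAL_DRAW_W / 1000 * hours_per_month
price_per_kwh = 0.15                  # assumed $/kWh; varies by facility
print(f"Typical month: {kwh:,.0f} kWh, about ${kwh * price_per_kwh:,.0f}")
```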
More information
- AI server - 2 NVIDIA RTX 3090
- AI server 2 - 4 NVIDIA RTX 3090
- AI server 3
- AI server NVIDIA DGX H100
- NVIDIA DGX B200 AI Appliance