AI Server #2 - Quad Nvidia RTX 3090 GPU, water cooled

What was the goal

To have an AI system with 96GB of VRAM, which allows more models to run simultaneously and permits quantization levels that give higher-quality responses. This server was built by Chodu Bhagat for AI-based deepfake video detection algorithms.
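
As a rough illustration of why the extra VRAM matters, the sketch below estimates the weight footprint of a model at different quantization levels and compares it with the 4 x 24GB pool. The 70-billion-parameter model, the bit widths and the helper name are assumptions chosen for illustration, not measurements from this server. Real usage also needs room for the KV cache and activations.

```python
# Rough weight-only VRAM estimate; ignores KV cache, activations and runtime overhead.
GPU_COUNT = 4
VRAM_PER_GPU_GB = 24  # RTX 3090


def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


total_vram_gb = GPU_COUNT * VRAM_PER_GPU_GB

# Hypothetical example: a 70B-parameter model at three precisions.
for bits in (4, 8, 16):
    size = weight_size_gb(70, bits)
    verdict = "fits" if size < total_vram_gb else "does not fit"
    print(f"{bits}-bit: ~{size:.0f} GB of weights, {verdict} in {total_vram_gb} GB")
```

By the same arithmetic, a 48GB dual-GPU pool would hold only the 4-bit variant of this hypothetical model, while 96GB also leaves room for the 8-bit one.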

Overall evaluation

This setup offers inference times similar to AI server #1 (dual Nvidia RTX 3090) when the models use layer split. When the GPUs work simultaneously in row split mode, however, performance is higher. Thanks to the amount of VRAM, it also offers higher-quality responses, because the models can run at higher precision.
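
To make the two modes concrete, here is a toy sketch in plain Python. It is not the inference engine's actual code. In layer split each GPU holds whole layers, so the GPUs take turns while a token passes through the model. In row split each weight matrix is cut into row blocks, so all GPUs work on the same layer at once. The matrix, the vector and the function names are made up for illustration.

```python
# Toy model of splitting work across 4 GPUs (lists stand in for device memory).
N_GPUS = 4


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def layer_split(layers, n_gpus):
    """Assign whole layers to GPUs in contiguous chunks; GPUs run one after another."""
    per_gpu = -(-len(layers) // n_gpus)  # ceiling division
    return [layers[i:i + per_gpu] for i in range(0, len(layers), per_gpu)]


def row_split_matvec(matrix, vector, n_gpus):
    """Cut one weight matrix into row blocks; each GPU computes part of the output."""
    per_gpu = -(-len(matrix) // n_gpus)
    blocks = [matrix[i:i + per_gpu] for i in range(0, len(matrix), per_gpu)]
    partials = [matvec(block, vector) for block in blocks]  # parallel on real hardware
    return [y for part in partials for y in part]  # gather the pieces in order


weights = [[r + c for c in range(4)] for r in range(8)]  # 8x4 toy matrix
x = [1, 2, 3, 4]

# Row split gives the same result as the unsplit product.
assert row_split_matvec(weights, x, N_GPUS) == matvec(weights, x)

# Layer split: 8 layers over 4 GPUs means 2 consecutive layers per GPU.
print(layer_split(list(range(8)), N_GPUS))
```

In the layer split picture only one GPU is busy at a time for a given token, so extra GPUs mainly add capacity. That is consistent with the similar inference times observed against the dual-GPU server.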

Figure 1 - AI server - 4xRTX3090 - Water cooled

Hardware configuration

CPU: AMD Ryzen Threadripper 3970X, 32 cores, 3.7GHz, TRX4
RAM: 256GB DDR4 3200 MT/s
Cooling: Single-loop water-cooled system, EK Waterblocks, 2x 480mm radiators + 1x 360mm radiator, 4-way quad block, 16 fans
GPU: 4x Nvidia RTX 3090 Zotac Trinity, 24GB VRAM each
Motherboard: Gigabyte TRX40 Designare
PSU: 2x Antec 1300W Signature Series (one for the GPUs, the other for the rest of the system)
Case: Corsair Obsidian Series 1000D Super-Tower Case

Key takeaways

  • Temperatures: This system generates a lot of heat. Thanks to the 3 radiators, at 100% load the CPU stays at 81 °C and the GPUs at 55 °C. At idle the GPUs sit at 27 °C and the CPU at 37 °C (ambient temperature 23 °C).
  • Water cooling: Water cooling is hard to build. This system uses rigid tubing, which must be cut and bent to the correct size, and that requires a lot of care and attention. EK fittings and water cooling components were used, and the flow is managed by a 4-way quad block (EKWB 4way quad).
  • PSU: There are two identical power supplies in the system. Using identical units is a good choice, because there is less variance in the output between them. One PSU powers the 4 GPUs, the other powers the rest of the system.
  • CPU: The AMD Ryzen Threadripper 3970X offers 64 PCIe 4.0 lanes. This is necessary to communicate with all 4 GPUs at high speed.
  • Motherboard: The Gigabyte TRX40 Designare (rev. 1.0) motherboard was chosen because it offers 4 evenly spaced PCIe slots, which allow the 4 GPUs to be inserted and connected neatly. It also provides 8 RAM slots to accommodate the 256GB of RAM.
  • RAM: This CPU offers a quad-channel memory bus. Non-ECC modules were used, which cost less than ECC ones. 8x 32GB modules give 256GB of RAM. A lot of RAM is needed for video AI tasks.
  • Case: This case was selected because it offers a lot of space for the radiators and the water cooling components. As far as I know, it has room for up to 2x 360mm + 2x 480mm radiators, 60mm thick, in push+pull. The case is very heavy: once the system is fully built, you must be very strong or have two people to lift it.
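
GPU temperatures like the ones quoted in the takeaways can be read with nvidia-smi. The sketch below assumes the Nvidia driver tools are installed and on the PATH, and polls the temperature and utilization of each GPU. The function names and the 70 °C warning threshold are hypothetical choices, not values from this build.

```python
import subprocess

# Ask nvidia-smi for one plain CSV line per GPU: index, temperature, utilization.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse nvidia-smi CSV lines such as '0, 55, 100' into dictionaries."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp_c, util_pct = (int(field) for field in line.split(","))
        stats.append({"gpu": index, "temp_c": temp_c, "util_pct": util_pct})
    return stats


def read_gpu_stats() -> list[dict]:
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


def hot_gpus(stats: list[dict], limit_c: int = 70) -> list[int]:
    """Return the indices of GPUs above the (hypothetical) warning threshold."""
    return [s["gpu"] for s in stats if s["temp_c"] > limit_c]
```

Calling read_gpu_stats() in a loop during a long inference run, and passing the result to hot_gpus(), gives a simple way to confirm that the loop keeps all four cards in the expected range.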

More information