Planning GPU deployment in virtualized environments – Part 2

In my previous post, I explained some planning considerations for implementing GPUs in a virtualized environment. There are a few other things to consider, especially in terms of hardware. This article covers some of them.

Blade hardware

When planning to implement GPUs in existing hardware, make sure the current hardware actually supports the installation of a GPU. If your current environment consists of blade servers (like Cisco UCS B-series or HP BL series), for example, it cannot be upgraded with a GPU. This means that you have to invest in new hardware.

Hardware Compatibility

If you’re planning on implementing vGPU, the first place you should check is the hardware compatibility list for virtual GPU devices. Note that this list contains hardware which has been certified for vGPU on XenServer; it does not mean that other hardware will not work.

PCIe

If you’re buying new hardware to host your GPU-enabled virtual desktop environment on, be sure to investigate the supported configurations. As stated before, Cisco UCS B-series (blades) do not allow PCIe (graphics) cards to be installed. This means that you would probably be looking at Cisco C-series rack servers. Depending on your load per server, the exact server choice could differ.

To find out whether or not the GPU will fit in the hardware you selected, first take a look at the GPU specifications. If you’re planning on implementing vGPU functionality, you will be looking at NVIDIA GRID cards. The NVIDIA GRID K1 and K2 cards have the following specifications and dimensions with regard to PCIe:

PCIe slot: x16
Board width: Dual slot
Board length: About 3/4-length (10.5 in.)
Board height: Full height (4.4 in.)

Looking at the Cisco UCS C240 M3 specifications, you can see that it contains 5 PCIe slots (numbers 2, 3 and 10 on the image):

[Images: Cisco C240 M3; Cisco C240 M3 back]

However, this does not mean that 5 NVIDIA GRID GPUs can be installed in this server. The GRID boards are dual-slot and full-height, so slot PCIe 4, for example, can’t be used: it is half-height and half-length. Besides the dimensions, PCIe 4 is an x8 slot, while the GRID board needs x16. PCIe 1 can’t be used either, since the GRID board is dual-slot and would not fit there, and PCIe 3 is half-length, while a GRID board needs about 3/4-length at least. That leaves PCIe 2 and PCIe 5 as the available slots for the GRID board (check the spec sheet for full info).
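The elimination above can be written down as a small check. This is only a sketch in Python: the `Card` and `Slot` classes and the function names are made up for illustration, the GRID values come from the table earlier in this article, and the per-slot values are placeholders chosen to match the reasoning above (the lane widths for PCIe 1 and 3 and the maximum lengths in inches are assumptions, not taken from the Cisco spec sheet).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    lanes: int          # required PCIe link width (16 means x16)
    length_in: float    # board length in inches
    full_height: bool   # True if the board needs a full-height slot
    dual_slot: bool     # True if the board is two slots wide


@dataclass(frozen=True)
class Slot:
    number: int
    lanes: int
    max_length_in: float
    full_height: bool
    room_for_dual_slot: bool


def rejection_reasons(card, slot):
    """Return every reason the card does not fit the slot (empty list = fits)."""
    reasons = []
    if slot.lanes < card.lanes:
        reasons.append(f"slot is x{slot.lanes}, card needs x{card.lanes}")
    if card.full_height and not slot.full_height:
        reasons.append("slot is half-height")
    if slot.max_length_in < card.length_in:
        reasons.append("slot is too short")
    if card.dual_slot and not slot.room_for_dual_slot:
        reasons.append("no room for a dual-slot board")
    return reasons


def usable_slots(card, slots):
    """Slot numbers in which the card fits, in the order the slots are listed."""
    return [s.number for s in slots if not rejection_reasons(card, s)]


# GRID K1/K2 values from the specification table in this article.
GRID_K2 = Card("GRID K2", lanes=16, length_in=10.5, full_height=True, dual_slot=True)

# ASSUMED slot data for the C240 M3: placeholders that mirror the text above.
# Verify every value against the vendor spec sheet before relying on it.
C240_M3_SLOTS = [
    Slot(1, lanes=8, max_length_in=10.5, full_height=True, room_for_dual_slot=False),
    Slot(2, lanes=16, max_length_in=10.5, full_height=True, room_for_dual_slot=True),
    Slot(3, lanes=8, max_length_in=6.6, full_height=True, room_for_dual_slot=False),
    Slot(4, lanes=8, max_length_in=6.6, full_height=False, room_for_dual_slot=False),
    Slot(5, lanes=16, max_length_in=10.5, full_height=True, room_for_dual_slot=True),
]

if __name__ == "__main__":
    print("Usable slots:", usable_slots(GRID_K2, C240_M3_SLOTS))
    for slot in C240_M3_SLOTS:
        for reason in rejection_reasons(GRID_K2, slot):
            print(f"PCIe {slot.number}: {reason}")
```

With these placeholder values the check leaves PCIe 2 and PCIe 5, the same result as the manual walk through the slots. Swapping in the real numbers for another server lets you repeat the exercise without rereading every slot description.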

Now if you think “Great! Hook me up with a couple of these C240s and throw in 2 GRID K2s and I’m good to go”, you’re right. But there’s one caveat: what if you would like a Cisco VIC 1285 card installed as well? You can’t install 2 GRID boards (K1 or K2) AND a Cisco VIC in this server, so you would need to figure out which configuration is best for you.

The next example I’m using is the Cisco C260 M2:

[Image: Cisco UCS C260 M2]

This rack server has a total of 6 PCIe slots. If you’ve checked the GPU hardware compatibility list, you will have noticed that the list does not contain the C260. The reason is pretty simple: it has only one standard-profile PCIe slot, and that slot can’t fit the GRID board, since it doesn’t have room for dual-slot cards (besides, it is only half-length).

My last example is the Cisco UCS C460 M4:

[Image: Cisco UCS C460 M4]

This is a 4-unit rack server that contains a whopping 10 PCIe slots! So, if the GRID boards are dual-slot, I could install 5 boards, right? Unfortunately not, since only 3 of those PCIe slots are x16. This server allows only 2 GRID boards (in PCIe slots 2 and 7; check the specs). Again, there’s a caveat: it will only allow 2 GRID boards if you install all 4 CPUs in this server, because the riser card for PCIe slot 2 is controlled by CPU 1 and PCIe slot 7 is controlled by CPU 3.
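The CPU dependency can be sketched the same way. The slot-to-CPU mapping below is the one described above for the C460 M4; the function name and the idea of passing in a set of installed CPU numbers are my own illustration, not something from Cisco’s tooling. It only models which riser is driven by which CPU, so the vendor’s supported CPU population rules still apply.

```python
# Slot-to-CPU mapping for the GRID-capable slots of the C460 M4,
# as described in this article: slot 2 hangs off CPU 1, slot 7 off CPU 3.
GRID_SLOT_CPU = {2: 1, 7: 3}


def active_grid_slots(installed_cpus):
    """Return the GRID-capable slots whose controlling CPU is installed."""
    return sorted(
        slot for slot, cpu in GRID_SLOT_CPU.items() if cpu in installed_cpus
    )


if __name__ == "__main__":
    print(active_grid_slots({1, 2}))        # two CPUs installed -> [2]
    print(active_grid_slots({1, 2, 3, 4}))  # all four CPUs installed -> [2, 7]
```

In other words, a half-populated server leaves one of the two GRID slots without a CPU behind it, which is why the number of CPUs belongs in the GPU sizing discussion.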

Conclusion

While there are a lot of servers out there which allow installation of a GPU, the number of GPUs they can hold differs greatly, as does the way you have to configure the hardware. You will need to investigate the available options. Luckily, most vendors (like Cisco) offer a useful configuration tool which allows you to configure the server correctly. Be sure to check the GPU Passthrough HCL or vGPU HCL for XenServer to narrow down the possible vendors and servers.

I’m focussing on Cisco hardware in this article, but everything explained applies to other hardware as well (like the HP ProLiant DL380 or Dell R720). I hope this article was useful to you. If you have any questions or remarks, leave a comment or feel free to send me an email.
