How to configure your system to use GPU

This page describes the steps needed to configure Ozeki AI to use your GPU. Once the configuration is done, all your local AI models will run on your GPU.

What kind of GPU should I use?

To choose a suitable GPU, there are two main parameters to look at:

  • GPU RAM (VRAM): This is the most important parameter, because it limits the size of the model you can load.
  • CUDA cores: The number of CUDA cores determines inference speed.

GPU for beginners

For beginners, or users on a limited budget, the minimum GPU we recommend is the Nvidia GeForce RTX 3090. This GPU will efficiently run an LLM with 8 billion parameters.
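To see why a 24 GB card like the RTX 3090 is a comfortable minimum for an 8-billion-parameter model, you can estimate the VRAM a model needs from its parameter count and precision. The overhead factor and quantization widths below are illustrative assumptions, not Ozeki-specific figures:

```python
# Rough VRAM estimate for loading an LLM. The 1.2 overhead factor
# (for KV cache and activations) is an assumption for illustration.

def vram_needed_gb(params_billions, bytes_per_weight, overhead=1.2):
    """Approximate VRAM in GB: weight memory times an overhead factor."""
    weights_gb = params_billions * bytes_per_weight  # 1e9 params * N bytes ~= N GB
    return weights_gb * overhead

# An 8-billion-parameter model at common precisions:
for label, width in [("FP16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"{label}: ~{vram_needed_gb(8, width):.1f} GB")
```

By this estimate an 8B model needs roughly 19 GB at FP16 and under 10 GB when quantized, so it fits on a 24 GB RTX 3090 with room to spare for the KV cache.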


Step 1 - Switch to Nvidia GPU processing on Ozeki AI studio

Figure 1 - Switch to Nvidia GPU processing architecture

Step 2 - Restart the Ozeki service

Figure 2 - Restart Ozeki after switching architecture

Step 3 - Increase the number of GPU layers

Figure 3 - GPU layer offloading
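Layer offloading means moving some of the model's transformer layers into GPU memory: the more layers you offload, the faster inference runs, but each layer consumes VRAM. A minimal sketch of the trade-off, with an assumed per-layer size and VRAM reserve (the real numbers depend on your model and quantization):

```python
# Sketch: pick how many transformer layers fit in GPU memory.
# layer_size_gb and reserve_gb are illustrative assumptions.

def layers_to_offload(total_layers, layer_size_gb, vram_gb, reserve_gb=2.0):
    """Offload as many layers as fit, keeping some VRAM in reserve
    for the KV cache and activations."""
    budget = vram_gb - reserve_gb
    if budget <= 0:
        return 0
    return min(total_layers, int(budget // layer_size_gb))

# Example: a 32-layer model, ~0.15 GB per 4-bit layer, on a 24 GB card
print(layers_to_offload(32, 0.15, 24.0))  # prints 32: all layers fit
```

If the whole model fits, set the GPU layer count to the maximum; otherwise lower it until the model loads without running out of VRAM.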

Step 4 - Check if the GPU is used during inference

Figure 4 - Check GPU usage
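Besides the Ozeki AI Studio display, you can confirm GPU usage from the command line with nvidia-smi, which ships with the Nvidia driver. This sketch queries utilization and memory use while a prompt is being processed; non-zero values confirm that inference is running on the GPU:

```python
# Query GPU utilization via nvidia-smi (installed with the Nvidia driver).
import subprocess

def parse_gpu_stats(csv_line):
    """Parse one 'utilization.gpu, memory.used' CSV line, e.g. '87, 14321'."""
    util, mem = (field.strip() for field in csv_line.split(","))
    return int(util), int(mem)

def gpu_stats():
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout.strip().splitlines()
    return [parse_gpu_stats(line) for line in out]

if __name__ == "__main__":
    for i, (util, mem) in enumerate(gpu_stats()):
        print(f"GPU {i}: {util}% utilization, {mem} MiB used")
```

Run it during inference: if utilization stays at 0% and memory use does not grow when the model loads, the model is still running on the CPU and the steps above should be rechecked.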

More information