How to use a Local AI model (LLM in GGUF format)

Harnessing the power of local AI models in GGUF format with Ozeki AI Studio allows every organization to take advantage of Large Language Model technology. This article outlines the steps needed to start using an LLM served by your own computer.

Using a Local AI model (Quick steps)

  1. Download an AI model from Hugging Face
  2. Place the .GGUF file inside C:\AIModels
  3. Set up the AI model in Ozeki AI Studio
  4. Create a Chat Bot with your AI model
  5. Test the Chat Bot with prompts

Step 1 - Download an AI model from the web

Thousands of AI models can be found on the web. You can download them from huggingface.co. You need to download a model in GGUF format. Make sure to choose a model that suits your hardware capacity. If you run the AI on a general-purpose CPU, a model with 3B parameters is a good choice. If you use a GPU, you can go for larger models. On an Nvidia GeForce RTX 3090, for example, an LLM with 7 billion (7B) or 8 billion (8B) parameters will return responses in less than 2 seconds.
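A quick back-of-the-envelope calculation helps with this sizing decision: a quantized GGUF model needs roughly (parameters × bits-per-weight ÷ 8) bytes of RAM or VRAM, plus some overhead for the context and runtime buffers. The sketch below illustrates the estimate; the 20% overhead ratio is an assumption for illustration, not an Ozeki specification:

```python
def gguf_memory_gb(params_billion: float, bits_per_weight: float,
                   overhead: float = 0.2) -> float:
    """Rough memory estimate for a quantized GGUF model: weight storage
    plus a fudge factor for KV cache and buffers (the ratio is an assumption)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9  # decimal gigabytes

# A 7B model quantized to ~4.5 bits per weight (typical of Q4_K_M files):
print(f"{gguf_memory_gb(7, 4.5):.1f} GB")  # 4.7 GB
# A 3B model at the same quantization fits comfortably in 8 GB of RAM:
print(f"{gguf_memory_gb(3, 4.5):.1f} GB")  # 2.0 GB
```

If the estimate exceeds your free RAM (CPU) or VRAM (GPU), pick a smaller model or a stronger quantization.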

This webpage has a video that shows how you can download GGUF models from Hugging Face:

How to download AI models in GGUF format:
http://inside.ozeki.hu/p_8456-download-ai-models-from-huggingface.html

If you save your downloaded .GGUF model files into the following directory, Ozeki AI Studio will find them immediately:

C:\AIModels

To start with, the AI models released by Facebook (Meta) or by the French company Mistral will provide great results. We recommend searching for "meta-llama" on Hugging Face.
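If you prefer scripting the download instead of using the browser, Hugging Face serves repository files from predictable `resolve` URLs. A minimal Python sketch, assuming a standard Hugging Face model repository layout (the repo and file names in the usage note are examples, not recommendations):

```python
import urllib.request
from pathlib import Path

def hf_gguf_url(repo_id: str, filename: str) -> str:
    """Build the direct-download URL Hugging Face uses for files in a model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"

def download_model(repo_id: str, filename: str,
                   dest_dir: str = r"C:\AIModels") -> Path:
    """Download a GGUF file into the folder Ozeki AI Studio scans automatically."""
    dest = Path(dest_dir) / filename
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(hf_gguf_url(repo_id, filename), dest)
    return dest

# Usage (substitute the model you actually chose; this repo/file pair is
# only an example and the download is several gigabytes):
#   download_model("TheBloke/Llama-2-7B-Chat-GGUF", "llama-2-7b-chat.Q4_K_M.gguf")
```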

Copy model file to C:\AIModels folder
Figure 1 - Copy model file to C:\AIModels folder

Step 2 - Set up the AI model in Ozeki AI studio

Once the model is downloaded, the next step is to set it up in Ozeki AI studio. Click on AI studio in the Ozeki desktop (Figure 2), then select "AI models" in the toolbar and click on "Create new AI model" (Figure 3).

Open AI studio app
Figure 2 - Open AI studio app

This will bring up a form where you can select an AI model type. Select GGUF from the list. (Note that this type list might change as Ozeki releases new Ozeki AI versions.) After GGUF is selected you will be presented with the model's configuration form (Figure 4).

Note: If you run the system on an NVidia GPU, go to the HW tab on this form, and increase the GPU layers parameter to 130.

For NVidia GPUs you must also install the NVidia CUDA toolkit, and you must set the AI execution model to NVidia in the settings.
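Before switching the execution model to NVidia, it is worth confirming that the NVidia driver is actually visible on the machine. A small sketch that simply looks for the `nvidia-smi` tool shipped with NVidia drivers (note this only checks driver visibility, not the CUDA toolkit version):

```python
import shutil
import subprocess

def cuda_driver_visible() -> bool:
    """Return True if the NVidia driver tools respond, False otherwise.
    Checks driver visibility only; it does not verify the CUDA toolkit install."""
    exe = shutil.which("nvidia-smi")  # bundled with NVidia display drivers
    if exe is None:
        return False
    try:
        return subprocess.run([exe], capture_output=True, timeout=10).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False

print("NVidia driver visible:", cuda_driver_visible())
```

If this prints False, install or repair the driver and CUDA toolkit before enabling GPU offload.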

Create new GGUF model
Figure 3 - Create new GGUF model

The model name can be typed in manually or it can be selected from the drop-down list. The drop-down list will include the files it finds in the C:\AIModels directory. This is why we recommend saving your model files to this location.
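Conceptually, the drop-down just enumerates the GGUF files in that folder, much like this sketch does (the folder-scanning logic here is an illustration, not Ozeki's actual code):

```python
from pathlib import Path

def list_gguf_models(folder: str = r"C:\AIModels") -> list[str]:
    """Return the .gguf file names found in the given folder, similar to what
    the model drop-down shows (case-insensitive match, sorted for stable display)."""
    root = Path(folder)
    if not root.is_dir():
        return []  # folder missing: nothing to offer in the list
    return sorted(p.name for p in root.iterdir()
                  if p.is_file() and p.suffix.lower() == ".gguf")

print(list_gguf_models())
```

An empty result usually means the model file was saved somewhere else or with a different extension.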

Select model file
Figure 4 - Select model file

Step 3 - Create a Chat Bot with your AI model

Once your model is set up and ready, all you have to do is configure it in an AI chatbot. In the Ozeki system, an AI chatbot is responsible for managing chat sessions with multiple users and maintaining an independent chat history for each user. Multiple chatbots can use the same AI model, which is great because you don't have to load the model into memory (RAM) multiple times, and you don't have to operate a different GPU for each chatbot. This is one of the strengths of Ozeki.
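Ozeki's internals are not public, but the pattern described above, one loaded model shared by several chatbots that each keep an independent per-user history, can be sketched like this (all class and method names here are illustrative, not Ozeki APIs):

```python
from collections import defaultdict

class SharedModel:
    """Stands in for one GGUF model loaded into RAM/VRAM exactly once."""
    def __init__(self, name: str):
        self.name = name

    def generate(self, history: list[str], prompt: str) -> str:
        # A real model would run inference over history + prompt;
        # this stub just echoes so the pattern stays runnable.
        return f"[{self.name}] reply to: {prompt}"

class ChatBot:
    """Each bot shares the model but keeps an independent history per user."""
    def __init__(self, model: SharedModel):
        self.model = model
        self.histories: dict[str, list[str]] = defaultdict(list)

    def chat(self, user: str, prompt: str) -> str:
        reply = self.model.generate(self.histories[user], prompt)
        self.histories[user] += [prompt, reply]
        return reply

model = SharedModel("llama-7b.gguf")   # weights loaded once
support_bot = ChatBot(model)           # both bots reuse the same weights
sales_bot = ChatBot(model)
support_bot.chat("alice", "Hi!")
print(len(support_bot.histories["alice"]))  # 2: alice's prompt and the reply
print(len(sales_bot.histories))             # 0: sales bot history is independent
```

The point of the design is the sharing: the expensive resource (model weights) exists once, while the cheap resource (chat history) is duplicated per bot and per user.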

To configure a chatbot to use your newly downloaded AI model, select "Chatbots" in the toolbar and click "Create new Chatbot" at the top (Figure 5). In the chatbot type selector, select AI chat.

Figure 5 - Create a new chat bot

Once the chatbot type is selected, you can give your chatbot a name and tell it which AI model to use (Figure 6). You can also customize the system prompt and other settings for your bot on this configuration form.

Figure 6 - AI chat bot configuration

After hitting OK, your chatbot is ready to use. It might take a few seconds to load your model, but once it's loaded, you are ready to chat with your local LLM (Figure 7). You will love this great AI tool, and you will appreciate the privacy it gives when you put it to work.

Figure 7 - How to chat with your local LLM

Summary

In conclusion, utilizing a local AI model in GGUF format with Ozeki AI Studio offers a great solution for managing AI-driven interactions. By following the steps to download, set up, and configure your AI model, you can seamlessly integrate advanced AI capabilities into your communication systems. This setup not only enhances efficiency and response times but also ensures data privacy and security by keeping sensitive information within your local network.

With the ability to handle multiple chat sessions and maintain independent chat histories, Ozeki AI Studio provides a powerful and flexible platform for leveraging AI in various applications.

More information