How to use a Local AI model (LLM in GGUF format)
Harnessing the power of local AI models in GGUF format with Ozeki AI Studio allows every organization to take advantage of Large Language Model technology. This article outlines the steps needed to start using an LLM served by your own computer.
Using a Local AI model (Quick steps)
- Download an AI model from Hugging Face
- Place the .GGUF file inside C:\AIModels
- Set up the AI model in Ozeki AI Studio
- Create a Chat Bot with your AI model
- Test Chat Bot with prompts
Step 1 - Download an AI model from the web
Thousands of AI models can be found on the web. You can download them from huggingface.co. You need to download a model in GGUF format. Make sure to choose a model that suits your hardware capacity. If you run the AI on a general purpose CPU, a model with 3 billion (3B) parameters is a good choice. If you use a GPU you can go for larger models. On an Nvidia GeForce RTX 3090, for example, an LLM with 7 billion (7B) or 8 billion (8B) parameters will return responses in less than 2 seconds.
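A quick back-of-the-envelope calculation helps when matching a model to your hardware. The sketch below assumes a roughly 4-bit (Q4) quantized GGUF file, which needs on the order of 0.6 bytes per parameter; the exact figure varies by quantization method, so treat this as an estimate, not a specification:

```python
# Rough size estimate for a quantized GGUF model.
# Assumption (not from the article): ~0.6 bytes per parameter at Q4
# quantization, ignoring KV-cache and runtime overhead.

def estimate_gguf_size_gb(params_billion: float, bytes_per_param: float = 0.6) -> float:
    """Return an approximate file/memory size in GB for a quantized model."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# A 7B model at ~Q4 is about 4.2 GB, which fits comfortably
# in the 24 GB of VRAM on an RTX 3090.
print(f"7B model: ~{estimate_gguf_size_gb(7):.1f} GB")
print(f"3B model: ~{estimate_gguf_size_gb(3):.1f} GB")
```

If the estimate exceeds your GPU's VRAM, pick a smaller model or a more aggressive quantization.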
This webpage has a video that shows how you can download GGUF models from Hugging Face.
How to download AI models in GGUF format:
http://inside.ozeki.hu/p_8456-download-ai-models-from-huggingface.html
If you save your downloaded .GGUF model files into the following directory, Ozeki AI Studio will find them immediately:
C:\AIModels
To start with, the AI models released by Facebook (Meta) or by the French company Mistral provide great results. We recommend searching for "meta-llama" on Hugging Face.
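Model files on huggingface.co can also be fetched directly, since the site serves raw files under a predictable `resolve` URL. The repository and file names below are illustrative placeholders, not a specific recommendation:

```python
# Build the direct-download URL for a GGUF file hosted on huggingface.co.
# Hugging Face serves raw files at /<repo_id>/resolve/<revision>/<filename>.
# The repo and file names here are hypothetical examples.

def hf_gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Return the direct-download URL for a file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

url = hf_gguf_url("example-org/example-7b-gguf", "example-7b.Q4_K_M.gguf")
print(url)
```

You can paste such a URL into a browser or a download tool and save the file straight into C:\AIModels.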
Step 2 - Set up the AI model in Ozeki AI Studio
Once the model is downloaded, the next step is to set it up in Ozeki AI Studio. Click on AI Studio in the Ozeki desktop (Figure 2), then select "AI models" in the toolbar and click on "Create new AI model" (Figure 3).
This will bring up a form where you can select an AI model type. Select GGUF from the list. (Note that this type list might change as Ozeki releases new Ozeki AI versions.) After GGUF is selected you will be presented with the model's configuration form (Figure 4).
Note: If you run the system on an NVidia GPU, go to the HW tab on this form, and increase the GPU layers parameter to 130.
For NVidia GPUs you must also install the NVidia CUDA toolkit, and you must set the AI execution model to NVidia in the settings.
The model name can be typed in manually or selected from the drop-down list, which includes the files found in the C:\AIModels directory. This is why we recommend saving your model files to this location.
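The discovery behind that drop-down list can be illustrated with a few lines of code. This is a sketch of the idea, assuming the C:\AIModels convention described above, not Ozeki's actual implementation:

```python
# Sketch: discover GGUF model files in a folder, the way a model
# picker drop-down might. Illustration only, not Ozeki's code.
import tempfile
from pathlib import Path

def find_gguf_models(folder: str) -> list[str]:
    """Return the sorted names of .gguf files found in the given folder."""
    return sorted(p.name for p in Path(folder).glob("*.gguf"))

# Demonstrate with a temporary directory standing in for C:\AIModels.
with tempfile.TemporaryDirectory() as models_dir:
    (Path(models_dir) / "llama-3b.Q4.gguf").touch()
    (Path(models_dir) / "notes.txt").touch()
    print(find_gguf_models(models_dir))   # only the .gguf file is listed
```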
Step 3 - Create a Chat Bot with your AI model
Once your model is set up and ready, all you have to do is configure it in an AI chatbot. In the Ozeki system, an AI chatbot is responsible for managing chat sessions with multiple users and maintaining an independent chat history for each user. Multiple chatbots can use the same AI model, which is great because you don't have to load the model into memory (RAM) multiple times, and you don't have to operate a different GPU for each chatbot. This is one of the strengths of Ozeki.
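The shared-model arrangement described above can be sketched in a few lines. The class and method names here are illustrative placeholders, not Ozeki's API; the point is that one loaded model object is reused by several chatbots, while each bot keeps a separate history per user:

```python
# Sketch: several chatbots share one loaded model, with independent
# per-user chat histories. Names are hypothetical, not Ozeki's API.

class SharedModel:
    """Stands in for one GGUF model loaded into memory once."""
    def __init__(self, name: str):
        self.name = name

    def reply(self, prompt: str) -> str:
        return f"[{self.name}] echo: {prompt}"   # placeholder for real inference

class ChatBot:
    def __init__(self, model: SharedModel):
        self.model = model                        # shared, not copied
        self.histories: dict[str, list[str]] = {} # independent per-user history

    def chat(self, user: str, message: str) -> str:
        self.histories.setdefault(user, []).append(message)
        return self.model.reply(message)

model = SharedModel("llama-7b.Q4.gguf")  # loaded once
support_bot = ChatBot(model)             # both bots reuse the same model
sales_bot = ChatBot(model)

support_bot.chat("alice", "Hello!")
support_bot.chat("bob", "Hi there!")
print(support_bot.histories)   # each user has a separate history
```

Because both bots hold a reference to the same `SharedModel`, the model occupies memory only once, exactly the saving the article describes.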
To configure a chatbot to use your newly downloaded AI model, select "Chatbots" in the toolbar and click "Create new Chatbot" at the top. In the chatbot type selector, select AI chat (Figure 5).
Once the chatbot type is selected, you can give your chatbot a name, and you can tell it which AI model to use (Figure 6). You can also customize the system prompt and other settings for your bot on this configuration form.
After hitting OK, your chatbot is ready to use. It might take a few seconds for it to load your model, but once it's loaded you are ready to chat with your local LLM (Figure 7). You will also appreciate the privacy that running the model on your own hardware provides.
Summary
In conclusion, utilizing a local AI model in GGUF format with Ozeki AI Studio offers a great solution for managing AI-driven interactions. By following the steps to download, set up, and configure your AI model, you can seamlessly integrate advanced AI capabilities into your communication systems. This setup not only enhances efficiency and response times but also ensures data privacy and security by keeping sensitive information within your local network.
With the ability to handle multiple chat sessions and maintain independent chat histories, Ozeki AI Studio provides a powerful and flexible platform for leveraging AI in various applications.
More information
- How to download GGUF AI models from Huggingface
- How to use a local GGUF AI model in Ozeki AI Chat
- How to use ChatGPT in Ozeki AI Chat
- How to use vLLM AI models
- How to setup an AI pipeline
- How to use the Speech To Text model