Run Large Language Models Locally

Run AI chats locally with your own data. Introduce Ozeki AI technology in your business to make your team more efficient with smart AI chat bots trained on information specific to their jobs. Set up AI for Customer Support, HR or the Marketing Department, or simply set up a service that helps answer e-mails.

Quick start

Follow the instructions in our quick start guide and set up your system in 10 minutes.

Ozeki AI Quick Start Guide

Ozeki AI Agents

With Ozeki AI you can create many local services for your business. For example, you can connect a Large Language Model (LLM) to an e-mail client, or you can configure AI to answer phone calls through a VoIP phone extension. You may also use AI to help analyse websites. Check out the AI agents Ozeki AI offers:

What is Ozeki AI

Ozeki AI is a local AI chat system that allows you to run chat bots based on Large Language Models locally. It is similar to ChatGPT, but it runs on your own computer. You can create an AI chat bot, and you can customize it to give accurate answers based on information available in your business. You can train your local AI model using information in files, databases, CRM systems or websites to generate accurate responses. Your team can use your AI chat systems from mobile devices and laptops through an installed app or through a browser (Figure 1).

Ozeki AI can be pictured as your private ChatGPT.

Figure 1 - Local LLM and AI execution

Ozeki AI can use local LLMs and act like ChatGPT. You can make it more advanced by creating an AI pipeline with multiple prompts to return even better responses. You can also connect it to online AI models, such as ChatGPT, Copilot and others.

Why is Ozeki AI better

Ozeki AI offers significant advantages over online AI services, since it runs inside your local network. It is a software product you install on your Windows computer, and you can run private AI models with it. This allows you to train your models with local information and to put relevant data into the short-term memory (called the context window) of the agent to help it give relevant responses.

The biggest problem of online AI services, such as ChatGPT or Microsoft Copilot, is that they always give the same generic answers. Ozeki AI can add relevant knowledge to the answers, thus making AI actually usable. It can do this because it runs on-premises. You can train your local AI models with company knowledge, and you can add customer-specific or job-specific details to the short-term memory (context window) when AI queries are answered. This way your knowledge and data stay private, and you will be able to return proper, high-quality responses.

Ozeki AI models

Ozeki is a multi-model AI system. It allows you to run open-source AI models locally, to use online AI services such as ChatGPT or Microsoft Copilot, and to create an AI pipeline that performs multi-stage AI evaluation and decision making.

1. Local AI models: A local AI model is an artificial intelligence system that runs directly on a user's local hardware, such as a personal computer or server, rather than relying on cloud-based services. This setup allows for greater control over data privacy and security, as all processing and data storage occur locally. Local AI models do not require an internet connection, which makes them ideal for applications where data sensitivity and offline functionality are crucial. Ozeki AI makes it possible to run local AI models on general-purpose CPUs and NVIDIA GPUs.
2. Online AI models: An online AI service is a cloud-based platform that provides artificial intelligence capabilities over the internet. These services allow users to access and utilize AI models through an online interface called an API. Online AI services can perform a wide range of tasks, such as natural language processing, image recognition, data analysis, and more. Online AI services require you to purchase an API key, and you typically pay for each query you make to the system. Ozeki AI allows you to use online AI services, such as ChatGPT, alongside local AI execution.
3. vLLM AI models: A vLLM AI model is an AI model that runs online but is private. A vLLM model can be installed on a Linux server in a private LAN or on a Virtual Private Server (VPS). It can perform similar functionality to online AI models but serves only a single entity (your organization). A vLLM can be accessed through an API similar to the one used for an online AI service. However, in this case, you don't pay for each query; your costs come from running the server, such as electricity or hosting. Ozeki AI can give you full access to vLLM AI services.
4. AI pipelines: An AI pipeline is an Ozeki AI specialty. It allows you to build a chain of AI models to break down a problem into smaller pieces and to give you a more accurate answer. You can mix local AI models, online AI models and vLLMs. The idea is to use the output of one AI prompt as input for the next. Learn how you can set up an AI pipeline with Ozeki AI.

What is an AI pipeline

Ozeki AI also gives you the ability to set up an AI pipeline to make your AI even smarter. In the next paragraph you can find out what an AI pipeline is and how you can set one up.

The Ozeki AI pipeline is a chain of prompts. It can be set up to create better results. When a question goes to the AI pipeline, prompts are executed one after another to produce a better response at the end. You can design your custom AI pipelines by writing multiple prompts and passing the output of one prompt to the next (Figure 2).
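In Ozeki AI, pipelines are configured through the product itself, but the idea above can be illustrated with a short Python sketch. The two stages below are stand-in stubs for real LLM prompt calls; the stage names and texts are made-up examples, not part of the Ozeki API:

```python
from typing import Callable, List

def run_pipeline(question: str, stages: List[Callable[[str], str]]) -> str:
    """Run each prompt stage in turn, feeding its output into the next stage."""
    text = question
    for stage in stages:
        text = stage(text)
    return text

# Stub stages standing in for real LLM prompt calls (hypothetical):
extract = lambda q: f"Key facts about: {q}"        # stage 1: gather facts
answer = lambda facts: f"Answer based on ({facts})" # stage 2: compose the reply

print(run_pipeline("What are our support hours?", [extract, answer]))
```

Because each stage only sees the previous stage's output, you can freely mix local models, online models and vLLMs at different positions in the chain.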

Figure 2 - Setup a Local AI pipeline for better answers

Ozeki AI can be used just like a standard AI model. It can be used to create smart AI chat bots, to generate AI e-mail responses, to analyse websites using AI, to talk with AI through a microphone and to conduct AI phone calls. All of this is done using your own LLM.

Use multiple channels

Create custom AI chat bots, with tailored prompts and connect them to the communication channel of your choice (Figure 3). Add customized short-term memory (prepared context windows) and train the model with local information to answer customer chats, customer e-mails and voice calls appropriately.

Figure 3 - Process information in Chat, E-mail and Voice calls
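The multi-channel setup above can be pictured as a dispatcher that hands each incoming message to the bot configured for its channel. This is a conceptual sketch, not the Ozeki API; the channel names and stub bots are made-up examples:

```python
from typing import Callable, Dict

def route(channel: str, message: str,
          bots: Dict[str, Callable[[str], str]]) -> str:
    """Hand an incoming message to the bot configured for its channel."""
    if channel not in bots:
        raise KeyError(f"no bot configured for channel {channel!r}")
    return bots[channel](message)

# Stub bots standing in for real, channel-specific chat bots (hypothetical):
bots = {
    "chat": lambda m: f"[chat bot] {m}",
    "email": lambda m: f"[email bot] draft reply to: {m}",
}

print(route("email", "Where is my order?", bots))
```

Each channel gets its own bot with its own prompt and prepared context, so an e-mail reply and a live chat answer can be tuned independently.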

Examples

  • Create a chat bot with pre-defined chat history + custom prompts to provide accurate mission specific responses
  • Train a model with data in your support chat database
  • Set up automated AI actions: create an AI chat bot that prepares suggested e-mail responses
  • Create a web analyser AI chat bot: this bot can download webpages and can answer questions based on the information found

Use local knowledge

By adding local knowledge to your local AI models, you can prepare your AI agents to support task execution better. The out-of-the-box AI models come with general knowledge at the level of a high-school graduate. With training and context window preparation, you can improve this knowledge to make the AI model useful for your organization.

One good example of context window preparation is when management designs custom chat bots for specific jobs. These chat bots can be prepared with a pre-written chat history, they can be set up with HTTP URLs containing relevant information related to the job, and preparation prompts can be configured that give prior instructions to the AI chat bot.
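The three ingredients named above (preparation prompt, reference URLs, pre-written chat history) can be sketched as one function that assembles a prepared context window. The format, the company name and the URL below are illustrative assumptions, not Ozeki's actual configuration format:

```python
from typing import List, Tuple

def prepare_context(instructions: str,
                    reference_urls: List[str],
                    history: List[Tuple[str, str]]) -> str:
    """Assemble a 'prepared' context window for a job-specific chat bot."""
    lines = [f"SYSTEM: {instructions}"]                              # preparation prompt
    lines += [f"REFERENCE: {url}" for url in reference_urls]         # job-related URLs
    lines += [f"{role.upper()}: {text}" for role, text in history]   # pre-written history
    return "\n".join(lines)

# All names and URLs below are made-up examples:
context = prepare_context(
    "You answer HR policy questions for Example Corp.",
    ["https://intranet.example.com/hr-handbook"],
    [("user", "How many vacation days do I get?"),
     ("assistant", "Full-time staff get 25 days per year.")],
)
print(context)
```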

These "prepared" chat bots can be used by employees during task execution, and they will get better results thanks to the prior knowledge added to the system by management (Figure 4).

Figure 4 - Create custom chatbots with preparation prompts

Custom bots for custom jobs

A single chat bot cannot serve all the needs of an organization. Custom tasks require different preparation, different instructions and often different AI models (Figure 5). Ozeki Chat can run multiple chat bots simultaneously. Team members can use these chatbots from their mobile devices, laptops and desktop computers. Since each chatbot appears as a contact in the Ozeki Chat contact list, a single person can communicate with multiple chat bots at the same time, as their tasks demand. For example, one chatbot can generate appropriate images, and another can generate text.

Figure 5 - Custom bots for custom jobs

Benefits of using Ozeki AI

An important benefit of Ozeki AI is that you stay independent of AI service providers, and you can run Large Language Models (LLMs) locally without the involvement of any 3rd party. Your data will stay private and safe, and you will not pass knowledge to public AI models. Although you can take advantage of public AI services with Ozeki AI, we recommend keeping most of the knowledge and AI execution in house, because this way you can maintain a competitive market advantage.

Private

Chats and AI knowledge stay private

Ozeki Chat and Ozeki AI can be set up to run locally without Internet access. You can use your private chat server and private LLM with your sensitive data stored locally.

Secure

Use your own hardware

Ozeki allows you to run LLMs on CPUs and GPUs. It runs on Windows, works with Intel or AMD CPUs, and provides great performance on NVIDIA GPUs.

Smart

Use Files, CRMs and Databases

Add knowledge from your files, databases or IT systems, such as your HR or CRM to provide meaningful engagements through your AI chatbots.

How to run Large Language Models Locally

To run AI chats locally with your own data you need to install Ozeki Chat Server. Ozeki Chat Server includes all the tools needed to create local AI chatbots. It comes with clients for iPhone, Android and Windows, and it can also be used from a web browser.

To run large language models (LLMs) locally, first download a generic open-source model from Hugging Face. Once the AI model is downloaded, set up your local AI chat bot. After an AI chat bot is created, you can connect it to answer e-mails using AI, to analyse websites, or to do other useful tasks that Ozeki AI agents are prepared for.

How to download open-source AI models from the Web

At Hugging Face, the central hub for AI model distribution, you can explore over 1000 open-source language models and take advantage of development and data training done by scientists, researchers and the open-source community. You can download models in GGUF format and save them to the C:\AIModels directory on your local hard drive. Once a model is downloaded, you can start to chat with it immediately.
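As a small sketch of the convention above, the helper below computes where a downloaded GGUF file should land so it sits in the C:\AIModels directory. The model file name used in the example is illustrative only, not a requirement:

```python
from pathlib import PureWindowsPath

# Directory the text above says downloaded models should be saved to:
MODELS_DIR = PureWindowsPath(r"C:\AIModels")

def local_model_path(filename: str) -> PureWindowsPath:
    """Return the path a downloaded GGUF model file should be saved to."""
    if not filename.endswith(".gguf"):
        raise ValueError("expected a .gguf model file")
    return MODELS_DIR / filename

# The download itself can be done in a browser from the model's Hugging Face
# page; the file name below is only an example:
print(local_model_path("llama-2-7b-chat.Q4_K_M.gguf"))
```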

Open source, free to use AI models:

Where can I find and how can I download LLM models?

What hardware do I need to run AI locally

Ozeki AI can run on any general-purpose CPU, which means you can run Ozeki AI on your Windows laptop or desktop computer. While this approach is great for evaluation and less processing-intensive tasks, it is best to equip your PC with an NVIDIA GPU if you want to run AI models locally. Read the Ozeki AI GPU Guide to learn which GPU to use.

Figure 6 - NVIDIA GPU for faster responses

If you have an NVIDIA GPU, you can execute larger, smarter models and get responses significantly faster. GPUs provide the hardware to do a large number of calculations simultaneously, which is a great help when running AI models. Ozeki AI gives full support to the latest NVIDIA technology. Install the NVIDIA CUDA toolkit and configure Ozeki AI to use your NVIDIA GPU.
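One reason GPUs speed things up is that GGUF runtimes can offload some or all of a model's layers to the GPU (llama.cpp, for instance, exposes an n_gpu_layers setting). The heuristic below is only an illustration of that idea; the 0.25 GB-per-layer figure is a made-up assumption, as real memory use depends on model size and quantization:

```python
def choose_gpu_layers(total_layers: int, vram_gb: float,
                      gb_per_layer: float = 0.25) -> int:
    """Rough heuristic: offload as many model layers to the GPU as fit in VRAM.

    The gb_per_layer default is an illustrative assumption, not a measured
    value; real memory use depends on model size and quantization.
    """
    if vram_gb <= 0:
        return 0  # CPU-only machine: keep every layer on the CPU
    return min(total_layers, int(vram_gb / gb_per_layer))

# A 7B model typically has 32 layers; under this toy heuristic an 8 GB card
# fits all of them, while a machine with no GPU offloads none:
print(choose_gpu_layers(32, 8.0))
print(choose_gpu_layers(32, 0.0))
```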

Steps to take to start using Ozeki AI

AI performance improvement

  • Switch to an NVIDIA GPU and set it up for Ozeki AI Studio