AI base models
There are many well-known AI language models and multimodal systems, each designed for a range of natural language processing (NLP) tasks. Models such as LLaMA (and its successors), Mistral, Falcon, and Baichuan are advanced transformers known for strong performance across different domains. They range from smaller, efficient models like Mistral 7B to larger ones like GPT-NeoX and Grok-1. Many models focus on a specific language, e.g. Vigogne for French or Chinese LLaMA/Alpaca for Chinese. There are also multimodal models such as LLaVA, which integrate language and vision, enabling richer AI interactions beyond text. The lists below point you to the most relevant pages for these models.
Base models
- LLaMA 🦙
- LLaMA 2 🦙🦙
- LLaMA 3 🦙🦙🦙
- Mistral 7B
- Mixtral MoE
- DBRX
- Falcon
- Chinese LLaMA / Alpaca and Chinese LLaMA-2 / Alpaca-2
- Vigogne (French)
- BERT
- Koala
- Baichuan 1 & 2 + derivations
- Aquila 1 & 2
- StarCoder models
- Refact
- MPT
- BLOOM
- Yi models
- StableLM models
- DeepSeek models
- Qwen models
- PLaMo-13B
- Phi models
- GPT-2
- Orion 14B
- InternLM2
- CodeShell
- Gemma
- Mamba
- Grok-1
- Xverse
- Command-R models
- SEA-LION
- GritLM-7B + GritLM-8x7B
- OLMo
- OLMoE
- Granite models
- GPT-NeoX + Pythia
- Snowflake-Arctic MoE
- Smaug
- Poro 34B
- BitNet b1.58 models
- Flan-T5
- OpenELM models
- ChatGLM3-6B + ChatGLM4-9B
- SmolLM
- EXAONE-3.0-7.8B-Instruct
- FalconMamba models
- Jais
- Bielik-11B-v2.3
- RWKV-6
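
To make the list above a bit more concrete, here is a minimal sketch of how one of these base models could be loaded for plain text completion using the Hugging Face transformers library. The library calls are standard, but the model identifier and the generation settings are illustrative assumptions, not recommendations tied to any specific model page listed here.

```python
# Minimal sketch: plain text completion with a base model via Hugging Face transformers.
# The model id below is an assumption for illustration; any causal LM from the list
# above that is published on the Hugging Face Hub could be substituted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed identifier, swap in the model you want

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread weights across available GPU(s)/CPU
)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a short continuation. Base models simply predict the next token;
# they are not instruction-tuned chat assistants.
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Instruction-tuned or chat variants of the same families are usually a better fit for assistant-style prompts; the base checkpoints above only continue text.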
Multimodal models
- LLaVA 1.5 models, LLaVA 1.6 models
- BakLLaVA
- Obsidian
- ShareGPT4V
- MobileVLM 1.7B/3B models
- Yi-VL
- MiniCPM
- Moondream
- Bunny
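
As a rough illustration of what "integrating language and vision" means in practice, the sketch below asks a LLaVA 1.5 checkpoint a question about an image via the transformers library. The repository id, image URL, and prompt template are assumptions made for illustration; check the model's own page for the exact usage it expects.

```python
# Minimal sketch: visual question answering with a LLaVA 1.5 checkpoint.
# The model id, image URL, and prompt template are assumptions for illustration.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed Hub identifier
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Placeholder image URL; replace it with a real image of your own.
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)

# Assumed LLaVA 1.5 chat format: the <image> token marks where the picture goes.
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

The other multimodal models listed above generally follow a similar pattern (an image encoder feeding a language model), but each has its own processor and prompt conventions, so consult the individual pages.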
More information
- What tasks can AI solve better than humans?
- How to find data for training AI models
- AI and LLM Terms and Definitions
- How Large Language Models (LLMs) work
- AI Architectures