Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.
Cost / License
- Free Personal
- Open Source
Application types
Platforms
- Windows
- Online
- Self-Hosted
Ollama is described as 'Facilitates local deployment of Llama 3, Code Llama, and other language models, enabling customization and offline AI development. Perfect for creating personalized AI chatbots and writing tools' and is a very popular large language model (llm) tool in the ai tools & services category. There are more than 50 alternatives to Ollama for a variety of platforms, including Mac, Windows, Linux, Android and Web-based apps. The best Ollama alternative is DeepSeek, which is both free and Open Source. Other great apps like Ollama are Jan.ai, AnythingLLM, Ensu and Alpaca - Ollama Client.
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.
AI00 RWKV Server is an inference API server for the RWKV language model based upon the web-rwkv inference engine.
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs.




Learn how to add AI with local models and APIs to Windows apps. Discover AI scenarios and models such as Phi, Mistral, Stable Diffusion, Whisper, and many more to delight your users. The AI Dev Gallery is an open-source app designed to help Windows developers integrate AI...








Experience the power of RWKV models directly on your device. Completely offline, privacy-first, and efficient. No internet required.








This application provides a full suite of generative AI features for chat, code assistance, document search, image analysis, image and video generation. All features run offline and are powered by your PC’s Intel® Core™ Ultra with built-in Intel Arc GPU or Intel Arc™ dGPU...


This project aims to eliminate the barriers of using large language models by automating everything for you. All you need is a lightweight executable program of just a few megabytes. Additionally, this project provides an interface compatible with the OpenAI API, which means...




A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes, with special optimizations for macOS Apple Silicon and enterprise deployment on OpenShift/Kubernetes.





Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema on the model output on the generation level.
