llama.cpp Alternatives

llama.cpp describes itself as follows: "The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud." It is a large language model (LLM) tool in the AI Tools & Services category. There are more than 25 alternatives to llama.cpp across a variety of platforms, including Windows, Linux, Mac, Android, and iPhone. The best llama.cpp alternative is Ollama, which is both free and open source. Other great apps like llama.cpp are GPT4All, Jan.ai, AnythingLLM, and LM Studio.
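llama.cpp itself can run as a local server: its bundled `llama-server` binary exposes an OpenAI-compatible HTTP API, which is also the interface several of the alternatives below imitate. Here is a minimal sketch of calling such a local endpoint from Python; the host, port, and model name are illustrative assumptions, not verified defaults for your setup:

```python
import json
import urllib.request

# Assumed local endpoint; check the port your llama-server instance listens on.
url = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "local-model",  # hypothetical name; many local servers ignore this field
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

# Build the POST request without sending it yet.
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending the request requires a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the request shape is the standard OpenAI one, the same snippet can be retargeted at any of the OpenAI-compatible alternatives below by changing only the URL.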


Alternatives list

  1. RWKV Chat
     2 likes

    Experience the power of RWKV models directly on your device. Completely offline, privacy-first, and efficient. No internet required.

    Cost / License

    Platforms

    • Mac
    • Windows
    • Linux
    • Android
    • iPhone
    • iPad
    • Android Tablet
     
  2. AI Playground
     2 likes

    This application provides a full suite of generative AI features for chat, code assistance, document search, image analysis, image and video generation. All features run offline and are powered by your PC’s Intel® Core™ Ultra with built-in Intel Arc GPU or Intel Arc™ dGPU...

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Windows
     
  3. AI Dev Gallery

     Learn how to add AI with local models and APIs to Windows apps. Discover AI scenarios and models such as Phi, Mistral, Stable Diffusion, Whisper, and many more to delight your users. The AI Dev Gallery is an open-source app designed to help Windows developers integrate AI...

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Windows
     
  4. AI00 RWKV Server

     AI00 RWKV Server is an inference API server for the RWKV language model, built on the web-rwkv inference engine.

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Windows
    • Mac
    • Linux
    • Rust
     
  5. Lemonade

     Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs.

    Cost / License

    Platforms

    • Windows
    • Linux
    • Docker
    • Snapcraft
    • iPhone
    • iPad
    • Self-Hosted
    • Python
     
  6. RWKV Runner
     2 likes

    This project aims to eliminate the barriers of using large language models by automating everything for you. All you need is a lightweight executable program of just a few megabytes. Additionally, this project provides an interface compatible with the OpenAI API, which means...

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Mac
    • Windows
    • Linux
    • Self-Hosted
    • Python
     
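Because RWKV Runner's interface is OpenAI-compatible, existing client code can be pointed at it unchanged, and responses follow the standard chat-completion shape. A sketch of parsing that shape (the JSON below is an illustrative mock, not real server output):

```python
import json

# Illustrative mock of an OpenAI-style chat-completion response, the shape
# an OpenAI-compatible endpoint such as RWKV Runner's is expected to return.
raw = json.dumps({
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hi there!"}}],
    "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8},
})

resp = json.loads(raw)
reply = resp["choices"][0]["message"]["content"]
tokens_used = resp["usage"]["total_tokens"]
print(reply)        # → Hi there!
print(tokens_used)  # → 8
```

This drop-in compatibility is the practical payoff the entry describes: tooling written against the OpenAI API needs only a different base URL to work with the local server.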
  7. A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes, with special optimizations for macOS Apple Silicon and enterprise deployment on OpenShift/Kubernetes.

    Cost / License

    Platforms

    • Python
    • Mac
    • Linux
    • Self-Hosted
    • Kubernetes
    • OpenShift
     
  8. Operit AI
     1 like

    📱 The first fully functional, standalone AI assistant for mobile devices with powerful tool-calling capabilities 📱

    Cost / License

    • Free
    • Open Source

    Platforms

    • Android
     
  9. FastFlowLM
     1 like

    Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for AMD NPUs.

    Cost / License

    • Free Personal
    • Open Source

    Platforms

    • Windows
    • Online
    • Self-Hosted
     
You are at page 2 of llama.cpp alternatives