llama.cpp

The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.

Open and start using the WebUI in your browser

Cost / License

  • Free
  • Open Source (MIT)

Platforms

  • Windows
  • Mac
  • Linux
  • Docker
  • Homebrew: brew install llama.cpp
  • Nix Package Manager: nix profile install nixpkgs#llama-cpp
  • MacPorts: sudo port install llama.cpp
  • Self-Hosted

Features

Properties

  1.  Lightweight
  2.  Privacy focused
  3.  Minimalistic

Features

  1.  Works Offline
  2.  Hardware Accelerated
  3.  No Tracking
  4.  No registration required
  5.  GPU Acceleration
  6.  AI Chatbot
  7.  AI-Powered
  8.  Support for NVIDIA CUDA acceleration
  9.  Apple Metal support

llama.cpp News & Activities

Recent activities

  • 3F1 added llama.cpp as alternative to ChatGPT
  • LocalFreedom reviewed llama.cpp  

    I like llama.cpp because it is the first and best open source application for LLMs. It is not only the core engine of GPT4All, LM Studio, Jan, Ollama and so on, but also a friendly application with a WebUI and router mode.

  • LocalFreedom added WebUI as a feature to llama.cpp
  • LocalFreedom, playermet and 3F1 liked llama.cpp
  • OrdinaryPerson added llama.cpp as alternative to Operit AI
  • bugmenot liked llama.cpp
  • bugmenot added llama.cpp as alternative to LLM Hub
  • bugmenot added llama.cpp as alternative to NexaSDK

llama.cpp information

  • Developed by

    ggml-org (Bulgaria)
  • Licensing

    Open Source (MIT) and Free product.
  • Written in

    C/C++
  • Alternatives

    27 alternatives listed
  • Supported Languages

    • English

AlternativeTo Categories

AI Tools & Services, System & Hardware

GitHub repository

  • 106,529 Stars
  • 17,356 Forks
  • 1,540 Open Issues

Our users have written 1 comment or review about llama.cpp, and it has received 4 likes.

llama.cpp was added to AlternativeTo by bugmenot.

Comments and Reviews

   
David

I like llama.cpp because it is the first and best open source application for LLMs. It is not only the core engine of GPT4All, LM Studio, Jan, Ollama and so on, but also a friendly application with a WebUI and router mode.

Featured in Lists

Wake up the NPU on your device

List by bugmenot with 46 apps

What is llama.cpp?

The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.

  • Plain C/C++ implementation without any dependencies
  • Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
  • AVX, AVX2, AVX512 and AMX support for x86 architectures
  • RVV, ZVFH, ZFH, ZICBOP and ZIHINTPAUSE support for RISC-V architectures
  • 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
  • Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads GPUs via MUSA)
  • Vulkan and SYCL backend support
  • CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity
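
To give a rough feel for why the integer quantization mentioned in the list above reduces memory use, here is a minimal Python sketch of block-wise 4-bit quantization. It is an illustration under stated assumptions, not llama.cpp's or ggml's actual storage format: the block size of 32, the symmetric scale, and every function name (quantize_block, pack_nibbles and so on) are hypothetical choices made for this example.

```python
# Simplified block-wise 4-bit quantization.
# Illustrative only: this is NOT ggml's real on-disk layout.

BLOCK_SIZE = 32  # assumed block length for this sketch


def quantize_block(values):
    """Map floats to signed 4-bit integers in [-8, 7] plus one shared float scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from the quantized integers."""
    return [scale * q for q in quants]


def pack_nibbles(quants):
    """Store two 4-bit values per byte (each offset by 8 so it fits in 0..15)."""
    assert len(quants) % 2 == 0
    return bytes(
        (quants[i] + 8) | ((quants[i + 1] + 8) << 4)
        for i in range(0, len(quants), 2)
    )


def unpack_nibbles(packed):
    """Inverse of pack_nibbles."""
    out = []
    for b in packed:
        out.append((b & 0x0F) - 8)
        out.append((b >> 4) - 8)
    return out


# Deterministic example "weights" in roughly [-1, 1).
weights = [((i * 37) % 64 - 32) / 32.0 for i in range(BLOCK_SIZE)]

scale, quants = quantize_block(weights)
packed = pack_nibbles(quants)
restored = dequantize_block(scale, unpack_nibbles(packed))
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

In this sketch a block of 32 weights occupies 16 packed bytes plus one scale value, compared with 128 bytes for the same block stored as 32-bit floats, at the cost of a rounding error of at most half the scale per weight. Real quantization schemes differ in block layout and in how scales are stored.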

The llama.cpp project is the main playground for developing new features for the ggml library.

Official Links