
FastFlowLM

Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for AMD NPUs.

Cost / License

  • Free Personal
  • Open Source

Platforms

  • Windows 11
  • Online: [https://open-webui.testdrive-fastflowlm.com/](https://open-webui.testdrive-fastflowlm.com/)
  • Self-Hosted


Properties

  1.  Lightweight
  2.  Privacy focused

Features

  1.  Command line interface
  2.  Works Offline
  3.  AI Chatbot
  4.  Agentic AI
  5.  AI-Powered
  6.  AMD

Tags

  • local-ai
  • AI Agent
  • AI-agents
  • llama
  • npu
  • retrieval-augmented-generation
  • llm-inference
  • whisper-ai
  • deepseek


FastFlowLM information

  • Developed by

    FastFlowLM (United States)
  • Licensing

    Open source and free for personal use.
  • Alternatives

    12 alternatives listed
  • Supported Languages

    • English

AlternativeTo Categories

AI Tools & Services, OS & Utilities

GitHub repository

  •  696 Stars
  •  35 Forks
  •  22 Open Issues


What is FastFlowLM?

FastFlowLM (FLM) — Unlock Ryzen™ AI NPUs

Run large language models — now with Vision, Audio, Embedding and MoE support — on AMD Ryzen™ AI NPUs in minutes. No GPU required. Faster and over 10× more power-efficient. Supports context lengths up to 256k tokens. Ultra-lightweight (16 MB); installs within 20 seconds.

📦 The only out-of-the-box, NPU-first runtime built exclusively for Ryzen™ AI.
🤝 Think Ollama — but deeply optimized for NPUs.
From Idle Silicon to Instant Power — FastFlowLM Makes Ryzen™ AI Shine.

FastFlowLM (FLM) supports all Ryzen™ AI Series chips with XDNA2 NPUs (Strix, Strix Halo, and Kraken).

🧠 Local AI on NPU

FLM makes it easy to run cutting-edge LLMs (and now VLMs) locally with:

  • ⚡ Fast and low power
  • 🧰 Simple CLI and API (REST and OpenAI-compatible); see the request sketch below
  • 🔐 Fully private and offline

No model rewrites, no tuning — it just works.
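
Since the server exposes REST and OpenAI-compatible endpoints, any standard HTTP client can drive it. Below is a minimal Python sketch of a chat request against a locally running FLM server; the port (11434), endpoint path, and model tag are illustrative assumptions rather than documented defaults, so check the FastFlowLM docs for the actual values.

```python
import requests

# Assumed local endpoint: port, path, and model tag are placeholders
# chosen to mirror the Ollama-style workflow described above.
FLM_URL = "http://localhost:11434/v1/chat/completions"

resp = requests.post(
    FLM_URL,
    json={
        "model": "llama3.2:1b",  # hypothetical model tag
        "messages": [
            {"role": "user", "content": "In one sentence, what is an NPU?"}
        ],
    },
    timeout=120,
)
resp.raise_for_status()

# OpenAI-style response shape: choices[0].message.content
print(resp.json()["choices"][0]["message"]["content"])
```

The request never leaves localhost, which makes the "fully private and offline" claim straightforward to verify by disconnecting the network.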

Highlights

  • Runs fully on the AMD Ryzen™ AI NPU — no GPU or CPU load
  • Lightweight runtime (16 MB) — installs within 20 seconds, easy to integrate
  • Developer-first flow — like Ollama, but optimized for the NPU
  • Support for long context windows — up to 256k tokens (e.g., Qwen3-4B-Thinking-2507); see the streaming client sketch after this list
  • No low-level tuning required — you focus on your app, we handle the rest
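
Because the API is OpenAI-compatible, the official openai Python client should also work once pointed at the local server, which is what makes the developer flow feel like Ollama. The sketch below streams tokens from a long-context model; the base URL, the placeholder api_key, and the model tag are assumptions for illustration, not confirmed defaults.

```python
from openai import OpenAI

# A local server typically ignores the API key, but the client requires one;
# base_url and the model tag below are assumptions, not confirmed defaults.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="flm")

stream = client.chat.completions.create(
    model="qwen3:4b-thinking-2507",  # placeholder tag for Qwen3-4B-Thinking-2507
    messages=[{"role": "user", "content": "Summarize the trade-offs of NPU inference."}],
    stream=True,  # print tokens as they are generated
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```

Pointing an existing OpenAI-based app at a local base_url like this is the usual migration path, which is why no model rewrites or tuning are needed on the application side.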
