Drop-In OpenAI replacement, On-device, local-first, Generate text/image/speech/music/etc... Backend Agnostic: (llama.cpp, diffusers, bark.cpp, etc...), Optional Distributed Inference(P2P/Federated).




oMLX is described as 'LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar' and is a large language model (llm) tool in the ai tools & services category. There are more than 50 alternatives to oMLX for a variety of platforms, including Windows, Linux, Mac, Web-based and Android apps. The best oMLX alternative is DeepSeek, which is both free and Open Source. Other great apps like oMLX are Ollama, Jan.ai, Ensu and AnythingLLM.
Drop-In OpenAI replacement, On-device, local-first, Generate text/image/speech/music/etc... Backend Agnostic: (llama.cpp, diffusers, bark.cpp, etc...), Optional Distributed Inference(P2P/Federated).




NodeTool is a playground for AI that uses a visual canvas to connect different AI tools - like GPT, image creators, and video generators - into one seamless workflow. Instead of jumping between five different apps to write a script, generate an image, and turn it into a video...


The Swiss Army Knife of offline AI. Chat, speak, and generate images. Privacy first, zero internet. Download an LLM and use it on your mobile device. No data ever leaves your phone.


MLC LLM is a machine learning compiler and high-performance deployment engine for large language models. The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on everyone’s platforms.




Experience AI chat on macOS with a SwiftUI-designed client utilizing Swift, CoreML, and BERT for native performance. Enjoy privacy-focused, intuitive chats with intelligent AI responses, profile customization, and full control via editable chat history and message rewind.

A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.

AI00 RWKV Server is an inference API server for the RWKV language model based upon the web-rwkv inference engine.
Experience the power of RWKV models directly on your device. Completely offline, privacy-first, and efficient. No internet required.








Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs.




This application provides a full suite of generative AI features for chat, code assistance, document search, image analysis, image and video generation. All features run offline and are powered by your PC’s Intel® Core™ Ultra with built-in Intel Arc GPU or Intel Arc™ dGPU...





