Apps tagged with 'multimodal'

All apps tagged with 'multimodal'. Use the filters below to narrow down your search.
  1. Janus
     17 likes

    Janus is a family of autoregressive models that unify multimodal tasks, surpassing specialized models through decoupled visual paths, autoregressive integration, and a flexible design.

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Self-Hosted
    • Python
    Janus screenshot 1
    Janus screenshot 2
    69 alternatives
  2. Trae
     15 likes

    Trae is an adaptive AI IDE that transforms how you work, collaborating with you to run faster.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Mac
    • Windows
    Trae screenshot 1
    Trae screenshot 2
    Trae screenshot 3
    41 alternatives
  3. Pipecat
     10 likes

    Pipecat is a framework for building voice (and multimodal) conversational agents, such as personal coaches, meeting assistants, story-telling toys for kids, customer support bots, intake flows, and snarky social companions.

    Platforms

    • Self-Hosted
    • Python
    Pipecat screenshot 1
    Pipecat screenshot 2
    Pipecat screenshot 3
    31 alternatives
  4. Nekro Agent
     2 likes

    An extensible multi-person interactive agent framework powered by LLM code generation. Supports QQ, Discord, Minecraft, Bilibili Live, SSE (SDK) ...

    Cost / License

    • Free
    • Open Source

    Platforms

    • Linux
    • Windows
    • Mac
    • Docker
    • Self-Hosted
    • Docker Hub
    • Python
    Nekro Agent screenshot 1
    Nekro Agent screenshot 2
    Nekro Agent screenshot 3
    33 alternatives
  5. Marqo
     3 likes

    Marqo is more than a vector database: it's an end-to-end vector search engine. Vector generation, storage, and retrieval are handled out of the box through a single API. No need to bring your own embeddings.

    Platforms

    • Software as a Service (SaaS)
    • Self-Hosted
    Marqo screenshot 1
    21 alternatives
  6. Gai
     2 likes

    Gai is a beginner-friendly AI toolkit with no ads, no registration, and no permissions required other than Internet access.

    Cost / License

    • Free
    • Proprietary

    Platforms

    • Windows
    • Mac
    • Linux
    • Flathub
    • Flatpak
    • iPhone
    • iPad
    • Android
    • Android Tablet
    Gai screenshot 1
    Gai screenshot 2
    17 alternatives
  7. llama.cpp
     1 like

    The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.

    Cost / License

    • Free
    • Open Source (MIT)

    Platforms

    • Windows
    • Mac
    • Linux
    • Docker
    • Homebrew
    • Nix Package Manager
    • MacPorts
    • Self-Hosted
    Open and start using the WebUI in your browser
    Add multiple text files from disk or from the clipboard to the context of your conversation
    Attach one or multiple PDFs to your conversation. By default, the contents of the PDFs will be converted to RAW text, excluding any visuals.
    Optionally, the WebUI can process the PDFs as images when the AI model supports it.
    27 alternatives
  8. Fenn
     1 like

    Fenn is a powerful, AI-driven desktop search engine for macOS that makes your files instantly searchable—including videos, audio, PDFs, Word documents, Excel sheets, and images. Just type or upload an image to find exactly where any object, person, or concept appears.

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Mac
    Fenn multimodal search
    61 alternatives
  9. Cognigy.AI
     1 like

    Cognigy.AI is the Conversational AI Platform focused on the needs of large enterprises to develop, deploy and run Conversational AIs on any conversational channel.

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Online
    • Self-Hosted
    Graphical Conversation Editor with Interaction Panel in Chrome
    > Powerful Code Editor, based on Visual Studio Code
    > Full IntelliSense for JavaScript
    > Full IntelliSense for Cognigy Input, Actions and Profiles
    > Full error handling directly in the browser
    > Persistent Contact Profiles store customer profile information
    > Can be used for personalization and more
    > GDPR compliance: profiles can be exported, deleted, or turned off
    > Rich analytics directly in Cognigy.AI
    > See metrics on flow usage and on misunderstood user inputs to optimize flows
    > Connect to external BI systems through OData
  10. OmniSVG

    OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from simple icons to intricate anime characters.

    Platforms

    • Self-Hosted
    OmniSVG screenshot 1
    OmniSVG screenshot 2
  11. Together AI

    Run and fine-tune generative AI models with easy-to-use APIs and highly scalable infrastructure. Train and deploy models at scale on our AI Acceleration Cloud and scalable GPU clusters. Optimize performance and cost.

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Online
    • Software as a Service (SaaS)
    Together AI screenshot 1
    21 alternatives
  12. Morphik

    Store, search, and query multi-modal data with fine-grained access control and built-in security. Build AI applications with confidence and speed.

    Cost / License

    • Freemium
    • Open Source

    Platforms

    • Python
    • Online
    • Self-Hosted
    Morphik screenshot 1
    3 alternatives
  13. DataChain

    DataChain builds a suite of tools for data preprocessing and management, experiment tracking, ML models versioning, and pipeline automation.

    Platforms

    • Python
    • Online
    • Software as a Service (SaaS)
    • Self-Hosted
    8 alternatives
  14. BAGEL AI

    Open-source multimodal model with 7B active parameters for tasks like text-to-image, image editing, visual manipulation, multiview synthesis, and world navigation.

    Platforms

    • Self-Hosted
    • Python
    BAGEL AI screenshot 1
    53 alternatives
  15. ANUS

    Anus (Autonomous Networked Utility System) is a powerful, flexible, and accessible open-source AI agent framework designed to revolutionize task automation. Built with modern AI technologies and best practices, Anus represents the next generation of AI agent frameworks, offering...

    Platforms

    • Windows
    • Mac
    • Linux
    ANUS screenshot 1
    31 alternatives
  16. Reka

    Reka.ai is a multimodal AI platform that builds advanced models from scratch, enabling agents that can see, hear, and reason across text, images, audio, and video—deployable anywhere from lightweight devices to enterprise systems.

    Cost / License

    • Free
    • Proprietary

    Platforms

    • Online
    Reka screenshot 1
    Reka screenshot 2
    Reka screenshot 3
    14 alternatives
  17. TEN Agent

    The TEN Framework is an open-source framework that enables developers to quickly build real-time multimodal agents (voice, video, data stream, image and text), making it easy for developers to experiment, integrate large language models, and create reusable extensions.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Self-Hosted
    • Docker