SAM Audio


SAM-Audio is a foundation model for isolating any sound in audio using text, visual, or temporal prompts. It can separate specific sounds from complex audio mixtures based on natural language descriptions, visual cues from video, or time spans.


Cost / License

Free
Open Source

Origin

United States

Platforms

Online
Self-Hosted
Python




SAM Audio News & Activities


Recent News

POX published news article about SAM Audio
3 months ago
Meta launches SAM Audio, an AI model for intuitive sound segmentation and isolation
Meta has launched SAM Audio, a state-of-the-art artificial intelligence model that brings advanced ...
Dec 18, 2025

Recent activities

PredatorQ liked SAM Audio
3 months ago
POX updated SAM Audio
3 months ago
POX added SAM Audio as an alternative to Ultimate Vocal Remover GUI, moises.ai, Lalal.ai and Spleeter + 22 similar activities
3 months ago
POX added SAM Audio
3 months ago

SAM Audio information

Developed by
Meta
Licensing
Free and open source.
Written in
Python
Alternatives
27 alternatives listed
Supported Languages
- English

AlternativeTo Category

AI Tools & Services

GitHub repository

3,413 Stars
300 Forks
44 Open Issues
Updated Jan 5, 2026


SAM Audio was added to AlternativeTo by Paul on Dec 17, 2025 and this page was last updated Dec 17, 2025.


What is SAM Audio?


SAM-Audio supports three types of prompting: text, visual, and span. Each one gives you a different way to specify which sounds to isolate.

Text Prompting: Use natural language descriptions to isolate sounds.
Visual Prompting: Isolate sounds associated with specific visual objects in a video using masked video frames.
Span Prompting (Temporal Anchors): Specify time ranges where the target sound occurs or doesn't occur. This gives the model a concrete example of what to isolate.
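
To make the three prompt types more concrete, here is a minimal Python sketch of how they might be represented in client code. It does not use or reproduce the SAM-Audio API: `SeparationPrompt`, `span_labels`, the field names, and the 0.5-second step are all hypothetical and chosen only for illustration. Spans are assumed to be `(start_seconds, end_seconds, target_present)` tuples, matching the description of time ranges where the target sound does or doesn't occur.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical span format: (start_seconds, end_seconds, target_present).
Span = Tuple[float, float, bool]


@dataclass
class SeparationPrompt:
    """Illustrative container for the three prompt types (not the SAM-Audio API)."""
    text: Optional[str] = None                # natural language description
    masked_frames_path: Optional[str] = None  # assumed path to masked video frames
    spans: Optional[List[Span]] = None        # temporal anchors

    def kinds(self) -> List[str]:
        """Return the names of the prompt types that are set on this prompt."""
        kinds = []
        if self.text:
            kinds.append("text")
        if self.masked_frames_path:
            kinds.append("visual")
        if self.spans:
            kinds.append("span")
        return kinds


def span_labels(spans: List[Span], duration_s: float, step_s: float = 0.5) -> List[int]:
    """Turn spans into one label per time step.

    1 = target sound present, -1 = target sound absent, 0 = unspecified.
    """
    n_steps = int(round(duration_s / step_s))
    labels = [0] * n_steps
    for start_s, end_s, present in spans:
        if not 0 <= start_s < end_s <= duration_s:
            raise ValueError(f"invalid span: ({start_s}, {end_s})")
        first = int(start_s / step_s)
        last = min(n_steps, math.ceil(end_s / step_s))
        for i in range(first, last):
            labels[i] = 1 if present else -1
    return labels


# Example: a text prompt combined with one positive and one negative span.
prompt = SeparationPrompt(
    text="dog barking",
    spans=[(1.0, 2.0, True), (3.0, 4.0, False)],
)
labels = span_labels(prompt.spans, duration_s=4.0)
```

In this sketch a prompt can carry more than one prompt type at once; whether SAM-Audio itself accepts combined prompts is not stated here, so check the project's own documentation for the actual interface.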
