mnemophonix
A simple audio fingerprinting system that can index audio files and try to identify audio files against a previously built database.
Features
- Command line interface
- Acoustic fingerprinting
Tags
- music-fingerprint
mnemophonix information
What is mnemophonix?
From README.md:
This project was inspired by the article https://www.codeproject.com/Articles/206507/Duplicates-detector-via-audio-fingerprinting by Sergiu Ciumac that explains how to build a Shazam-like system that can index audio files and then, given an audio file, try to identify it against the database previously built.
The work done here is a simplified version of that work, written in C. It is built from scratch without any dependencies, since the main goal was to learn in detail how audio fingerprinting works. It has lots of comments and is only moderately optimized, using multithreading so that fingerprinting is not too slow, while trying to stay easy to understand. There is no attempt at storing the signatures in an optimized way, so if you want to use it at large scale, you will probably need to customize the I/O.
The canonical format that the program can process is 44100 Hz 16-bit PCM. For any input file that does not match this format, the program attempts an on-the-fly conversion with the best Swiss army knife media tool, ffmpeg, so any file containing audio (including videos) can be fingerprinted.
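To make the canonical-format check concrete, here is a minimal sketch in C of how a program might decide whether a file is already 44100 Hz 16-bit PCM or needs conversion. It is not taken from the mnemophonix sources: the function name is_canonical_wav is hypothetical, and the sketch assumes a plain RIFF/WAVE file whose "fmt " chunk immediately follows the 12-byte RIFF header. The channel count is not checked, since the text above does not specify one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian integers byte by byte, so the code does not depend on
   host endianness or on the alignment of the buffer. */
static uint16_t read_le16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper (an assumption, not the project's API).
   Returns 1 if buf starts with a RIFF/WAVE header whose "fmt " chunk comes
   first and describes 44100 Hz 16-bit PCM, and 0 otherwise. A caller could
   treat 0 as "hand this file to ffmpeg for conversion". */
int is_canonical_wav(const unsigned char *buf, size_t len) {
    if (len < 36) return 0;                          /* too short for the fields below */
    if (memcmp(buf, "RIFF", 4) != 0) return 0;       /* container magic */
    if (memcmp(buf + 8, "WAVE", 4) != 0) return 0;   /* form type */
    if (memcmp(buf + 12, "fmt ", 4) != 0) return 0;  /* assumed: fmt chunk first */

    uint16_t audio_format = read_le16(buf + 20);     /* 1 means uncompressed PCM */
    uint32_t sample_rate = read_le32(buf + 24);
    uint16_t bits_per_sample = read_le16(buf + 34);

    return audio_format == 1 && sample_rate == 44100 && bits_per_sample == 16;
}
```

A caller would read the first bytes of the input file into a buffer and pass them to this function. WAV files that place other chunks before "fmt " would be rejected by this sketch and simply take the conversion path.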
For the record, fingerprinting the 130 songs from DEFCON 20 to DEFCON 27 generates a 75 MB database. Fingerprinting a 2-hour movie produces a 16 MB signature and takes about 55 seconds on a MacBook Pro (including extracting the audio from the movie with ffmpeg). Once the database is loaded in memory, searching for an audio sample of a few seconds is almost instantaneous.
