


How to protect music against Shazam-recognition?
During a presentation (such as a quiz), I want to play some music, but I don't want listeners to be able to identify it with sound-recognition programs (such as Shazam).

Is there any plugin, media player, equalizer effect or anything else that would stop listeners from identifying it this way, while the music stays practically the same to human ears?




3 Comments



Many popular music radio stations (Contemporary Hit Radio) pitch music up as standard practice, sometimes by anywhere from +1 to +3 percent, so I would expect the recognition to be designed to tolerate quite a bit of time or pitch error.



Keep the same tempo and try changing the key by +1.5 or +2. That should fool Shazam.



In short, as @Some_Guy suggested, making a small time dilation might work.

Shazam recognizes music by generating "sound fingerprints" from small portions of an audio clip and comparing them with the fingerprints recorded in its database. To defeat the recognition system, you have to make sure that no audio segment with the same fingerprints exists in Shazam's database. So either 1) Shazam doesn't have the piece, or any music with portions similar to yours; or 2) Shazam has the piece, but you have managed to alter it enough that Shazam is fooled.

To generate the "sound fingerprints" of an audio clip, Shazam starts by splitting it into many short segments. By "short", we mean about a hundredth or a thousandth of a second. Then, for each short segment, Shazam picks out the most significant frequency ranges in it. These frequency ranges are then encoded (or, in programming terms, hashed) and stored along with the timestamp of the segment. The hash and the timestamp together are called a sample. Thousands of samples are generated and sent to Shazam's server, which looks for a matching piece of music in its vast database of billions of samples generated from millions of pieces of music.
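The steps above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not Shazam's actual code: the frame size, the number of peaks, the band width and the helper names (`spectrum`, `fingerprint`, `tone`) are all assumptions made for the example, and a naive DFT stands in for a real spectrogram.

```python
import cmath
import math

def spectrum(frame):
    """Magnitude of each positive-frequency bin of one frame (naive DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2)]

def fingerprint(signal, frame_size=64, n_peaks=2, band_width=4):
    """Split the signal into frames and emit one (hash, timestamp) sample per frame."""
    samples = []
    starts = range(0, len(signal) - frame_size + 1, frame_size)
    for t, start in enumerate(starts):
        mags = spectrum(signal[start:start + frame_size])
        # The strongest bins are the "most significant" frequencies of this frame.
        peaks = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:n_peaks]
        # Quantise bins into bands (ranges); the tuple of bands acts as the hash.
        bands = tuple(sorted(k // band_width for k in peaks))
        samples.append((bands, t))
    return samples

def tone(bins, frame_size=64):
    """Hypothetical test signal: one frame containing a sine wave at each given bin."""
    return [sum(math.sin(2 * math.pi * b * i / frame_size) for b in bins)
            for i in range(frame_size)]

clip = tone([8, 20]) + tone([12, 30])
print(fingerprint(clip))  # [((2, 5), 0), ((3, 7), 1)]
```

Because the hash is built from bands rather than exact bins, a frame with tones at bins 9 and 21 gets the same hash `(2, 5)` as one with tones at bins 8 and 20.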

The matching process is quite straightforward. Since the most significant frequency ranges of each sample are hashed, Shazam looks for stored samples with the same hash values and checks whether they line up on the timeline with the samples being matched. A given hash value can appear in many pieces of music, but a sequence of hash values is much less likely to appear in two different pieces. The longer the sequence, the less likely a hash collision becomes.
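As a rough sketch of that lookup, assuming each track is already a list of (hash, timestamp) samples: every shared hash votes for a (track, time offset) pair, and the track whose votes pile up on a single offset wins. The names `build_index` and `best_match`, the integer hashes and the two tiny "songs" are made up for the illustration.

```python
from collections import Counter, defaultdict

def build_index(tracks):
    """Map each hash to every (track name, timestamp) where it occurs."""
    index = defaultdict(list)
    for name, samples in tracks.items():
        for h, t in samples:
            index[h].append((name, t))
    return index

def best_match(index, query):
    """Return the track with the most samples agreeing on one time offset."""
    votes = Counter()
    for h, t_query in query:
        for name, t_track in index.get(h, ()):
            # Samples from the same recording share the same offset.
            votes[(name, t_track - t_query)] += 1
    if not votes:
        return None, 0
    (name, _offset), score = votes.most_common(1)[0]
    return name, score

tracks = {
    "song_a": list(zip([3, 7, 3, 9, 4, 7, 1, 8], range(8))),
    "song_b": list(zip([7, 3, 9, 9, 2, 3, 5, 1], range(8))),
}
index = build_index(tracks)

# A clip cut from the middle of song_a, with its own timestamps starting at 0.
query = [(3, 0), (9, 1), (4, 2), (7, 3)]
print(best_match(index, query))  # ('song_a', 4)
```

Individual hashes such as 3 and 7 occur in both songs, but only song_a has all four query samples at the same offset.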

Some more facts about Shazam's music recognition algorithm:

The timeline alignment need not be perfect. It doesn't matter if some of the samples do not match. Shazam scores all candidate pieces and picks the one with the highest score as the result. As stated in their paper, they can find a match in a heavily corrupted audio clip with only about 1-2% of the samples still usable. So corrupting part of the music won't work (and would not meet the OP's requirement anyway).
Shifting the pitch of the music might work, but since the algorithm works on frequency ranges rather than exact frequencies, you would have to change the pitch by a large amount, which the audience would definitely notice. Some other filters might also be able to fool Shazam, but it is still impossible to do so with an imperceptible change.
Shazam's algorithm can pick out the individual tracks mixed together in an audio clip, so if you mix a Shazam-recognizable piece with other sound tracks, it is still detectable.
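To put numbers on the pitch-shift point, here is a small sketch that converts between percent and semitones and checks whether a shifted frequency stays inside the same range. The 50 Hz band width is purely an assumption for the illustration (the real band layout is not given here), and a peak that happens to sit near a band edge can cross it with a smaller shift.

```python
import math

def percent_to_semitones(percent):
    """A pitch change of `percent` expressed in equal-tempered semitones."""
    return 12 * math.log2(1 + percent / 100)

def semitones_to_percent(semitones):
    """A shift of `semitones` expressed as a percentage change in frequency."""
    return (2 ** (semitones / 12) - 1) * 100

def same_band(freq_hz, semitones, band_hz=50.0):
    """True if the shifted frequency falls in the same (assumed) band as the original."""
    shifted = freq_hz * 2 ** (semitones / 12)
    return int(freq_hz // band_hz) == int(shifted // band_hz)

print(round(percent_to_semitones(3), 2))   # 0.51
print(round(semitones_to_percent(2), 1))   # 12.2
print(same_band(1010.0, 0.5))              # True
print(same_band(1010.0, 2.0))              # False
```

So a +3% change is only about half a semitone, while a two-semitone shift moves every frequency by roughly 12%.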

Shazam's algorithm may seem very strong, even flawless, but there is one weakness we can exploit: its accuracy in time. A match in Shazam is determined by an exact timeline alignment. Thanks to this, Shazam can even distinguish between two versions of the same song, or tell whether a singer is faking a live performance by lip-syncing. The system is accurate down to milliseconds, so a small amount of time dilation can break it.
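A toy calculation shows why a stretch hurts an alignment-based score. It assumes, for simplicity, that every sample of the stretched clip still hashes to the same value as the original and that the i-th query sample pairs with the i-th track sample; the 10 ms tolerance, the 100 ms sample spacing and the function names are made up for the example.

```python
from collections import Counter

def stretch(times_ms, percent):
    """Slow a clip down by `percent`, moving every timestamp proportionally later."""
    return [t * (100 + percent) // 100 for t in times_ms]

def aligned_votes(track_ms, query_ms, tolerance_ms=10):
    """Count the matched samples that agree on the most common time offset."""
    offsets = Counter((t - q) // tolerance_ms for t, q in zip(track_ms, query_ms))
    return offsets.most_common(1)[0][1]

track = [100 * i for i in range(100)]          # one sample every 100 ms for 10 s

print(aligned_votes(track, track))             # 100
print(aligned_votes(track, stretch(track, 3))) # 4
```

With no stretch, all 100 samples agree on one offset. With a 3% stretch the offset drifts by 3 ms per sample, so no single offset collects more than a handful of votes.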

That being said, it is still technically possible to handle this kind of obfuscation, especially a linear one, if you are facing something other than Shazam, which is designed to recognize exactly matching music. Not to mention that Shazam's algorithm was first published in 2003. 14 years later, there are much stronger techniques for detecting musical similarity, such as machine learning.

You can read Shazam's paper to learn more about its algorithm.

