Auto-correcting a slightly off-pitch recording?
Suppose I have a recording of a live concert of a known band. We can assume that the concert was played using instruments that were in tune, in standard tuning. This was in the 1990s, so the concert was recorded to an analog cassette tape (or a DAT master) then distributed to collectors via tape trading. This means that by the time a recording has reached me, it has undergone a few generations of copying from one tape deck to another (dubbing on a double-deck player was typically frowned upon) and there is no guarantee that both tape decks ran at precisely the same speed when copying - quite the contrary was often the case - and the error could accumulate with each generation. This means that the tape in my hands is almost certainly off pitch by a certain degree. Now that I've converted the tape to a digital file I want to repair the error.
My typical method for doing this is to find a song I know how to play, pick up a tuned instrument and play along, tweaking the playback speed until the song matches my instrument. When I find the correct speed change ratio I apply it to the audio file and save the result.
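For reference, the speed change ratio and the pitch error in semitones are two views of the same number, because a tape running fast or slow scales every frequency by the same factor. A minimal sketch of the conversion, assuming equal temperament (the function names are only illustrative, not from any particular tool):

```python
import math


def speed_ratio(semitones_sharp: float) -> float:
    """Playback-speed factor that undoes a pitch error given in semitones.

    Positive input means the recording is sharp, so the factor comes out
    below 1 (slow the audio down); negative input means flat.
    """
    return 2.0 ** (-semitones_sharp / 12.0)


def semitones_from_ratio(ratio: float) -> float:
    """Inverse: the pitch shift in semitones caused by a given speed factor."""
    return 12.0 * math.log2(ratio)
```

A recording that is 0.2 semitones sharp would need a factor of roughly 0.9885, i.e. playback a little over 1% slower.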
This is great for fixing one or two recordings, not so much when I have a hundred of them.
I think it's theoretically possible for an algorithm to analyze a recording and tell whether it's off pitch, assuming standard tuning. Suppose my recording is sharp by 0.2 semitones: the algorithm should be able to suggest a correction of -1.2, -0.2, +0.8 or +1.8 semitones (obviously it can't guess which direction is correct, nor how many whole semitones we are from the original pitch). Googling for pitch correction software invariably leads me to auto-correct plugins for vocal tracks, which is really not what I'm looking for.
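The idea in the paragraph above can be sketched in a few lines, assuming you already have a list of spectral peak frequencies from somewhere. The `peaks` input, the A4 = 440 Hz reference and all function names are assumptions for illustration. Each peak's distance from the nearest equal-tempered note wraps around at half a semitone, so the sketch averages the deviations on a circle rather than arithmetically:

```python
import math

A4_HZ = 440.0  # assumption: equal temperament with A4 = 440 Hz


def cents_from_nearest_note(freq_hz):
    """Signed distance in cents from the nearest equal-tempered note."""
    semitones = 12.0 * math.log2(freq_hz / A4_HZ)
    return 100.0 * (semitones - round(semitones))


def tuning_offset_cents(peaks):
    """Weighted circular mean of the deviations.

    `peaks` is an iterable of (freq_hz, weight) pairs; the deviations live
    on a 100-cent circle, so +49 and -49 cents are treated as neighbours.
    """
    x = y = 0.0
    for freq_hz, weight in peaks:
        angle = 2.0 * math.pi * cents_from_nearest_note(freq_hz) / 100.0
        x += weight * math.cos(angle)
        y += weight * math.sin(angle)
    return 100.0 * math.atan2(y, x) / (2.0 * math.pi)


def candidate_corrections(offset_cents):
    """Corrections in semitones; whole-semitone ambiguity is kept on purpose."""
    base = -offset_cents / 100.0
    return [base + k for k in (-1, 0, 1, 2)]
```

For peaks that all sit 20 cents sharp, `candidate_corrections` gives approximately -1.2, -0.2, +0.8 and +1.8, matching the example in the question. One caveat: overtones of in-tune notes do not all land on the equal-tempered grid (the 3rd harmonic is about 2 cents sharp of it, the 5th about 14 cents flat), so it probably helps to weight strong low peaks more heavily than weak high ones.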
Using my fairly strong C++ but very limited understanding of the mathematics of digital signal processing, I tried to write a program that parses the output of sox song.wav -n stat -freq (which performs a DFT on the audio and outputs the results), finding the dominant frequencies and checking whether they match the frequencies for standard notes or deviate from them by a fixed ratio, but I was unable to extract meaningful results - perhaps because the output from sox is rather coarse. So here I am asking whether a tool exists that already implements such an algorithm, or any tips on doing this myself (would I need to perform the DFT using some software library or is the data from sox sufficient? etc), or whether this is a far more complex problem than I imagine, not solvable using simple heuristics.
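If coarse frequency bins are the obstacle, one alternative to parsing a full DFT dump is to measure energy only at the frequencies you care about. The sketch below is not what sox does. It swaps in the Goertzel recurrence, which evaluates the spectrum at a single arbitrary frequency, and tries every detune of the equal-tempered note grid to see which one collects the most energy. The note range, step size and function names are assumptions, and `samples` is assumed to be mono audio already loaded as a list of floats:

```python
import math

A4_HZ = 440.0  # assumption: standard tuning, A4 = 440 Hz


def goertzel_power(samples, freq_hz, rate_hz):
    """Squared DFT magnitude of `samples` at one arbitrary frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / rate_hz)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def estimate_detune_cents(samples, rate_hz,
                          midi_notes=range(57, 82), step_cents=5):
    """Return the detune in [-50, 50) cents whose shifted note grid
    picks up the most energy from the signal."""
    best_cents, best_energy = 0, -1.0
    for cents in range(-50, 50, step_cents):
        shift = 2.0 ** (cents / 1200.0)
        energy = sum(
            goertzel_power(
                samples, A4_HZ * 2.0 ** ((n - 69) / 12.0) * shift, rate_hz)
            for n in midi_notes
        )
        if energy > best_energy:
            best_cents, best_energy = cents, energy
    return best_cents
```

Pure Python is slow for this, so on real material it would make sense to run it on short excerpts (or port the loop to C++). Real recordings also have vibrato, drums and tape drift, so comparing the estimate across many excerpts is likely more trustworthy than a single number.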
This is not a full answer. It is perhaps a frame challenge, but it's also just a 'talking-point' kick off that may eventually lead to an answer to a question not yet voiced… bear with me…
Even before we get to fixing the pitch on a digital copy, I'd have several issues to address first… generational loss is absolutely not restricted to speed; azimuth loss is going to be massive, as are EQ 'changes' multiplied & divided.
So… if all you have is a multi-gen copy, the first thing you should do is 'sacrifice' a good cassette player to the cause.
You need a cassette deck you can clip the front face from, so you can reach the azimuth screw as it's playing. You then buy a low-mass tweaker to do this. You cannot do it with a metal screwdriver.
Tweak the azimuth by ear whilst playing the cassette.
This is not a perfect solution - but we're so far beyond perfect solutions at this point.
Next step is to get the case off the cassette deck, find the motor & measure the resistance across the 'speed pot' it has in it - these always have a tiny manual pot to set the speed, often right inside the motor-housing itself. They're bendy-metal clips, so the actual disassembly doesn't require much tech.
You can do this with just a multi-meter, screwdriver & soldering iron.
Measure the resistance across the entire pot, then at the current speed setting. Guess at values that will give you a better spread somewhere within this range, so you can buffer the pot to cover just that spread [bear with me again].
Say your pot is 200 Ω & at current speed you're at 70 Ω; call it a 20 Ω spread.
You can mess with the original pot to see what change produces a tone or two… factor this in.
Buy a new 20 Ω high-spec pot & resistors of 60 Ω & 120 Ω [you may need to add in series to achieve your final values].
Replace your pot with the new one - attractively wired out to the back panel, then buffer it with your make-up gains.
You now have about a tone or two with high confidence & better stability.
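The arithmetic behind those example values can be sketched as follows. It assumes the new small pot is centred on the measured setting and that the three parts in series should add up to the original pot's total. That reading fits the 60/20/120 figures above, but it is an interpretation, and the function name is made up:

```python
def speed_trim_network(total_ohms, setting_ohms, spread_ohms):
    """Split one pot into fixed resistor + small pot + fixed resistor.

    The small pot is centred on the measured setting, and the three parts
    in series sum to the original total resistance.
    """
    low_fixed = setting_ohms - spread_ohms / 2
    high_fixed = total_ohms - low_fixed - spread_ohms
    return low_fixed, spread_ohms, high_fixed
```

With a 200 Ω pot measured at 70 Ω and a 20 Ω spread, this gives 60 Ω, a 20 Ω pot and 120 Ω, so the adjustable range runs from 60 Ω to 80 Ω around the original setting.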
Play the tape, fix the azimuth & tweak the speed before you even record it.
If by now you're thinking this is completely & utterly mental, I have a 'Blue Peter' cassette deck exactly like this done for a very very similar project.
Did you know the buggers vari-speed for about the first hour of constant playback too? Well, now you do :P Start any session by playing all of both sides of a 'waste' 60min cassette, til it settles down… then clean the heads [again].
I sacrificed a NAD & a Nakamichi to do this. The Nakamichi was a better machine, overall, but the NAD turned out to behave better after the mods.
oh… & switch off Dolby. You're so far beyond it being of any assistance to accuracy that you ought to just take all the high end you can get, from source.
BTW, there is a piece of software that will take a lot of the grind out of this task… it costs 00 … yup. Celemony [of Melodyne fame] Capstan
Alternatively… check the record company didn't just officially release the bootlegs - so many of them have been now; they reckon they may as well do it officially & take their cut.
It would be rather sad to go to all this effort then discover it's already on iTunes, in vastly superior quality. [ref: most of the old 1970s Bowie bootlegs]