
Reducing noise using a plurality of recording copies

I was pondering the usefulness of noise removal based on several copies of a recording, possibly from different physical copies. A Google search reveals that such a technique is already patented:

Quote
The present invention provides a method for reducing noise using a plurality of recording copies. The present invention produces a master file with lower noise than the available recording copies, and avoids the problems of losing musical content caused by prior art pop and click removers. The system comprises a recording playback unit, a computer system with a sound input capability, and a high capacity storage system such as a CD recorder. In operation, a plurality of recording copies of a single recording are played on the playback unit. These recordings are digitized by the computer and a separate recording file is formed for each copy of the recording. The recording files are then synchronized. The samples from each of the recording files are then averaged to reduce the noise components. A variety of threshold comparison techniques can be employed to eliminate samples and/or recording files that are outside of a computed range for that sample...


http://www.google.com/patents/US5740146

Do any of you know:

1) Does it work in practice?
2) Is software that does this available commercially or open source?
3) Are there available software/algorithms for synchronizing several audio streams (an essential step before comparing copies)?

Any pointers appreciated.

Reducing noise using a plurality of recording copies

Reply #1
In theory this can work, since the signal is coherent and the noise is random.  One case where this DOES work is when you combine the left & right channels of a stereo recording to make a lower-noise mono recording.  If my math is right, the signal increases by 6dB (with two signal sources) but the noise increases by only 3dB, so your signal-to-noise ratio improves by 3dB (not much).  But in real-world situations with two different copies of a recording, I don't think this works....  The problem is the synchronization, which has to be "exact".  That means less than one wavelength at the highest frequency if analog, or sample-perfect if digital.
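A quick numeric check of those figures in Octave (synthetic stand-ins for the signal and the two noise recordings):

Code:
s  = sin(2*pi*(1:1e6)'/100);            % the same "music" present in both copies
n1 = randn(1e6,1); n2 = randn(1e6,1);   % two independent noise recordings
10*log10( var(s+s)   / var(s)  )        % ~6.0 dB: coherent signal adds as amplitude
10*log10( var(n1+n2) / var(n1) )        % ~3.0 dB: uncorrelated noise adds as power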

For example, if you record/digitize two copies of a vinyl record (or record the same record twice), time-align them at the beginning and mix the recordings, the timing will drift slightly and you'll get nasty phasing/flanging/comb filtering effects.  And you won't get "digital alignment" or the same exact digital data both times, because of analog variations and because the samples will be "taken" at a different point along the analog waveform every time you digitize.  Any audio editor can mix two or more recordings, so it's an easy thing to try for yourself.

If you start out with two digital recordings, they should be identical.  In that case there is no information increase from combining the two.  And if they are not identical, one is probably better than the other and you should just use the better one.

You can probably use some DSP "tricks" to continuously time-align by time-stretching or time-squeezing one of the signals.  But once you've done that, I'd suggest that you'd get better results by selecting the best (lowest-noise) recording moment-to-moment, rather than combining the good with the bad.  With this method, the synchronization wouldn't need to be perfect, just perceptually perfect.  In fact, Wave Repair has a feature something like that, which allows you to copy a short section of one stereo channel into the opposite channel in order to eliminate a vinyl click/pop that occurs in only one channel, or that's time-misaligned between the left & right channels.  (The loss of stereo for a fraction of a second is not noticeable as long as the sound blends "nicely".)

Reducing noise using a plurality of recording copies

Reply #2
I know it's done for pics.
If a place has a constant stream of visitors and you want to photograph it without them, one trick is to take many pics and then compute the per-pixel median.
The feature is available in Photoshop Extended; I don't remember its exact name.

Median works better than average if you want to get rid of bad values (a tiny numeric sketch follows below).
The problem for audio, I guess, would be to synchronize all the tracks.
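A tiny numeric sketch of the median-vs-average point, in Octave (the pixel values here are made up):

Code:
p = [0.20 0.22 0.95];   % three "exposures" of one pixel; the third is hit by a passing visitor
mean(p)                 % 0.4567 - dragged well away from the true value by the outlier
median(p)               % 0.22   - the outlier is rejected completely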

Reducing noise using a plurality of recording copies

Reply #3
Do any of you know:

1) Does it work in practice?
2) Is software that does this available commercially or open source?
3) Are there available software/algorithms for synchronizing several audio streams (an essential step before comparing copies)?

Any pointers appreciated.


I don't know if it works in practice, but I've thought about potentially using it myself.

I have in the back of my mind that I could get access to at least two copies of an old LP released long ago by the group I work with, for which the original master tapes are not available, and from there attempt to make a cleaner digital version.

A certain amount of continuous surface noise is likely difficult to remove, but I was loosely thinking of a semi-manual approach, including manual alignment of obvious transients near the start and end of each side of the LP (plus a couple of confirmation checks at other obvious features). I thought of using automated click/pop detection (but not repair) merely to identify problem locations, so I could mix/paste the corresponding samples from my aligned, unaffected version wherever my preferred version has a click or pop.

Regarding synchronizing algorithms, there's mathematical cross-correlation, which you can look into if it has to be automatic. The highest peak of the cross-correlation function corresponds to the time offset. It's possible to synchronize to sub-sample accuracy if you upsample to a higher sampling rate and then downsample back after alignment.
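To make the cross-correlation idea concrete, here is a minimal Octave sketch; the variable names are mine, and it assumes two equal-length vectors a and b that differ only by an integer-sample offset plus noise (the FFT form avoids needing any toolbox):

Code:
N   = length(a);
C   = ifft(fft(a, 2*N) .* conj(fft(b, 2*N)));   % cross-correlation computed via the FFT
[~, k] = max(real(C));
lag = k - 1;
if lag >= N, lag = lag - 2*N; end               % wrap the negative lags
% 'lag' is how many samples b must be delayed to line up with a;
% for sub-sample accuracy, upsample both signals first or interpolate around the peak.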

I'd imagine few people would have multiple copies of more than one LP or tape and no better version, so the potential use of such software would be limited (at least in amateur circles). In professional circles, open reel tapes might be more common and multiple copies of the same audio perhaps rare.
Dynamic – the artist formerly known as DickD


Reducing noise using a plurality of recording copies

Reply #5
Thanks, I also figured that the synchronization is the issue here; once that is solved, various approaches can be used to determine the "best" signal.

The reason I started thinking about this approach is that similar techniques are applied in image processing, e.g. subtracting a separate noise (dark) exposure from a night-time exposure, and SIFT for stitching, etc. In essence, digital photography is also sampling, and identifying and aligning certain (possibly sub-sample) interest points should be possible for audio, too.

I actually tried manual alignment, and the results were not terrible, but not better than any of the originals; I'd say a bit worse.

After doing a little more research I found the MATCH plugin for Sonic Visualiser, which sort of does what I had in mind, but I haven't figured out how to process the aligned audio to get two aligned files:

This method is described in a 2005 paper by Simon Dixon:

Quote
Dynamic time warping finds the optimal alignment of two time series, but it is not suitable for on-line applications because it requires complete knowledge of both series before the alignment can be computed. Further, the quadratic time and space requirements are limiting factors even for off-line systems. We present a novel on-line time warping algorithm which has linear time and space costs, and performs incremental alignment of two series as one is received in real time. This algorithm is applied to the alignment of audio signals in order to follow musical performances of arbitrary length. Each frame of audio is represented by a positive spectral difference vector, emphasising note onsets. The system was tested on various test sets, including recordings of 22 pianists playing music by Chopin, where the average alignment error was 59ms (median 20ms). We demonstrate one application of the system: the analysis and visualisation of musical expression in real time.



I also found another paper by Pablo Sprechmann, Alex Bronstein, Jean-Michel Morel and Guillermo Sapiro that seems to do exactly what I had in mind, but I found no reference to any available software implementation:

Quote
A method for removing impulse noise from audio signals by fusing multiple copies of the same recording is introduced in this paper. The proposed algorithm exploits the fact that while in general multiple copies of a given recording are available, all sharing the same master, most degradations in audio signals are record-dependent. Our method first seeks for the optimal non-rigid alignment of the signals that is robust to the presence of sparse outliers with arbitrary magnitude. Unlike previous approaches, we simultaneously find the optimal alignment of the signals and impulsive degradation. This is obtained via continuous dynamic time warping computed solving an Eikonal equation. We propose to use our approach in the derivative domain, reconstructing the signal by solving an inverse problem that resembles the Poisson image editing technique. The proposed framework is here illustrated and tested in the restoration of old gramophone recordings showing promising results; however, it can be used in other applications where different copies of the signal of interest are available and the degradations are copy-dependent.


Reducing noise using a plurality of recording copies

Reply #6
Several replies while I was writing myself...

@bryant: Thanks for the pointer to previous discussion, I missed that when doing my initial search.

@Dynamic: The area of application I had in mind is for old shellac recordings where the masters are long gone, so more in the professional segment than for personal use.

Reducing noise using a plurality of recording copies

Reply #7
"
A method for removing impulse noise from audio signals by fusing multiple copies of the same recording is introduced in this paper. The proposed algorithm exploits the fact that while in general multiple copies of a given recording are available, all sharing the same master, most degradations in audio signals are record-dependent. Our method ?rst seeks for the optimal non-rigid alignment of the signals that is robust to the presence of sparse outliers with arbitrary magnitude. Unlike previous approaches, we simultaneously ?nd the optimal alignment of the signals and impulsive degradation. This is obtained via continuous dynamic time warping computed solving an Eikonal equation. We propose to use our approach in the derivative domain, reconstructing the signal by solving an inverse problem that resembles the Poisson image editing technique. The proposed framework is here illustrated and tested in the restoration of old gramophone recordings showing promising results; however, it can be used in other applications where different copies of the signal of interest are available and the degradations are copy-dependent."


The biggest practical problem I see is time-synching the plurality of recordings.  Presumably, the analog source recordings are digitized and the digital files are added together. To do this properly, the digitization needs to be done at several times the minimum appropriate Nyquist frequency in order to avoid creating a low pass digital filter.

Reducing noise using a plurality of recording copies

Reply #8
To do this properly, the digitization needs to be done at several times the minimum appropriate Nyquist frequency in order to avoid creating a low pass digital filter.


Why is that?  Making small adjustments to sampling rate shouldn't require much oversampling unless I misunderstand you.

Reducing noise using a plurality of recording copies

Reply #9
The biggest practical problem I see is time-synching the plurality of recordings.  Presumably, the analog source recordings are digitized and the digital files are added together. To do this properly, the digitization needs to be done at several times the minimum appropriate Nyquist frequency in order to avoid creating a low pass digital filter.

I guess you're talking about static synching of two digitised sources that run at the exact same speed but have an uncertainty of +- 0.5 samples. In that case, upsampling would be sufficient to cope with this IMHO. Which is essentially the same as digitising at a higher sample rate anyways.
The syncing itself could then be done with cross correlation.

Reducing noise using a plurality of recording copies

Reply #10
To do this properly, the digitization needs to be done at several times the minimum appropriate Nyquist frequency in order to avoid creating a low pass digital filter.


Why is that?  Making small adjustments to sampling rate shouldn't require much oversampling unless I misunderstand you.


It is practically impossible to get your sets of samples perfectly aligned when you digitize them, so you need to be able to shift them around in very small time increments.

Taking the worst case, if the sets of samples are misaligned by 1/2 a sample period, when you sum the two sets of data you end up unintentionally creating a first order low pass filter with about 3 dB loss at Nyquist, if memory serves.  Since it's a first order filter, there is still something like a 10% error at half Nyquist, which is pretty deep into the pass band. It can be audible.

The obvious solution is to sample at a higher rate or upsample, which allows you to push this effect way out of the original pass band.  The last time I played this game I found that 20x upsampling met my needs.  This is all fine and good except that I then ended up working with very large sets of data which made everything far more awkward and time consuming.
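As a rough check of those numbers, here is an Octave sketch of what averaging a signal with a half-sample-delayed copy of itself does to the magnitude response (the 44.1 kHz rate is just an assumed example, and the exact "error" figure depends on how you measure it):

Code:
fs = 44100;
f  = linspace(0, fs/2, 1000);             % frequencies from DC up to Nyquist
H  = abs((1 + exp(-1i*pi*f/fs)) / 2);     % average of x(t) and x(t - 0.5/fs)
20*log10(H(end))                          % about -3.0 dB at Nyquist
1 - abs((1 + exp(-1i*pi*(fs/4)/fs)) / 2)  % about 7.6% amplitude loss at half Nyquist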

Reducing noise using a plurality of recording copies

Reply #11
In order to make this work you must have:

1) exactly the same sampling instants to the intended bit depth
2) exactly the same transfer function to all digitizers.

Good luck with that.
-----
J. D. (jj) Johnston

Reducing noise using a plurality of recording copies

Reply #12
It is practically impossible to get your sets of samples perfectly aligned when you digitize them, so you need to be able to shift them around in very small time increments.


You don't need higher sampling for that.  When I calibrate my digitizers, I often apply a 1/20th-of-a-sample shift to time-align data and to correct for clock drift.

The obvious solution is to sample at a higher rate or upsample, which allows you to push this effect way out of the original pass band.  The last time I played this game I found that 20x upsampling met my needs.  This is all fine and good except that I then ended up working with very large sets of data which made everything far more awkward and time consuming.


The obvious solution is to just use an algorithm that can apply subsample shifts.  You don't need to actually have " several times the minimum appropriate Nyquist frequency" though.  Just need to be Nyquist sampled.  Better actually doesn't help you much.

 

Reducing noise using a plurality of recording copies

Reply #13
@bryant: Thanks for the pointer to previous discussion, I missed that when doing my initial search.

Thanks for the links to the other papers; I had not seen them before and will read them with interest.

As I mentioned in the referenced thread, I implemented this technique in 2007 but never finished it to the point where it was reliable or polished enough for a product; still, the results were very promising. I have uploaded some demo file clips to my Amazon Cloud drive (all of the processing is done with 32-bit float data, but I have converted them to 16-bit WavPack files to make them easier to transfer).

Here are the two original vinyl rips (from different copies of the same LP):

Girl1
Girl2

First, the analysis program scans the two audio files and determines the exact offset between them over the entire clip. Here is a screen capture showing the offset between the files with the Y-axis being samples of offset (I store the offset as 32-bit floats in another WavPack file to visualize it easily):

skew graph

Note that at 11 spots early in the clips they are exactly aligned. You can verify this by loading the two original clips into an audio editor and mix-pasting them together with one inverted. If you listen to that section you can hear the 11 points where the audio cancels (you can zoom up on the one at 10.38 seconds, for example, and see it also). Here is that file:

GirlDiffOrig

Once that offset information is determined, each original file is temporally “warped” to the midpoint of the two files. After that, if the two files are mix-pasted with invert again, the desired audio mostly cancels during the entire clip. That file is here:

GirlDiff

Finally, another program merges the two aligned files into an averaged version:

GoHomeGirl

The averaging reduces uncorrelated noise by 3 dB and there’s a great place to hear this. At 30.5 seconds there’s a short burst of noise in Girl2 which is not in Girl1 (I recommend playing all 3 files from 29 to 32 seconds). In the final combined result, this noise is reduced by 3 dB (it’s still audible, but less so).

Of course, just averaging the two versions would reduce the amplitude of clicks by 3 dB also, but you would also have twice as many of them! So, another process in the merging is to detect transients that occur in only one file, and then ignore that file during the transient (relying only on the other file). Note that no manual editing of clicks was done for this example. I was tweaking global parameters to get good results (writing the program, actually), but the final merge was completely automatic.
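For anyone who wants to experiment, here is a rough Octave sketch of that kind of merge step. It is only my guess at the general idea (moving-average energies and an arbitrary threshold), not bryant's actual algorithm:

Code:
function out = merge_copies(a, b, win, thresh)
  % a, b   : two already-aligned copies (column vectors, same length)
  % win    : window length in samples for the local energy estimates
  % thresh : how many times the median difference level counts as a one-sided click
  d   = abs(a - b);
  e   = filter(ones(win,1)/win, 1, d);      % smoothed level of the disagreement
  bad = e > thresh * median(e);             % regions where only one copy has a transient
  out = (a + b) / 2;                        % default: plain average (~3 dB less uncorrelated noise)
  ea  = filter(ones(win,1)/win, 1, a.^2);   % local energy of each copy
  eb  = filter(ones(win,1)/win, 1, b.^2);
  ia  = bad & (ea >  eb);                   % click appears to be in a -> take b there
  ib  = bad & (ea <= eb);                   % click appears to be in b -> take a there
  out(ia) = b(ia);
  out(ib) = a(ib);
end

(Save as merge_copies.m and call it with the two aligned clips, e.g. merge_copies(girl1, girl2, 64, 8); the window and threshold values are arbitrary starting points.)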

Reducing noise using a plurality of recording copies

Reply #14
Wow! Thank you for sharing those David. The synchronisation is quite amazing.

Have you tried this with 78s? If I can find some (might take months), would you like some samples?

Can you run this process on more than two copies? I assume you could process four copies as two sets of two, and then treat the two outputs from those as another two sources, without even changing the code? Or does the code already handle 3, 4, ...?

Cheers,
David.

Reducing noise using a plurality of recording copies

Reply #15
Of course, just averaging the two versions would reduce the amplitude of clicks by 3 dB also, but you would also have twice as many of them! So, another process in the merging is to detect transients that occur in only one file, and then ignore that file during the transient (relying only on the other file). Note that no manual editing of clicks was done for this example. I was tweaking global parameters to get good results (writing the program, actually), but the final merge was completely automatic.


Very impressive!

I, too, would be very interested to hear it applied to 78s. If you're interested, I could try to get some high quality 78rpm transfers for testing.

I do understand that under ideal circumstances, you would use the exact same equipment and settings. Still, I'm curious as to how robust this approach might be.  Would you be able to get anything useful out of comparing transfers made on different equipment, possibly with slightly different EQ? I have several examples of more or less direct transfers issued to CD that could possibly be used for testing. If the requirement to use the exact same setup could be relaxed, it would be much easier to apply on a larger scale, as I presume collections with multiple copies of each disc are rare.

I hope you will continue working on this project, thanks for sharing your results so far!

Reducing noise using a plurality of recording copies

Reply #16
I presume collections with multiple copies of each disc are rare.
When I first started collecting 78s, I soon had six copies of Eddie Calvert's Oh Mein Papa, as several of my Dad's friends gave me their collections, and everyone had bought that disc in the 1950s. I didn't keep them all!

It might be possible to match the different EQs and maybe even phase differences from different transfers - it's a linear transform after all. I suspect few CD transfers are really "raw", and a major problem is that so many transfers of old recordings are only available in lossy formats (e.g. see the number on YouTube!). I'm guessing lossy processing dramatically impairs this process?

Cheers,
David.

Reducing noise using a plurality of recording copies

Reply #17
It might be possible to match the different EQs and maybe even phase differences from different transfers - it's a linear transform after all. I suspect few CD transfers are really "raw", and a major problem is that so many transfers of old recordings are only available in lossy formats (e.g. see the number on YouTube!). I'm guessing lossy processing dramatically impairs this process?


Good point. My experience is that CD transfers in general are awful; however, there are some exceptions with very high quality and little or no processing, issued by collectors rather than by the recording companies. It was mainly transfers in this category that I had in mind, but if lower quality transfers could also aid in the process, all the better.

Reducing noise using a plurality of recording copies

Reply #18
I don't think using already-declicked or denoised versions as inputs would help at all - you risk getting the worst of both worlds. The benefit of Bryant's approach is that you always have a real signal to work with, never just damage/interpolation.

Even fairly straight reissues are normally low pass filtered.

Exciting prospect though.

Cheers,
David.

Reducing noise using a plurality of recording copies

Reply #19
David and verdemar, thanks for taking the time to listen to my demo!

I'll try to answer your questions, but first let me emphasize that the program is basically at the “proof of concept” stage (or at least it was in 2007 when I last looked at it). I had convinced myself that it could work, but the last version was very fragile and was just as likely to miserably fail the alignment and end up in the weeds as to work correctly. I was trying to find samples that it actually worked on rather than throw difficult samples at it! So, no, I didn't try 78's or samples with varying EQs or lossy compression, but I'm pretty sure they would have failed (at this point, it won't even tolerate level differences).

That said, I had ideas on why the failures occurred and what kind of changes I could make to improve things, but I simply ran out of time to dedicate to it (and it had reached that point in the development of some projects where it was time to take what had been learned and start over). I think it should be obvious that if the technique can work with one 90 second track, then it's a solvable problem and my goal is to pick it up again later this year.

David, I have definitely thought about the situation of having more than two copies. Like you say, powers of two are easy with the existing structure, but I think it would be more useful for the final compositing to have all the various aligned inputs at once, and the existing structure would not be able to take advantage of three or five copies very well at all (unless, perhaps, some were better than others). And as I mentioned in the other thread, there may be situations in which just using multiple recordings of the same copy might be beneficial.

I have also thought about other possible applications beyond pure restoration. Let's say you had a CD and a vinyl recording of the same title but the CD, while quieter and cleaner, had been made unlistenable with over-zealous dynamic range compression. Well, once the recordings were sample aligned, it would be pretty trivial to use the level of one version to modify the gain of the other. Of course, the alignment would have to be a lot more robust than it is now!
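A minimal sketch of that gain-transfer idea in Octave, assuming v (the dynamic vinyl transfer) and c (the compressed CD) are already sample-aligned column vectors; the 50 ms window is an arbitrary choice of mine:

Code:
fs  = 44100;
win = round(0.05 * fs);                                   % ~50 ms RMS window (assumed)
env = @(x) sqrt(filter(ones(win,1)/win, 1, x.^2) + eps);  % short-term RMS envelope
g   = env(v) ./ env(c);                                   % gain that maps the CD level onto the vinyl level
out = c .* g;                                             % the CD's cleaner samples with the vinyl's dynamics

(In practice g would need smoothing and limiting, but it shows the principle.)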

Thanks again, and I'll definitely hit you guys up for those 78 samples once I get further along... 

Reducing noise using a plurality of recording copies

Reply #20
Thank you for replying!

Another potential use for excellent synch is on things like The Beatles catalogue. There are songs where a given sound only exists in the mono or stereo mix (it was added during the mix) and all that exists is the mixes. If you can sync them perfectly, you can probably improve them.

There are many multitracks where one 4-track tape holds one set of sounds; this was then mixed down with an overdub onto track 1 of a second 4-track tape, which then had more sounds added on the other 3 tracks. Stereo mixes from that second 4-track are usually primitive and lop-sided, but if you can synch it properly to the first 4-track, and then subtract a mix of that 4-track from track 1 of the second 4-track, you get each original track and the baked-in overdub separately. You can do a great stereo remix then. They've done this manually (most notably with Eleanor Rigby) where the mixdown from the first 4-track to track 1 of a second 4-track was a straight mix (no overdub added during mixing, so no need to recover them using perfect synch and subtraction) - but a) they got the sync wrong, and b) it doesn't work if there were overdubs during the copying.
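As a toy Octave sketch of that subtraction step (mix1 being a re-created mix of the first 4-track, already synced and level-matched to t1, the first track of the second tape; the names are mine):

Code:
overdub = t1 - mix1;   % what was added while bouncing tape 1 down to track 1 of tape 2
% 'overdub' plus the four original tracks of tape 1 can now be remixed from scratch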


The only time I played with synchronisation I used simple block-based cross-correlation. When it picked the right peak, it worked fine, but as often as not it didn't. It needed a lot more refinement, and I've never had a work-based project that justified it, so it never got anywhere. I don't have enough "free time" for such things!

Cheers,
David.

Reducing noise using a plurality of recording copies

Reply #21
The obvious solution is to just use an algorithm that can apply subsample shifts.  You don't need to actually have " several times the minimum appropriate Nyquist frequency" though.  Just need to be Nyquist sampled.  Better actually doesn't help you much.


And I can download a freeware/open source subsample shifting plug-in where?  ;-)

Some relevant discussion here:

http://www.gearslutz.com/board/mastering-f...n-1-sample.html

Viable options seem to be pay-for.

Reducing noise using a plurality of recording copies

Reply #22
And I can download a freeware/open source subsample shifting plug-in where?  ;-)
Place a subsample shifted sinc waveform (generate it in some maths package like Octave) into fb2k's convolution plug-in (other free convolution methods are available).

Full upsampling is just a way of getting access to the sinc interpolation in a commonly implemented manner, but it's hopelessly inefficient for the job at hand, because a) most of the new samples are not needed, and so do not need to be calculated, and b) the subsequent downsampling will probably impose unnecessary band-limiting (throwing away samples is all that's needed, if you use some implementation that insists on generating them in the first place).

Cheers,
David.

EDIT: sub-sample delays attached.

20000+0.1 sample delay:
[attachment=7466:sync40k_0.1.wav]

20000+0.9 sample delay:
[attachment=7467:sync40k_0.9.wav]

Both are 32-bit 44.1kHz mono wave files (though the sample rate is irrelevant), 40001 samples long. No windowing, accurate to something like 20-bits (can be made arbitrarily accurate with suitable length and windowing).

Convolve some audio with one or the other - it should sound fine. Convolve some audio with one then the other, remove the 40001 sample delay, subtract the original, and you'll get silence (depending on the accuracy of the convolution calculation; the residual is -85dB in Cool Edit Pro with 16-bit files), proving that 20000+0.1+20000+0.9 sample delay - 40001 sample delay = transparent.

trivial to generate...
Code:
x = -20000:20000;                      % sample indices either side of the centre tap
d = 0.1;                               % fractional delay, in samples
w = sin(pi.*(x-d))./(pi.*(x-d));       % sinc kernel shifted by d samples
%w = w.*hanning(length(w))';           % uncomment to window
wavwrite(w,44100,32,'sync40 0.1.wav')  % 32-bit 44.1kHz wav (legacy wavwrite syntax)


Note: Cool Edit Pro's own resampling uses about 320 samples at 256 quality, and 1000 at 999 quality (for 16-bit signals), and about 1.5x that for 32-bit signals. Using 40001 samples in this example is overkill  but meant there was absolutely no need to window. Most people would use multiplication in the frequency domain, rather than convolution in the time domain, to speed the whole thing up considerably.
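For completeness, here is how the frequency-domain version might look in Octave, using the kernel w from the code above on some audio vector x (my name); the trim at the end removes the kernel's 20000-sample integer bulk delay, leaving only the 0.1-sample shift:

Code:
n = length(x) + length(w) - 1;                  % length of the full linear convolution
y = real(ifft(fft(x(:), n) .* fft(w(:), n)));   % convolution done as multiplication in the frequency domain
y = y(20001 : 20000 + length(x));               % drop the integer bulk delay of the centred kernel

(For long files you would block this with overlap-add rather than one huge FFT.)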

Reducing noise using a plurality of recording copies

Reply #23
Sprechmann et al. use the multi-stencil fast marching method to solve the Eikonal equation in order to determine the time warp, and it seems an open-source implementation of that method is available in MATLAB and C versions under a BSD license here:

http://www.mathworks.com/matlabcentral/fil...e-fast-marching

Reducing noise using a plurality of recording copies

Reply #24
And I can download a freeware/open source subsample shifting plug-in where?  ;-)
I guess us Mac users are lucky with this free AU plugin for milli-sample accurate delay:
Quote
SampleDelay, the only sample delay with negative delay, allowing you to nudge things slightly ahead of the beat, as well as behind.
http://www.airwindows.com/freebies.html