Topic: Producing two different audio files from an original one?

Producing two different audio files from an original one?

I would like to know if there was a way to generate audio files that sound exactly the same and yet are different from the waveform perspective (i.e. different individual byte values at any given time).

I can think of two ways to achieve that:

1) sinc/FFT interpolation + sampling offset or
2) random phase changes.

The problems with these techniques are:

1) It does not play nice with ABX tests.
2) Usually people are not really sensitive to phase changes, but "not really" might be enough in an ABX test... The worst-case scenario would be random destructive and constructive interference patterns in stereo setups that can be spotted right away, or even the "chirp" pathological case.

So, do you know of any reference discussing this matter? Can you recommend any other way to achieve that?
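A minimal sketch of option 1 above (FFT-based, i.e. band-limited/sinc interpolation with a constant sampling offset), assuming a real, even-length column vector x as the test signal and an offset of half a sample; both are placeholders, and note that the shift is circular, so zero-pad first if wrap-around matters:

Code:
N = length(x);                        % x: real column vector, even length (assumed)
d = 0.5;                              % sampling offset in samples (half a sample here)
k = [0:N/2, -(N/2-1):-1]';            % signed FFT bin indices for even N
X = fft(x) .* exp(-1j*2*pi*k*d/N);    % linear phase ramp = band-limited shift by d samples
X(N/2+1) = real(X(N/2+1));            % keep the Nyquist bin real so the output stays real
y = real(ifft(X));                    % same magnitude spectrum, different sample values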

Producing two different audio files from an original one?

Reply #1
Use any reasonable quality lossy codec ever.

Producing two different audio files from an original one?

Reply #2
Destructive interference in an ABX test? How?

Just adding dither will change the "bytes", or adding high frequency content outside the audible range, or some tiny phase shift across the audio band, or some tiny magnitude changes, or like saratoga suggested: convert using a lossy codec with high bitrate.
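For instance, TPDF dither alone already changes essentially every sample value while staying at the level of one least-significant bit. A minimal sketch, assuming a float signal x scaled to +/-1 that is being requantized to 16 bits:

Code:
q = 2^-15;                                    % one 16-bit LSB at +/-1 full scale
d = (rand(size(x)) - rand(size(x))) * q;      % TPDF dither, +/-1 LSB peak
y = round((x + d) / q) * q;                   % requantized, dithered copy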
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #3
Destructive interference in an ABX test? How?


If you screw up the phase alteration and you are in a speaker-based stereo setup, that could happen. I am not saying it would, just that it is a possibility, and I don't want to introduce yet another bias. That said, if the phases of the two channels undergo the exact same change, that should not be a problem.

The problem with what you two suggest (which are good ideas) is the same as with what I suggested: you'll always find people saying that the resulting ABX tests are flawed because A, B and X are basically different. If there were a widely accepted way to make sure that the resulting files sound exactly the same while being different, that would be great. But if there is any suspicion that there is even the slightest difference in principle (as with using very high bitrate lossy codecs, adding random noise > 20 kHz, dithering the intensity, etc.), then there is no proof that I am not introducing a bias. On top of that, I want to do it to prevent people from cheating*, not to prevent hashing functions from working.

*: When I say "cheating", I am not saying that people do it on purpose; just like with extra delays or clicks in poorly implemented ABX software/hardware, the brain takes it as a hint. If people can see the waveform, it is often easy to tell which is which (that doesn't prevent spectral density-based cheating techniques from working, but I also have an idea for that).

Producing two different audio files from an original one?

Reply #4
I would like to know if there was a way to generate audio files that sound exactly the same and yet are different from the waveform perspective (i.e. different individual byte values at any given time).


Level and frequency response changes that are less than the threshold of audibility.  At the frequency extremes several dB are possible.

In general, phase shift need only be minor to visibly change the wave, but can be relatively huge without being audible.

Sample rate changes as long as the final sample rate is > 40 kHz.
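As a trivial illustration of the first point, a broadband level change far below the usual audibility thresholds still alters every non-zero sample value. A minimal sketch, with x as a placeholder float signal and 0.05 dB as an arbitrary example value:

Code:
gain_dB = 0.05;               % arbitrary example, well below commonly cited level JNDs
y = x * 10^(gain_dB/20);      % every non-zero sample value changes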

Producing two different audio files from an original one?

Reply #5
Level and frequency response changes that are less than the threshold of audibility.  At the frequency extremes several dB are possible.

Oh, that's right. If the thresholds are well defined, that could work after all... And it did not occur to me that I could also ABX-test that. Sorry xnor, you also suggested that, but I didn't see it that way.

In general, phase shift need only be minor to visibly change the wave, but can be relatively huge without being audible.

That's true, but what I don't know is whether any work has been done on that. There might also be some well-known phase-shift patterns that I don't know of that would guarantee minimal perceived sound alteration.

Sample rate changes as long as the final sample rate is > 40 kHz.

Well, ABX testing files with different sampling rates is tricky to begin with. While crawling the forum, I've already seen a lot of instances of foobar2000 giving away X because of glitches.

But well, proper phase alteration + dithering should be alright after all.

Thanks

Producing two different audio files from an original one?

Reply #6
In general, phase shift need only be minor to visibly change the wave, but can be relatively huge without being audible.

That's true, but what I don't know is whether any work has been done on that. There might also be some well-known phase-shift patterns that I don't know of that would guarantee minimal perceived sound alteration.

Sample rate changes as long as the final sample rate is > 40 kHz.

Well, ABX testing files with different sampling rates is tricky to begin with. While crawling the forum, I've already seen a lot of instances of foobar2000 giving away X because of glitches.

But well, proper phase alteration + dithering should be alright after all.


I don't know why you need dithering as part of applying phase shift as long as you do the phase shift without reducing bit accuracy.

Our ABX tests have shown over the years that as long as the phase shift is applied without artifacts identically to both channels, more than a thousand degrees can generally escape detection.

So if you limit yourself to a few hundred degrees, it's gonna be OK.
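One way to accumulate a few hundred degrees of phase shift without touching the magnitude response, applied identically to both channels, is a chain of first-order all-pass sections (all-pass filters come up again a few replies below). A minimal sketch, assuming a two-channel signal x with channels as columns, fs = 44100, and arbitrarily chosen section frequencies:

Code:
fs = 44100;                                     % assumed sampling rate
y = x;                                          % x: channels as columns (assumed)
for fc = [300 1000 3000]                        % three sections: up to 3*180 = 540 degrees total
    c = (tan(pi*fc/fs) - 1) / (tan(pi*fc/fs) + 1);
    y = filter([c 1], [1 c], y);                % first-order allpass, same filter on both channels
end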

Producing two different audio files from an original one?

Reply #7
Something else you might want to consider is all-pass filters.

Producing two different audio files from an original one?

Reply #8
Something else you might want to consider is all-pass filters.


All-pass filters are not "something else" from the phase shift that has already been recommended. They are the same thing.

Producing two different audio files from an original one?

Reply #9
I don't know why you need dithering as part of applying phase shift as long as you do the phase shift without reducing bit accuracy.

Our ABX tests have shown over the years that as long as the phase shift is applied without artifacts identically to both channels, more than a thousand degrees can generally escape detection.

So if you limit yourself to a few hundred degrees, it's gonna be OK.


I didn't mean that I needed dithering on top of phase shifting. Just that if I need even more changes, I have two families of techniques that I can use.

But I don't really understand your comment about phase shifting: AFAIK this is circular (modulo 360° or 2pi rad). But anyway, shifting by 100° or less should be easy enough and provide the effect that I want.

Producing two different audio files from an original one?

Reply #10
All-pass filters are not "something else" from the phase shift that has already been recommended. They are the same thing.


True, yet this is a great keyword. "Phase shifting" is OK, but I'm entering a new world with "all-pass filter".

Also, I see why you are talking about non-circular degrees now. From the DFT perspective this does not make sense, but from a macroscopic perspective it's easier.

Producing two different audio files from an original one?

Reply #11
Plot phase vs. frequency. For a first-order IIR allpass the final phase shift would be 180°, with 90° at the cutoff frequency. For a second-order one it would be 360°.
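A minimal sketch of that plot, assuming freqz from the Signal Processing Toolbox (or Octave's signal package), fs = 44100 and a 1000 Hz cutoff, with the second-order case built simply as two cascaded first-order sections:

Code:
fs = 44100; fc = 1000;
c = (tan(pi*fc/fs) - 1) / (tan(pi*fc/fs) + 1);   % first-order allpass coefficient
b1 = [c 1];  a1 = [1 c];                          % 1st order: 0 down to -180 degrees
b2 = conv(b1, b1);  a2 = conv(a1, a1);            % two cascaded sections: 0 down to -360 degrees
[h1, f] = freqz(b1, a1, 4096, fs);
h2 = freqz(b2, a2, 4096, fs);
semilogx(f, unwrap(angle(h1))*180/pi, f, unwrap(angle(h2))*180/pi);
xlabel('Frequency (Hz)'); ylabel('Phase (degrees)'); grid on;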
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #12
Quote
But I don't really understand your comment about phase shifting: AFAIK this is circular (modulo 360° or 2pi rad).
Only with continuous tones.  Real program material does not repeat every 360 degrees.

Quote
I would like to know if there was a way to generate audio files that sound exactly the same and yet are different from the waveform perspective (i.e. different individual byte values at any given time).
- Flip the polarity (invert).

- Delay by 1 ms or so.

- Cut the level by 0.1 dB or so.
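A minimal sketch of those three transformations, with x as a placeholder column-vector signal at an assumed 44100 Hz sampling rate:

Code:
fs = 44100;                                  % assumed sampling rate
y_invert = -x;                               % flip the polarity
y_delay  = [zeros(round(0.001*fs), 1); x];   % prepend ~1 ms of silence (44 samples here)
y_level  = x * 10^(-0.1/20);                 % cut the level by 0.1 dB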

Producing two different audio files from an original one?

Reply #13
Only with continuous tones.  Real program material does not repeat every 360 degrees.


Well, from a macro perspective, yes: if you shift your phase by 4pi, you shift it by 4pi and the phase vs. frequency representation is best seen like that. No argument there.

From a micro/DFT perspective, it repeats: each frequency component corresponds to an (infinite) plane wave. If you shift the phase of that particular plane wave by 2pi, then it is exactly as if you did not shift it. So if you shift all the plane waves by k*2pi (k being an integer), then you get exactly the same signal, even for a real signal. Put differently, if you invert your signal (180° phase shift) and you do it again (another 180° phase shift, *not* -180°), you get back the original signal.

But I might be wrong; if that's the case, feel free to explain why: I'm keen on learning.
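A quick numerical check of the per-bin statement, with x as any real placeholder test signal:

Code:
X = fft(x);
y = real(ifft(X .* exp(-1j*2*pi)));    % rotate every bin by a full 2*pi
max(abs(y - x))                        % on the order of machine epsilon: signal unchanged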

Producing two different audio files from an original one?

Reply #14
Only with continuous tones.  Real program material does not repeat every 360 degrees.


Well, from a macro perspective, yes: if you shift your phase by 4pi, you shift it by 4pi and the phase vs. frequency representation is best seen like that. No argument there.

From a micro/DFT perspective, it repeats: each frequency component corresponds to an (infinite) plane wave.


The DFT decomposes a signal onto a set of periodic basis functions that repeat every 360 degrees, which is what you are referring to above. What other people in this thread are discussing is applying a frequency-dependent phase delay to a waveform (filtering). The delay in filters is not periodic. It is common for real-world systems to have delay dispersions on the order of thousands or millions of degrees per octave.

Producing two different audio files from an original one?

Reply #15
Yes, then this is something that I don't get. Is there any reference that you would recommend I read?

I think I'm not using the right keywords to find documentation on this. I really don't see the difference between the DFT-based frequency-dependent phase delay and the frequency-dependent phase delay applied to a waveform (but again, that doesn't mean there isn't one, just that I don't get it).

From my (limited) experience, if I want to correct the phase and I do it using a 1st-order polynomial (for instance), whether this polynomial is truly "continuous" or "discontinuous" (wrapped every 2pi), I get the same result (in this case, a global shift of my signal). That is consistent with what I understood until now (the DFT framework).

Producing two different audio files from an original one?

Reply #16
You don't need any DFT or fancy stuff.

Just (Matlab code):
Code:
% First-order IIR allpass: the magnitude response stays flat, only the phase changes
b = [-8.667884394996352e-01 1];
a = [1 -8.667884394996352e-01];

output = filter(b, a, input);


That's an allpass with a -180° phase shift and a cutoff frequency of 1000 Hz, given a 44100 Hz sampling rate.

You can scale b to add gain, negate it to also reverse polarity... it's a simple IIR filter.
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #17
Now about the DFT. The angle of the DFT bins will naturally be in the range +/- pi. You can, however, unwrap the phase (see the Matlab function unwrap).
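A minimal sketch of wrapped vs. unwrapped phase, reusing the allpass coefficients from Reply #16 (cascaded twice so that the total shift exceeds 180° and wrapping actually shows up), and assuming freqz is available:

Code:
b = [-8.667884394996352e-01 1];  a = [1 -8.667884394996352e-01];  % allpass from Reply #16
[h, w] = freqz(conv(b, b), conv(a, a), 1024);   % two cascaded sections: 0 down to -360 degrees
phi_wrapped   = angle(h)*180/pi;                % folded into (-180, 180]
phi_unwrapped = unwrap(angle(h))*180/pi;        % continuous, ends near -360 degrees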
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #18
I think I'm not using the right keywords to find documentation on this. I really don't see the difference between the DFT-based frequency dependent phase delay and the frequency dependent phase delay to a waveform (but again, that doesn't mean there isn't, just that I don't get it).


The phase delay of a DFT bin has units of seconds, whereas the frequency-dependent phase delay of a filter (delay per Hz, also known as dispersion) has units of seconds per hertz (i.e. seconds squared). The former is the delay of an individual frequency (sinusoid) in units of time (seconds, etc.). The latter is the rate of change (derivative) of phase delay with frequency. Even with phase wrapping, the rate of change of delay per unit of frequency can be very large, particularly if the bins are closely spaced in frequency.

A completely trivial example: if your DFT bins are 0.1 Hz apart and you have 60 degrees of additional phase delay per bin, you will have a phase slope of 600 degrees per Hz, which will in fact result in a very differently shaped waveform than (600 - 360 =) 240 degrees/Hz.
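A small numerical illustration of this with toy parameters (assumed: fs = 10 Hz and N = 100 points, so the bins are 0.1 Hz apart): a phase slope of 600 degrees per Hz is simply a bulk delay of about 1.67 s, whereas 240 degrees per Hz is about 0.67 s, so the two produce clearly different waveforms even though degrees per Hz might look like something that could be reduced modulo 360:

Code:
fs = 10; N = 100; n = (0:N-1)';                          % 0.1 Hz bin spacing, 10 s frame
x = sin(2*pi*1*n/fs) .* (0.5 - 0.5*cos(2*pi*n/(N-1)));   % windowed 1 Hz burst (toy signal)
f = (0:N-1)'*fs/N;  f(f > fs/2) = f(f > fs/2) - fs;      % signed bin frequencies in Hz
y600 = real(ifft(fft(x) .* exp(-1j*2*pi*(600/360)*f)));  % 600 deg/Hz slope ~ 1.67 s delay
y240 = real(ifft(fft(x) .* exp(-1j*2*pi*(240/360)*f)));  % 240 deg/Hz slope ~ 0.67 s delay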


Producing two different audio files from an original one?

Reply #19
Oh I see, that's perfectly clear now. Thank you.

Producing two different audio files from an original one?

Reply #20
In the end, after testing the few techniques mentioned here, I will be using constant phase shifting, which does the job of producing different waveforms surprisingly well (I might actually want a proper all-pass filter, but that will depend on the listening test results that I find).

I'll also use frequency dithering to modify the shape of the power spectrum a little bit.

I still have to determine suitable and conservative transparency limits for these. I'm gonna look at what has already been done in this area.

Thanks a lot everybody!

Producing two different audio files from an original one?

Reply #21
You mean linear phase shift? What filter are you using?
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #22
Nope, I'm using FFT for the moment. I just multiplied the FFTed signal by exp(-1j*angle).

A linear phase shift would just shift the whole signal in time, so it's not that useful...
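Roughly how such a frequency-domain constant rotation might look, assuming a real, even-length column vector x; the negative-frequency bins need the conjugate rotation, and DC and Nyquist are left untouched, otherwise the result is no longer real (the impulse/step response concerns raised in the next replies still apply):

Code:
N = length(x);  X = fft(x);          % x: real column vector, even length (assumed)
theta = pi/2;                        % constant 90 degree rotation as an example
rot = ones(N, 1);
rot(2:N/2)   = exp(-1j*theta);       % positive-frequency bins
rot(N/2+2:N) = exp(+1j*theta);       % negative-frequency bins get the conjugate
y = real(ifft(X .* rot));            % constant (frequency-independent) phase shift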

Producing two different audio files from an original one?

Reply #23
I don't think that's a good way to do it. There will be aliasing, and check the impulse/step response of what you're doing...
"I hear it when I see it."

Producing two different audio files from an original one?

Reply #24
Why would there be aliasing? I am not changing the sampling rate, my sample is already zero-padded, and I'm actually convolving with a Dirac peak of unit gain. Scratch that: I am convolving with my impulse response, which is in fact very localized (a few samples).

But what do you suggest instead? I'm interested in everything I can learn about this.

Indeed, the impulse and step response are not great. But at the same time, if they were, I wouldn't see any difference in the waveform.