What Kind Of Music For Testing?
Native_Soulja
post Jul 13 2010, 09:46
Post #1





Group: Members
Posts: 25
Joined: 7-July 10
Member No.: 82108



What kind of music would be good for testing things like MP3 and other codec comparisons? I'm going to do a blind test with some friends and family to get their opinions on which codecs and bitrates sound better, and I need some good examples... thanks ahead of time.

This post has been edited by Native_Soulja: Jul 13 2010, 09:47
pdq
post Jul 13 2010, 13:50
Post #2





Group: Members
Posts: 3374
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



I would recommend testing with whatever kind of music you typically listen to and are most familiar with. Examples of music that certain codecs have problems with are usually specific to the codec and are not typical in any case.

Also, if you have in mind testing at fairly high bit rates then your test subjects will probably not be able to hear differences. Start with relatively low bit rates and then work your way up.

Be sure to research ABX testing so that you do proper double-blind testing.
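[Editor's note: a minimal sketch of the bookkeeping behind an ABX session. With a number of forced-choice trials and a count of correct answers, the one-sided binomial p-value is the chance of scoring at least that well by pure guessing. The function name is made up for illustration; this is not code from any particular ABX tool.]

```python
# ABX scoring sketch: under the null hypothesis the listener is guessing,
# so each trial is a fair coin flip. The p-value is the probability of
# getting `correct` or more right answers out of `trials` by chance.
from math import comb

def abx_p_value(correct, trials):
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# The conventional significance threshold is p < 0.05.
assert abx_p_value(12, 16) < 0.05   # 12/16 correct: unlikely to be guessing
assert abx_p_value(10, 16) > 0.05   # 10/16 correct: consistent with guessing
```

A 16-trial session is a common choice because 12 or more correct answers crosses the usual 0.05 threshold.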
Meeko
post Jul 13 2010, 13:54
Post #3





Group: Members
Posts: 82
Joined: 24-December 09
From: New York
Member No.: 76308



Use whatever you like and listen to regularly; that way, if something stands out, you'll know it. If you plan on testing MP3 files and the like, it would probably be best to start low at -V7 and work up to -V0, because you may find yourself getting stuck at -V5 (approx. 130 kbps). MP3 with the LAME encoder has gotten that good!

Good luck!


--------------------
foobar2000, FLAC, and qAAC -V68
It just works people!
C.R.Helmrich
post Jul 13 2010, 19:35
Post #4





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



And, Native_Soulja, if you don't want to test with music you listen to regularly, but with a collection of recordings known to challenge most modern codecs - selected from about 100 "critical recordings" after months of listening tests - take a look at this post and the ones below it:

http://www.hydrogenaudio.org/forums/index....st&p=695576

This is intended for an upcoming AAC test but should apply just as well to codecs like Vorbis, WMA, and MP3.

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
Arnold B. Kruege...
post Jul 14 2010, 10:50
Post #5





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (Native_Soulja @ Jul 13 2010, 04:46) *
What kind of music would be good for testing things like MP3 and other codec comparisons? I'm going to do a blind test with some friends and family to get their opinions on which codecs and bitrates sound better, and I need some good examples... thanks ahead of time.



I agree with Chris Helmrich - check the outcomes of past subjective tests of a similar nature and include some of the musical selections that are already known to produce positive results.

The choice of program material is probably the most important variable in most subjective tests.

In equipment tests it is possible to make educated guesses as to what program material will be the most diagnostic. For example, if you know that the amplifiers being tested vary in terms of high-frequency performance, program material with more than the usual amount of high-frequency information may be a big help. Using program material that lacks highs will almost certainly bias the test towards null results.

In the case of perceptual coders, it seems that music with rapidly changing content can be more challenging to code. Codec designers may reveal the kinds of music they have used as guides in their development efforts. There is pretty good public documentation of the musical selections used in some of the tests done by the MPEG group.
southisup
post Jul 14 2010, 11:29
Post #6





Group: Members
Posts: 251
Joined: 28-October 05
Member No.: 25414



QUOTE (Native_Soulja @ Jul 13 2010, 18:46) *
..to get their opinions on which codecs and bitrates sound better

You'll have to use either "killer samples", known to cause problems, or quite low bit rates - all the major codecs will almost certainly sound completely transparent, at common bit rates, to everyone there.

This post has been edited by southisup: Jul 14 2010, 11:31
Arnold B. Kruege...
post Jul 14 2010, 16:07
Post #7





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (southisup @ Jul 14 2010, 06:29) *
You'll have to use either "killer samples", known to cause problems, or quite low bit rates - all the major codecs will almost certainly sound completely transparent, at common bit rates, to everyone there.


I've also used another technique - encode and decode a sample over and over again. This works very well with hardware listening tests.
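[Editor's note: a toy illustration of this encode/decode-it-again technique. The "codec" here is a stand-in (a 3-tap moving average plus 8-bit requantization), not a real perceptual coder; a real test would shell out to an actual encoder for each generation. All names and numbers below are invented for the sketch.]

```python
# Each round trip through the toy codec discards information (smearing
# plus requantization), so successive generations drift further from the
# original - the effect this technique exploits.
import math
import random

def toy_lossy_roundtrip(samples):
    n = len(samples)
    # Smear each sample with its neighbours, then requantize to 8 bits.
    smeared = [(samples[max(i - 1, 0)] + samples[i] + samples[min(i + 1, n - 1)]) / 3.0
               for i in range(n)]
    return [round(s * 127) / 127 for s in smeared]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

random.seed(0)
original = [math.sin(2 * math.pi * 0.05 * i) + 0.1 * random.uniform(-1, 1)
            for i in range(2000)]

generation, errors = original, []
for _ in range(5):
    generation = toy_lossy_roundtrip(generation)   # feed the output back in
    errors.append(rms_error(original, generation))

# each extra generation moves further from the original
assert all(a < b for a, b in zip(errors, errors[1:]))
```

With a real encoder the loop body would encode to a compressed file and decode back to WAV, keeping each generation for listening.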
pdq
post Jul 14 2010, 16:29
Post #8





Group: Members
Posts: 3374
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



QUOTE (Arnold B. Krueger @ Jul 14 2010, 11:07) *
I've also used another technique - encode and decode a sample over and over again. This works very well with hardware listening tests.

Absent some theoretical analysis of how repeated encoding and decoding affects sound quality, I would be a little leery of this. :)
Arnold B. Kruege...
post Jul 14 2010, 17:27
Post #9





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (pdq @ Jul 14 2010, 11:29) *
Absent some theoretical analysis of how repeated encoding and decoding affects sound quality, I would be a little leery of this. :)


Every methodology that I've seen proposed suffers from the same general potential problem - it is asymmetric with the real world in some way other than just the percentage of error.

I'm under the impression that coders make different kinds of mistakes for different target bitrates. So there is a potential asymmetry.

Obviously, different music triggers different artifacts, so there is yet another potential asymmetry.

Coding and decoding music makes at least technical changes to the music, so successive encodings may add different amounts or kinds of errors.

Another possibility is to subtract the reconstructed coded file from the original source file to create a difference file, and then add variable amounts of the difference file back in as desired to get enough errors to be readily audible. This approach may be asymmetric because there is no guarantee that coder errors add linearly when judged by the standard of human perception.
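[Editor's note: the difference-file idea above can be sketched in a few lines. The function name and sample values are invented for illustration, and, as the post warns, there is no guarantee that perception scales linearly with the mixed-back error.]

```python
# Difference-file sketch: the error signal is (decoded - original); mixing
# a scaled copy of it back into the original gives any desired error level.
def amplify_error(original, decoded, k):
    # k = 0 -> original, k = 1 -> decoded file, k > 1 -> exaggerated error
    return [o + k * (d - o) for o, d in zip(original, decoded)]

original = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
decoded  = [0.0, 0.45, 0.98, 0.52, 0.01, -0.48, -0.97, -0.55]  # hypothetical lossy output

def close(a, b):
    return all(abs(x - y) < 1e-9 for x, y in zip(a, b))

assert close(amplify_error(original, decoded, 0), original)
assert close(amplify_error(original, decoded, 1), decoded)
exaggerated = amplify_error(original, decoded, 4)   # four times the coding error
assert not close(exaggerated, decoded)
```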

In short, nothing is perfect and you do the best you can with the tools at hand.

Being able to create test files with realistic errors at any desired level is highly desirable because it has great potential for listener training.

C.R.Helmrich
post Jul 15 2010, 10:24
Post #10





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (Arnold B. Krueger @ Jul 14 2010, 17:07) *
I've also used another technique - encode and decode a sample over and over again. This works very well with hardware listening tests.

I've never tried this in blind tests, so I don't know whether it has the same effect as turning down the bit rate (in terms of bits per sample) or using highly trained listeners. Essentially, the desired effect would be that all listening grades move down the scale proportionally and that the ranking between the coders stays the same.

For simplicity, Native Soulja, if you're interested in high bit rates and your friends and family have never done blind listening tests, I recommend a test where all coders run at 96 kb/sec CBR or VBR. There, you have the highest chances of hearing clear quality differences between codecs.

Chris

This post has been edited by C.R.Helmrich: Jul 15 2010, 10:25


--------------------
If I don't reply to your reply, it means I agree with you.
2Bdecided
post Jul 15 2010, 11:03
Post #11


ReplayGain developer


Group: Developer
Posts: 5060
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Arnold B. Krueger @ Jul 14 2010, 16:07) *
I've also used another technique - encode and decode a sample over and over again. This works very well with hardware listening tests.
It'll certainly cause audible problems.

What's not clear is whether the severity of audible problems after, say, 100 encode/decode cycles is in any way related to, or correlated with, the number of "problem samples" that reveal audible problems after a single encode/decode, or the severity of the audible problems on those samples.

It could be that the "best" encoder, with zero problem samples at a given bitrate, was found to sound worst after 100 encode/decode cycles.

Cheers,
David.
Arnold B. Kruege...
post Jul 15 2010, 17:39
Post #12





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (2Bdecided @ Jul 15 2010, 06:03) *
It'll certainly cause audible problems.

What's not clear is whether the severity of audible problems after, say, 100 encode/decode cycles is in any way related to, or correlated with, the number of "problem samples" that reveal audible problems after a single encode/decode, or the severity of the audible problems on those samples.


I would suggest that using 100 encode/decode cycles as your critical point for the suggested approach is more than a little extreme.

What would be a reasonable number of iterations? 5? 10? 20?

When we studied things like this at university, back when dinosaurs roamed the earth and vinyl, tubes, and mag tape were all we had, we started out looking at the first few iterations.

Let's assume that there is only mild degradation in each iteration.

Then the result of the first iteration is a good copy of the input plus some small error signal:

output = input + f(input). Here f(input) is the error, and it could be just about any kind of error we can think of.

The second time through, output = input + f(input) + f(input) + f(f(input)), assuming f is roughly additive for small errors. Since the error is small, the f(f(input)) term is second-order and we can simplify this to output = input + 2 f(input).

Repeat as needed. The error increases linearly with the number of repetitions.
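[Editor's note: the first-order arithmetic above checks out numerically when f really is small and linear. The toy error operator f(x) = eps*x below is an assumption made for this sketch; nothing says a real perceptual coder behaves this way, which is exactly the point debated in the following posts.]

```python
# Numeric check of the linear-accumulation model under its assumptions:
# f is linear AND small, here f(x) = eps * x with eps = 1%.
eps = 0.01
f = lambda x: eps * x

x = 1.0
out = x
for _ in range(10):        # ten encode/decode generations
    out = out + f(out)     # each pass adds its own error term

first_order = x + 10 * f(x)            # "error grows linearly" model
assert abs(out - first_order) < 1e-2   # agrees to first order in eps
```

The exact result compounds like (1 + eps)^10, so the linear model is only a first-order approximation even in this friendliest case.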

QUOTE
It could be that the "best" encoder, with zero problem samples at a given bitrate, was found to sound worst after 100 encode/decode cycles.


Can you provide a convincing argument for this assertion?
pdq
post Jul 15 2010, 17:46
Post #13





Group: Members
Posts: 3374
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



In the analog world what you are saying is very reasonable.

However, in the digital world of perceptual encoding it may be very different. In attempting to encode artefacts from the previous encode does it simply make the same artefacts twice as big? And can you generalize this to all encoders of all codecs?
Arnold B. Kruege...
post Jul 16 2010, 07:53
Post #14





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (pdq @ Jul 15 2010, 12:46) *
In the analog world what you are saying is very reasonable.

However, in the digital world of perceptual encoding it may be very different.


Or not. Your speculations are somehow more reliable than math?

QUOTE
In attempting to encode artefacts from the previous encode does it simply make the same artefacts twice as big? And can you generalize this to all encoders of all codecs?


The math is classic and irrefutable as far as it goes. If you want to attack the hypothesis, you have to refute its assumptions.

Got game at the same level as the hypothesis? ;-)
lvqcl
post Jul 16 2010, 12:33
Post #15





Group: Developer
Posts: 3336
Joined: 2-December 07
Member No.: 49183



QUOTE
I've also used another technique - encode and decode a sample over and over again. This works very well with hardware listening tests.


I tried this with LossyWAV (0.wav is the original file; 1.wav, 2.wav, etc. are subsequent generations).

foo_bitcompare shows the number of differing samples:

0.wav and 1.wav: Differences found: 16513693 sample(s)
1.wav and 2.wav: Differences found: 1103325 sample(s)
2.wav and 3.wav: Differences found: 27242 sample(s)
3.wav and 4.wav: Differences found: 777 sample(s)
4.wav and 5.wav: No differences in decoded data found.

Oops. ;)

This post has been edited by lvqcl: Jul 16 2010, 12:34
Arnold B. Kruege...
post Jul 16 2010, 14:08
Post #16





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (lvqcl @ Jul 16 2010, 07:33) *
I tried this with LossyWAV (0.wav is the original file; 1.wav, 2.wav, etc. are subsequent generations).

foo_bitcompare shows the number of differing samples:

0.wav and 1.wav: Differences found: 16513693 sample(s)
1.wav and 2.wav: Differences found: 1103325 sample(s)
2.wav and 3.wav: Differences found: 27242 sample(s)
3.wav and 4.wav: Differences found: 777 sample(s)
4.wav and 5.wav: No differences in decoded data found.

Oops. ;)


That suggests rather strongly that lossyWAV violates the assumption that the number of errors in the initial pass is small.

I would be surprised if 0.wav and 1.wav are difficult to distinguish from each other.

Can you post them?
pdq
post Jul 16 2010, 14:15
Post #17





Group: Members
Posts: 3374
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



QUOTE (Arnold B. Krueger @ Jul 16 2010, 09:08) *
That suggests rather strongly that lossyWAV violates the assumption that the number of errors in the initial pass is small.

I would be surprised if 0.wav and 1.wav are difficult to distinguish from each other.

Can you post them?

All it suggests is that with very small adjustments in the lowest bits of each 16 bit value the data become much easier to encode losslessly. Why would you think that the difference is necessarily audible?
lvqcl
post Jul 16 2010, 14:38
Post #18





Group: Developer
Posts: 3336
Joined: 2-December 07
Member No.: 49183



QUOTE (Arnold B. Krueger @ Jul 16 2010, 17:08) *
I would be surprised if 0.wav and 1.wav are difficult to distinguish from each other.

Can you post them?

Sorry, but they are too long (about 5 minutes each).

What about a simplified version that just zeroes the 2 least significant bits of each sample? The second generation is apparently equal to the first, etc.
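[Editor's note: the simplified scheme is easy to verify in code - zeroing the low bits is idempotent, so every generation after the first is bit-identical. The function name and sample values below are invented for the sketch.]

```python
# Zeroing the 2 LSBs of each 16-bit sample value: the first pass loses
# information, but re-applying it changes nothing, since the cleared bits
# are already zero.
def zero_two_lsbs(samples):
    return [s & ~0b11 for s in samples]

pcm = [12345, -32768, 32767, 7, -6, 0]   # arbitrary 16-bit sample values
gen1 = zero_two_lsbs(pcm)
gen2 = zero_two_lsbs(gen1)

assert gen1 != pcm     # the first pass changes the data...
assert gen2 == gen1    # ...but every later pass is a no-op
```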
2Bdecided
post Jul 16 2010, 16:15
Post #19


ReplayGain developer


Group: Developer
Posts: 5060
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Arnold B. Krueger @ Jul 15 2010, 17:39) *
When we studied things like this at university back when dinosaurs roamed the earth...
;)
QUOTE
...and vinyl, tubes, and mag tape were all we had; we started out looking at the first few iterations.

Let's assume that there is only mild degradation in each iteration.

Then the result of the first iteration is a good copy of the input plus some small error signal:

output = input + f(input). f(input) is the error and could be just about any kind of error we can think of.

The second time through, output = input + f(input) + f(input) + f(f(input)). Since the error is small, we can simplify this to output = input + 2 f(input).

Repeat as needed. The error increases linearly with the number of repetitions.
You can't do that though - not with psychoacoustic codecs. The process isn't linear, and the error isn't small.

e.g. think about time domain smearing. That just spreads and spreads and spreads. Your arithmetic "output = input + 2 f(input)" doesn't work at all.


btw, I'm not sure it always worked for analogue. You had to apply some common sense even back then.

e.g. It doesn't work for VHS tape. There are lots of dynamic processes in recording and playback. At some point, the sync pulses get lost, and there's no picture at all. Let's say that happens after 10 generations. Your arithmetic says the error after one generation is only 1/10th of this. Yet it also happens after 100 generations. Your arithmetic says the error after one generation is only 1/100th of this.

I'm not sure one generation of VHS gives you 1/10th or 1/100th of total signal loss.

Cheers,
David.
Arnold B. Kruege...
post Jul 16 2010, 16:38
Post #20





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (pdq @ Jul 16 2010, 09:15) *
All it suggests is that with very small adjustments in the lowest bits of each 16-bit value the data become much easier to encode losslessly. Why would you think that the difference is necessarily audible?


It suggests that as well.

It's a pathological example. One could cut to the chase and simply strip off the low-order bit. There would be no changes at all in successive passes, since the bit is already gone and there can be no other changes in successive passes.

Knowing that this is the nature of the degradation, an intelligent person would act wisely and prepare the progressively degraded samples by stripping off an *additional* low-order bit on each pass. At least, that's what I did when I built the samples for www.pcabx.com .
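[Editor's note: a sketch of that progressive recipe. The function name and data are invented for illustration; this is not the code actually used to prepare the pcabx.com samples.]

```python
# Strip one additional low-order bit per pass: unlike re-stripping the
# same bit (a no-op after the first pass), each generation here is
# measurably worse than the last.
def strip_low_bits(samples, nbits):
    mask = ~((1 << nbits) - 1)
    return [s & mask for s in samples]

pcm = list(range(-1000, 1000, 7))                # stand-in PCM data
generations = [strip_low_bits(pcm, n) for n in range(1, 6)]

# every pass changes something new
for earlier, later in zip(generations, generations[1:]):
    assert earlier != later
```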

I'm sure that Forrest Gump had a saying that described this pathological example. ;-)

If people rejected every idea for which there was a pathological counter-example, we would probably still be back in dark ages.

2Bdecided
post Jul 16 2010, 17:11
Post #21


ReplayGain developer


Group: Developer
Posts: 5060
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Arnold B. Krueger @ Jul 16 2010, 16:38) *
If people rejected every idea for which there was a pathological counter-example, we would probably still be back in dark ages.
Funny how the examples that disprove your theories are always "pathological", "straw men", "extreme" etc etc.

If people always held on to theories that were disproved, we would probably still be back in the dark ages.

Cheers,
David.
Alex B
post Jul 16 2010, 18:21
Post #22





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



Assuming the goal is to test modern encoders at settings intended to produce transparent quality, it doesn't make sense to artificially reinforce the inaudible artifacts. Developers improve lossy encoders only to the point where the artifacts become practically inaudible. Once the handling of a certain kind of signal is tweaked to be good enough, the next goal is normally to fix newly found audible artifacts that occur with different problem samples, not to further reduce problems that are already fixed.

The only sensible way to test nearly transparent encoders is to find a good selection of problem samples that can produce useful test results when the encoders are used in the normal, intended way.


--------------------
http://listening-tests.freetzi.com
Arnold B. Kruege...
post Jul 16 2010, 18:33
Post #23





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (Alex B @ Jul 16 2010, 13:21) *
Assuming the goal is to test modern encoders using settings that are intended to produce transparent quality, it doesn't make sense to artificially reinforce the inaudible artifacts.


The problem with this approach is that it seems to presume that any given artifact is either inaudible to everybody or audible to everybody.

IOW, there are only two pigeonholes, every artifact goes into one or the other, and it goes into the same pigeonhole for everybody.

Reality is that our ability to hear artifacts varies from person to person, and also varies for the same person at different times.

Our highest goal would probably be that no artifact would be heard by anybody at any time.

How do we test encoders so that both the bitrate and the number of artifacts are minimized?

How do we test a coder with a finite number of tests that take a reasonable amount of time, and still gain high confidence that the coder never creates an audible artifact?

With amplifiers, we can say that if all nonlinear distortion and other artifacts are 100 dB down, then nobody can ever hear any artifacts. We then build amplifiers with all artifacts 110 dB down.

How do you do the same thing for encoders, and do so with any confidence?
Arnold B. Kruege...
post Jul 16 2010, 18:50
Post #24





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (lvqcl @ Jul 16 2010, 09:38) *
Sorry, but they are too long (about 5 minutes each).

What about a simplified version that just zeroes the 2 least significant bits of each sample? The second generation is apparently equal to the first, etc.


Oh, you want to change the basic nature of the examples?

I don't see what that would show, other than that you need to make basic changes to your examples in order to demonstrate your point.

What about posting shorter but useful subsets of the 5 minute selections?

One other comment: the criterion you used - any change in any part of any sample - seems to be far more nonlinear than hearing. Your example probably fails to apply because of that.
Arnold B. Kruege...
post Jul 16 2010, 19:36
Post #25





Group: Members
Posts: 3649
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (pdq @ Jul 15 2010, 12:46) *
In attempting to encode artefacts from the previous encode does it simply make the same artefacts twice as big?



I'm presuming that it does not make artifacts twice as big. I'm assuming that it makes the same audible artefacts as it did the first time, and they in some sense add.

QUOTE
And can you generalize this to all encoders of all codecs?


*All* is a very big word, so the answer has to be "maybe".

This was intended to be one of those things that you try and see what happens.
