True FLAC vs. Fake FLAC
greynol
post Oct 12 2011, 05:22
Post #51





Group: Super Moderator
Posts: 10338
Joined: 1-April 04
From: San Francisco
Member No.: 13167



I've seen examples of both false positives and false negatives.


--------------------
Your eyes cannot hear.
saratoga
post Oct 12 2011, 05:44
Post #52





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Joseph93 @ Oct 12 2011, 00:13) *
3 things:

1. hi I am new

2. I recently stumbled across a paper which details an algorithm that, with a very high success rate, guesses the bit rate of an audio file using only data from the file's high-frequency spectrum. If developed further, it could remove the need to visually inspect a spectrogram etc. and would be much faster.

http://www.fileden.com/files/2009/2/14/232...20Frequency.pdf


QUOTE
In order to obtain the feature data, the source MP3 files were each
decompressed into a 1411 kbps WAV file using the Fraunhofer
IIS MP3 Surround Commandline Decoder V1.4 [2]. This was
done because audio files in this format can easily be read into
MATLAB, and as we have demonstrated, transcoding to a higher
bit rate does not affect the frequency characteristics of the audio
which we are observing.


It is, in my opinion, not a good sign when the author of a paper does not understand that WAV is a lossless format and so resorts to arguing that "transcoding" to PCM probably doesn't change the audio. Regardless, all that paper demonstrates is that if you know that LAME 3.97 was used with default lowpass for each bitrate, you can figure out the source bitrate by looking at the lowpass setting.
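
The lowpass point can be made concrete with a rough sketch. The code below is not the paper's method or any existing tool. It is a toy that assumes a short mono frame of samples, a naive DFT, and a hypothetical -60 dB threshold; all function names are made up for illustration. It only reports the highest frequency that still carries energy, which is the feature the classifier is effectively keying on.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for short frames)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags

def estimate_cutoff_hz(samples, sample_rate, floor_db=-60.0):
    """Highest frequency whose magnitude is within floor_db of the peak.

    floor_db is an assumed threshold, not a value taken from the paper.
    """
    mags = magnitude_spectrum(samples)
    peak = max(mags)
    if peak == 0:
        return 0.0
    threshold = peak * 10 ** (floor_db / 20.0)
    highest = 0
    for k, m in enumerate(mags):
        if m >= threshold:
            highest = k
    return highest * sample_rate / len(samples)

def make_tones(freqs, sample_rate, n):
    """Synthetic test frame: a sum of sine tones (hypothetical helper)."""
    return [sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs)
            for t in range(n)]

# A frame with nothing above 15 kHz versus one with content at 20 kHz.
lowpassed = make_tones([1000, 8000, 15000], 44100, 441)
full_band = make_tones([1000, 20000], 44100, 441)
print(estimate_cutoff_hz(lowpassed, 44100), estimate_cutoff_hz(full_band, 44100))
```

A cutoff found this way says nothing about which encoder, if any, produced it. Mapping it to a bitrate requires knowing the encoder's lowpass defaults in advance.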

Joseph93
post Oct 12 2011, 06:19
Post #53





Group: Members
Posts: 4
Joined: 12-October 11
Member No.: 94299



QUOTE (saratoga @ Oct 12 2011, 05:44) *
It is, in my opinion, not a good sign when the author of a paper does not understand that WAV is a lossless format and so resorts to arguing that "transcoding" to PCM probably doesn't change the audio. Regardless, all that paper demonstrates is that if you know that LAME 3.97 was used with default lowpass for each bitrate, you can figure out the source bitrate by looking at the lowpass setting.

So does the paper actually not do what it claims it does? I don't see any reliance on prior knowledge concerning the "history" of the file in question.
saratoga
post Oct 12 2011, 06:49
Post #54





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Joseph93 @ Oct 12 2011, 01:19) *
So does the paper actually not do what it claims it does?


It does what they claim: take a known encoder and version, then determine what bitrate was used. It doesn't do what people in this thread are interested in, though.

QUOTE (Joseph93 @ Oct 12 2011, 01:19) *
I don't see any reliance on prior knowledge concerning the "history" of the file in question.


Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless.
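
To illustrate why the training setup matters, here is a toy sketch. It assumes the only feature is the lowpass cutoff and uses a nearest-mean lookup. The paper may well use a different classifier, and the cutoff numbers below are invented for illustration; they are not actual LAME 3.97 defaults.

```python
from collections import defaultdict

def train(examples):
    """Average the observed cutoff (Hz) for each bitrate label."""
    sums = defaultdict(lambda: [0.0, 0])
    for cutoff_hz, bitrate in examples:
        sums[bitrate][0] += cutoff_hz
        sums[bitrate][1] += 1
    return {b: total / count for b, (total, count) in sums.items()}

def predict(model, cutoff_hz):
    """Return the bitrate whose mean training cutoff is nearest."""
    return min(model, key=lambda b: abs(model[b] - cutoff_hz))

# Invented (cutoff_hz, bitrate) pairs standing in for files that were all
# made with one encoder and one set of settings.
TRAINING = [
    (16000, 128), (16100, 128),
    (19000, 192), (19100, 192),
    (20400, 320), (20500, 320),
]
model = train(TRAINING)

print(predict(model, 16050))  # a file from the same encoder family
print(predict(model, 22050))  # a full-bandwidth file that was never lossy
```

Note that predict() has no "not transcoded" answer. A file with energy all the way up to 22050 Hz still gets a bitrate label, and so does a file from another encoder whose lowpass defaults differ from the training set.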
Joseph93
post Oct 12 2011, 07:01
Post #55





Group: Members
Posts: 4
Joined: 12-October 11
Member No.: 94299



QUOTE
It doesn't do what people in this thread are interested in though.


Once you have an algorithm which can estimate the original bit rate of a file transcoded to a lossless format, getting a yes/no answer to the question "are my flacs 'real'?" seems trivial.


QUOTE
Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless.

Correct.

I still don't see any reliance on prior knowledge concerning the "history" of the file in question. Of course any algorithm of this nature needs information about what it is looking for!
saratoga
post Oct 12 2011, 07:09
Post #56





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Joseph93 @ Oct 12 2011, 02:01) *
QUOTE
It doesn't do what people in this thread are interested in though.


Once you have an algorithm which can estimate the original bit rate of a file transcoded to a lossless format, getting a yes/no answer to the question "are my flacs 'real'?" seems trivial.


If you know that the file was encoded with a given LAME version, then you already know that the answer to the question "are the flacs that I've created from my LAME mp3s 'real'?" is "No".

That said, you are correct that determining whether the output of a given LAME version will be lossy is quite trivial.

QUOTE (Joseph93 @ Oct 12 2011, 02:01) *
QUOTE
Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless.

Correct.

I still don't see any reliance on prior knowledge concerning the "history" of the file in question. Of course any algorithm of this nature needs information about what it is looking for!


As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question.
Joseph93
post Oct 12 2011, 07:28
Post #57





Group: Members
Posts: 4
Joined: 12-October 11
Member No.: 94299



QUOTE
If you know that the file was encoded with a given LAME version, then you already know that the answer to the question "are the flacs that I've created from my LAME mp3s 'real'?" is "No".


Yes but surely the OP was referring to a situation where the file history is unknown. (Suggest reading OP.) Even in this case the algorithm should be able to take arbitrary WAVs and, if they are indeed transcodes, guess their original bitrate with a great deal of accuracy.


QUOTE
As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question.

No. The prior knowledge is the information about the frequency characteristics of lame-encoded mp3s. What I am saying is that once you have trained the algorithm, you can then take arbitrary WAVs (or flacs, or whatever) and use the algorithm on them. This is pretty standard. Train an algorithm with a set of inputs, then give it arbitrary inputs and see how it does. 100% accuracy cannot be expected.
saratoga
post Oct 12 2011, 07:41
Post #58





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Joseph93 @ Oct 12 2011, 02:28) *
Yes but surely the OP was referring to a situation where the file history is unknown.


In this case you cannot use this software. The authors have demonstrated identification of files that are known a priori to be transcoded by incorporating that knowledge into their algorithm. Hence my point above that it's not useful for what people in this thread want to do.

QUOTE (Joseph93 @ Oct 12 2011, 02:28) *
(Suggest reading OP.)


No need to get angry at me. I'm not attacking you, I'm just trying to lead you towards an understanding of why what you are proposing does not work.

QUOTE (Joseph93 @ Oct 12 2011, 02:28) *
Even in this case the algorithm should be able to take arbitrary WAVs


It should? The authors certainly haven't demonstrated that. In fact they are quite clear that they have not chosen arbitrary wav files.

QUOTE (Joseph93 @ Oct 12 2011, 02:28) *
QUOTE
As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question.

No. The prior knowledge is the information about the frequency characteristics of lame-encoded mp3s.


I'm not being condescending, but read more carefully: there is a LOT more prior information being used here. The procedure explains that the training set and the unknown set were encoded with identical settings and encoder. This is not by chance. The authors have not accidentally made their problem extremely easy compared to the one you want to solve.

QUOTE (Joseph93 @ Oct 12 2011, 02:28) *
What I am saying is that once you have trained the algorithm, you can then take arbitrary WAVs (or flacs, or whatever) and use the algorithm on them.


Ignoring for a moment what is actually going on, if this actually worked, why do you think the authors decided not to show that this was possible? Perhaps they were concerned about making their paper too exciting ;)
r0k
post Oct 12 2011, 12:37
Post #59





Group: Members
Posts: 74
Joined: 8-September 11
Member No.: 93574



QUOTE (greynol @ Oct 12 2011, 06:22) *
I've seen examples of both false positives and false negatives.

I guess when a full CD is identified as CDDA it's safe to consider it an original and not an MP3 reconstruct. This might be the reason why Tau Analyzer only works for a full CD and not an individual file.

OK, I'll go a bit off topic for the last time (I hope ;) ). I have received the answer from Qobuz. They have checked the file and think it has been through some MPA compression. They will ask the producer for a true original and offered me a free album to compensate. They were pretty quick to react too. Good point for them :)
mjb2006
post Oct 12 2011, 22:03
Post #60





Group: Members
Posts: 860
Joined: 12-May 06
From: Colorado, USA
Member No.: 30694



QUOTE (r0k @ Oct 12 2011, 05:37) *
I have received the answer from Qobuz. They have checked the file and think it has been through some MPA compression. They will ask the producer for a true original and offered me a free album to compensate. They were pretty quick to react too. Good point for them :)

Many people, even many musicians, simply can't hear the difference between a lossy version and the original, which shouldn't be surprising, given the robustness of the lossy formats and all the listening tests we're familiar with. Some/many are also just not very technologically savvy. It really would not surprise me to find out, then, that artists or their representatives wouldn't necessarily even know that MP3/AAC/whatever is lossy at all, or recognize that once something is lossily encoded, there's no going back, even if they have a converter that turns their MP3s back into the WAVs or AIFFs needed by their labels and distributors. That's one of the reasons transcodes happen in general; people think "if higher bitrates mean higher quality, then I'll just convert this 128 kbps MP3 to a 320 kbps one! or maybe I'll just convert it to WAV and it'll be perfect!"
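
The "there's no going back" point can be shown with a deliberately crude sketch. Coarse rounding stands in for a lossy encoder here; a real MP3 encoder works very differently, and the function names and sample values are made up for illustration. What carries over is only the principle: a lossless step stores exactly what it is given, so it cannot restore what the lossy step threw away.

```python
def lossy_encode(samples, step):
    """Stand-in for a lossy codec: round each sample to a coarse grid."""
    return [round(s / step) * step for s in samples]

def lossless_copy(samples):
    """Stand-in for MP3 -> WAV/FLAC: keeps exactly what it is given."""
    return list(samples)

original = [3, 14, 15, 92, 65, 35, 89, 79]

lossy = lossy_encode(original, step=10)      # the "128 kbps MP3"
as_wav = lossless_copy(lossy)                # "convert it to WAV"
re_encoded = lossy_encode(lossy, step=1)     # "convert it to 320 kbps"

print(as_wav == lossy, as_wav == original)
print(re_encoded == lossy, re_encoded == original)
```

Both "upgrades" reproduce the degraded samples, not the originals.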
testyou
post Oct 13 2011, 09:52
Post #61





Group: Members
Posts: 99
Joined: 24-September 10
Member No.: 84113



QUOTE
Some/many are also just not very technologically savvy.

Unfortunately, this is very true.
I have contacted several artists that I follow on Soundcloud about this.
Knowledge of lossy/lossless encoding is not a prerequisite to creating music, and is sometimes not learned.
r0k
post Oct 13 2011, 15:31
Post #62





Group: Members
Posts: 74
Joined: 8-September 11
Member No.: 93574



And then, there are people who are not even musicians or technicians involved at some point.
Hopefully, it will change now that companies selling FLAC are starting to appear throughout the web. Unfortunately there are still too few people who are well enough aware of the issue to be careful about what they buy.
1 year ago I was still ripping to MP3 and burning those back on CDs when I wanted to copy a disc!

It's up to us to spread the word now :)
knutinh
post Oct 16 2011, 00:20
Post #63





Group: Members
Posts: 570
Joined: 1-November 06
Member No.: 37047



I have noticed that mp3s commonly add a large error to the input signal - when the error is estimated visually by looking at waveforms, not when you are listening to the decoded file (which is what the format is really made for).

In my simplified understanding, this can be interpreted as signal-dependent narrow-band noise insertion (psy-model guided quantization of subbands), and perhaps phase error? Are there no known mechanisms to guesstimate that such an error was inserted at one stage? I believe that natural music commonly contains spectrally sparse content (pure harmonic waveforms) or temporally coherent impulses (at least when rising in level). Can one not search for such things in a file, and find traces commonly attributed to mp3 encoding and with little chance of being generated via other means?

-k
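
For reference, the "error" described here is just the sample-by-sample difference between the original and the decoded file. A minimal sketch follows, assuming both signals are already time-aligned lists of samples (decoded MP3s are typically offset by encoder/decoder delay, which would have to be removed first). The function names are hypothetical. Note that this needs the original, which is exactly what is missing when checking a FLAC of unknown history.

```python
import math

def error_signal(original, decoded):
    """Per-sample difference: what the codec added to or removed from the input."""
    return [d - o for o, d in zip(original, decoded)]

def snr_db(original, decoded):
    """Signal-to-noise ratio of the decoded file relative to the original, in dB."""
    err = error_signal(original, decoded)
    signal_power = sum(o * o for o in original)
    noise_power = sum(e * e for e in err)
    if noise_power == 0:
        return math.inf        # bit-identical
    if signal_power == 0:
        return -math.inf       # silence in, noise out
    return 10 * math.log10(signal_power / noise_power)

print(snr_db([10, -10], [11, -9]))
```

A low SNR measured this way does not mean the file sounds bad; the encoder tries to put the error where it is masked.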
2Bdecided
post Oct 17 2011, 10:42
Post #64


ReplayGain developer


Group: Developer
Posts: 5362
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



On exceptionally pure tone-like signals you can see the shape of the codec's noise skirting it.

I can't think of many recordings that contain spectrally sparse content. I went looking for some once when trying to assess the audibility of distortion. Even a solo flute or violin or piano is too rich to spot masking noise at high bitrates, and with most pop, rock, jazz etc you can forget it completely.

Pure impulses are a great test signal for ID-ing codecs, but only synthetic signals are known to be clean. With anything else, the pre-echo is usually lost in the other instruments. You could find it in isolation in some recordings, and maybe make a judgement that nothing else had caused it, but it doesn't sound like something you could automate.

If the recording you want to check contains no ultra-pure tone and no isolated impulse-like sounds, then this task is impossible IMO. You can't see the coding noise in the coded version when looking in the waveform view or the spectral view. Apart from the low pass, some of the common lossy distortions can be easier to hear than see.

Cheers,
David.
astroidmist
post Oct 21 2011, 23:00
Post #65





Group: Members
Posts: 34
Joined: 10-January 11
Member No.: 87208



QUOTE (Northpack @ Oct 8 2011, 05:00) *
QUOTE (astroidmist @ Oct 8 2011, 05:41) *
There used to be a freeware DOS command line/console program that could examine a WAV and tell if it came from an MP3.

It's called AuCDtect and does a spectrum analysis looking for patterns introduced by lossy compression. It works very well, you'll hardly ever get a false negative. There's a Windows frontend called Tau Analyzer and even a foobar plugin, all available here: http://en.true-audio.com


OK thanks very much for that. I haven't been able to find that program for years!


--------------------
opinion is not fact
djchristian
post Jan 6 2012, 03:50
Post #66





Group: Members
Posts: 43
Joined: 22-December 09
Member No.: 76233



QUOTE (_mē_ @ Oct 11 2011, 21:56) *
By false positive I mean that the file is fine, but detected as lossy.

Tau Analyzer uses the same engine as auCDtect, so you can decode to wav and use auCDtect directly or automate it with fooCDtect or another frontend.


Where can I download fooCDtect?
db1989
post Jan 6 2012, 16:26
Post #67





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



Second result on Google (not sure which version): http://www.softpedia.com/get/Multimedia/Au...fooCDtect.shtml
Fourth result on Google (the same download is linked in a topic of our own, which is the first result): http://idle.netau.net/5099/foobar2000-fooc...i-for-aucdtect/
Ubulord
post Sep 1 2013, 11:27
Post #68





Group: Members
Posts: 6
Joined: 10-November 06
From: Portugal
Member No.: 37403



QUOTE (XeR0 @ Jun 29 2011, 18:58) *
I know the topic title sounds absolutely absurd but hear me out. I've tested my FLAC collection by encoding them into V0 MP3. I then decoded the MP3 back to WAV and then compressed it in FLAC. My question is: unless you've ripped the files yourself, how would you know the FLAC file you have is ACTUALLY lossless instead of an MP3 converted into FLAC? I've used the TEST option in FLAC frontend and it doesn't give a result. I have used Audiotester and it does say the file failed because it's TRUNCATED.

Bottom-line: Is there a sure-fire way of knowing that a FLAC file is truly lossless and not a derivative of a lossy file?


If the flacs you have correspond to a complete album, you can test it with CUETools. If you have the flacs and a ".cue" file, CUETools will tell you whether the whole rip corresponds to a rip in the CTDB or AccurateRip databases. If it does, then I think you can be sure it's all genuine. If you don't have the ".cue" file, CUETools will sometimes still be able to check, and sometimes it won't.
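
The idea behind this kind of check can be sketched as a checksum lookup. This is only a stand-in: AccurateRip and CTDB use their own checksum schemes and also deal with drive offsets, none of which is modelled here. zlib.crc32 and the function names are assumptions chosen for illustration, not what CUETools actually does.

```python
import zlib

def pcm_checksum(pcm_bytes):
    """CRC32 of the raw decoded audio (a stand-in for a rip database checksum)."""
    return zlib.crc32(pcm_bytes) & 0xFFFFFFFF

def matches_database(pcm_bytes, known_checksums):
    """True if this audio matches a checksum other people submitted from the CD."""
    return pcm_checksum(pcm_bytes) in known_checksums

# Hypothetical data: bytes standing in for a track ripped from the CD.
cd_rip = bytes(range(256)) * 4
database = {pcm_checksum(cd_rip)}

# A transcode decodes to different samples, so its checksum will not match.
transcode = cd_rip[:-1] + b"\x00"

print(matches_database(cd_rip, database), matches_database(transcode, database))
```

A match shows the audio is identical to what other people ripped from the disc. A mismatch proves nothing by itself, since a different pressing or a bad rip would also fail to match.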
claudiod
post Sep 1 2013, 12:15
Post #69





Group: Members
Posts: 10
Joined: 1-September 13
Member No.: 109908



I have two J.S. Bach CDs (Das Wohltemperierte Klavier I and II, Leonhardt, Harmonia Mundi/BMG Classics) which I believe to be legitimate, and which sound perfect.

Yet Audiochecker finds them to be 99% MPEG. It may have something to do with being a single instrument (harpsichord), so they may have used a lowpass filter to reduce noise in the higher frequencies where a cembalo is not supposed to be.

So we can't trust checking software completely.


This post has been edited by claudiod: Sep 1 2013, 12:15
Nessuno
post Sep 2 2013, 09:14
Post #70





Group: Members
Posts: 423
Joined: 16-December 10
From: Palermo
Member No.: 86562



QUOTE (claudiod @ Sep 1 2013, 13:15) *
I have two J.S. Bach CDs (Das Wohltemperierte Klavier I and II, Leonhardt, Harmonia Mundi/BMG Classics) which I believe to be legitimate, and which sound perfect.

Yet Audiochecker finds them to be 99% MPEG. It may have something to do with being a single instrument (harpsichord), so they may have used a lowpass filter to reduce noise in the higher frequencies where a cembalo is not supposed to be.

So we can't trust checking software completely.

No one around here has ever said to trust this kind of software completely in the first place.
What's more, the harpsichord is generally one of the most challenging instruments for lossy codecs and very revealing for humans in ABX tests, but those are analog recordings from the late sixties or early seventies, so I think they can hardly be considered a valid reference.


--------------------
... I live by long distance.
Alexa
post Oct 11 2013, 19:01
Post #71





Group: Members
Posts: 22
Joined: 14-March 11
Member No.: 88991



Can anybody tell if these are both fake?

[Two spectrogram screenshots, not preserved]

Well, the first one is an mp3 @ 192 kbps :)
but "aucdtect" reports MPEG 95% on the 2nd one too.

I used "spectro" to make the screenshots :P

This post has been edited by Alexa: Oct 11 2013, 19:02
saratoga
post Oct 11 2013, 22:20
Post #72





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Alexa @ Oct 11 2013, 14:01) *
Can anybody tell if these are both fake:


I doubt it. Compare the file you have to the lossless version from the CD.
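
If you do have a rip from the CD, the comparison can be sketched with Python's standard wave module. This assumes both files have already been decoded to WAV (for example with flac -d) and that they start at the same sample; a rip made with a different drive offset would need aligning first. The helper names are hypothetical.

```python
import wave

def read_pcm(path):
    """Return (channels, sample width, sample rate) and the raw PCM bytes."""
    with wave.open(path, "rb") as w:
        params = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        frames = w.readframes(w.getnframes())
    return params, frames

def same_audio(path_a, path_b):
    """True only if format and every sample are identical."""
    return read_pcm(path_a) == read_pcm(path_b)

def first_difference(path_a, path_b):
    """Byte offset of the first differing PCM byte, or None if identical."""
    _, a = read_pcm(path_a)
    _, b = read_pcm(path_b)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))
```

Something like same_audio("suspect.wav", "my_cd_rip.wav") would then answer the question directly; both file names are placeholders.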
Rescator
post Oct 12 2013, 02:44
Post #73





Group: Members
Posts: 74
Joined: 13-January 09
From: Trondheim
Member No.: 65515



And as folks pointed out earlier, lowpass does not equal lossy encoding. I tend to filter out any audio I do not find valuable to the work I'm creating; in that case the master itself could be flagged as "lossily encoded".


--------------------
"Normality exist in the minds of others, not mine!" - Rescator
Glenda
post Oct 12 2013, 10:49
Post #74





Group: Members
Posts: 67
Joined: 27-November 07
Member No.: 49067



One other reason I like engineers who use the PM2 A-D, you can't have an MP3 HDCD file.

But there is a program that was good at detecting lossy gens, called Audio Checker v1.2, by dester.
saratoga
post Oct 12 2013, 17:52
Post #75





Group: Members
Posts: 5156
Joined: 2-September 02
Member No.: 3264



QUOTE (Glenda @ Oct 12 2013, 05:49) *
you can't have an MP3 HDCD file.


Sure you can.

Edit: ah, you mean mp3 transcoded to FLAC with HDCD.

This post has been edited by saratoga: Oct 12 2013, 18:26
