Pattern-Configurable Duplicate Music Finder/Remover
Marc27
post Jan 28 2013, 22:23
Post #1








I'm aware of applications such as Similarity that scan your media library and detect duplicated music based on different criteria, such as checksums or information stored in ID3 tags. But they are limited in scope, as they don't provide the means to fine-tune the criteria or define custom ones. You can expect better results if the criteria are fine-tuned to the categorization, naming and tagging schemes (to name a few) used in your media library.
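To make the idea of configurable criteria concrete, here is a minimal sketch in Python. It assumes the tags have already been read into plain dictionaries; the names `match_key`, `group_duplicates` and `DEFAULT_STRIP`, and the two example patterns, are made up for illustration and do not come from any existing tool.

```python
import re
from collections import defaultdict

# Hypothetical default rules: text to ignore inside a tag value before comparing.
DEFAULT_STRIP = (
    r"\([^)]*remaster[^)]*\)",  # e.g. "(2009 Remaster)"
    r"\[[^\]]*\]",              # e.g. "[Bonus Track]"
)

def match_key(tags, fields=("artist", "title"), strip=DEFAULT_STRIP):
    """Build a comparison key from the chosen tag fields after normalisation."""
    parts = []
    for field in fields:
        value = tags.get(field, "").lower()
        for pattern in strip:
            value = re.sub(pattern, "", value)
        # Collapse punctuation and whitespace so "Help!" and "help" compare equal.
        value = re.sub(r"\W+", " ", value).strip()
        parts.append(value)
    return tuple(parts)

def group_duplicates(library, **rules):
    """Group file paths whose keys match; `library` maps path -> tag dict."""
    groups = defaultdict(list)
    for path, tags in library.items():
        groups[match_key(tags, **rules)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Changing `fields` or `strip` changes what counts as a duplicate, which is the kind of fine-tuning I mean: a library tagged with "(Remastered)" suffixes needs different patterns than one that keeps the release year in the album field.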

From an informational perspective, it would also be interesting to know, as accurately as possible, how much redundancy there is in your music collection, for example if you have large VA compilations, different releases of the same album (first masters, remasters, vinyl), CD singles and maxi CDs, radio shows and live sessions. In a broader sense, and for purely informational purposes, it would be interesting to know as well what the level of redundancy would be if you considered remixes or different versions of the same track as redundant copies (aren't they, to some degree?). For example, if you are a Beatles fan and you have all the possible versions of a given album, or even of a single track, including bootlegs, that track would have a higher "redundancy level" in your music collection. Which tracks, albums or artists have the highest "redundancy level" in your media library?
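One possible way to put a number on this, as a rough sketch only: count how many copies share the same key, and define the library-wide redundancy as the share of tracks that are repeat copies. Both the function names and that definition are assumptions for illustration; the default key (lower-cased artist and title) is just a placeholder for whatever criteria you configure.

```python
from collections import Counter

def copy_counts(tracks, key=lambda t: (t["artist"].lower(), t["title"].lower())):
    """Count how many tracks share each key; the count is the 'redundancy level'."""
    return Counter(key(t) for t in tracks)

def library_redundancy(counts):
    """Share of tracks that are repeat copies: 1 - unique / total (assumed definition)."""
    total = sum(counts.values())
    return 0.0 if total == 0 else 1 - len(counts) / total
```

`counts.most_common(10)` would then list the ten tracks with the highest redundancy level; keying on artist or album instead gives the same ranking at those levels.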
Porcus
post Jan 6 2015, 00:21
Post #2








Duplicate finders have been a recurring topic, with lots of recommendations which I hope are outdated as of 2015, now that a lot of utilities use musical fingerprints.

This "Pattern-Configurable" was likely the topic title closest to what I am looking for:


-> Check bit-by-bit by audio content
... and warn if one is corrupted;
If file1.MP3 and file2.MP3 are the same but one has wrong length information, then I also want to know - I have made the mistake of rewriting noncompliant mp3 headers and losing gaplessness.
If flac (Reference) decodes file7.FLAC and file8.FLAC to the same audio but tells me that file7.FLAC is invalid, then most likely file8.FLAC is a transcode and I might not want to destroy the evidence.
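As a rough sketch of the "compare by audio content" part, the following hashes only the PCM frames and the sample format of a WAV file using Python's standard `wave` and `hashlib` modules, so two files that differ only in metadata chunks get the same digest. It assumes the files have already been decoded to WAV by a decoder of your choice; it does not detect corruption or wrong length information by itself, and `audio_digest` is a made-up name.

```python
import hashlib
import wave

def audio_digest(path, chunk_frames=65536):
    """SHA-256 over sample format and PCM frames of a WAV file, ignoring metadata."""
    digest = hashlib.sha256()
    with wave.open(path, "rb") as wav:
        # Include the format so 16-bit mono and 8-bit stereo data cannot collide.
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        digest.update(repr(fmt).encode("ascii"))
        while True:
            frames = wav.readframes(chunk_frames)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()
```

Files with equal digests are bit-identical in audio content; grouping a library by this digest gives the exact-duplicate candidates, and any decoder warnings would have to be collected separately.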


-> Check bit-by-bit for extract ("substring") or overlap?
Is audio stream A an extract (time T1 to T2) of stream B? Are streams A and B equal except that A has some extra samples at the beginning and B at the end? Are these zeroes? (Offset-correction, y'know ...)
This should not be too hard given that one can reduce the number of possible matches by fingerprinting.
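A minimal sketch of the extract check, assuming both streams are already decoded to raw PCM bytes with the same sample format and that fingerprinting has narrowed things down to one candidate pair. `find_extract` and `trim_silence` are hypothetical names; "zero" here means digital silence in signed PCM (8-bit WAV is unsigned, so silence is not zero there).

```python
def find_extract(a: bytes, b: bytes, frame_size: int):
    """Return the frame offset at which stream `a` occurs inside `b`, or None."""
    if not a or len(a) % frame_size or len(b) % frame_size:
        raise ValueError("streams must be non-empty whole frames")
    pos = b.find(a)
    while pos != -1:
        # Only accept matches that start on a frame boundary.
        if pos % frame_size == 0:
            return pos // frame_size
        pos = b.find(a, pos + 1)
    return None

def trim_silence(pcm: bytes, frame_size: int):
    """Strip all-zero frames from both ends.

    Returns (leading_zero_frames, trimmed_pcm, trailing_zero_frames).
    """
    zero = bytes(frame_size)
    n = len(pcm) // frame_size
    start = 0
    while start < n and pcm[start * frame_size:(start + 1) * frame_size] == zero:
        start += 1
    end = n
    while end > start and pcm[(end - 1) * frame_size:end * frame_size] == zero:
        end -= 1
    return start, pcm[start * frame_size:end * frame_size], n - end
```

Trimming both streams first and then comparing the trimmed data answers the "same except for zero padding" question; calling `find_extract` in both directions covers the case where either stream could be the extract. The overlap case (A has extra samples in front, B at the end) would need a search over a prefix of one stream, which this sketch leaves out.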


-> Relax bit-exactness only-so-slightly
(Is file.WAV really a decoded file.MP3? One may want to tolerate roundoff error, but nothing more than that ... and possibly some extra samples at the beginning or end due to the gaplessness issue, right?)
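The relaxed comparison could look roughly like this, assuming both files are decoded to sequences of integer sample values. The tolerance of one step and the `max_shift` parameter are illustrative assumptions, not measured values, and `nearly_equal` is a made-up name. The loop is brute force, so it is only sensible for small shift ranges.

```python
def nearly_equal(a, b, tolerance=1, max_shift=0):
    """Return the shift (in samples) at which `a` and `b` agree within
    `tolerance`, or None. A positive shift means `a` has extra leading samples."""
    # Try no shift first, then progressively larger ones.
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        x = a[shift:] if shift >= 0 else a
        y = b if shift >= 0 else b[-shift:]
        # After alignment, allow at most `max_shift` leftover samples at the end.
        if abs(len(x) - len(y)) > max_shift:
            continue
        if all(abs(p - q) <= tolerance for p, q in zip(x, y)):
            return shift
    return None
```

With `tolerance=0` and `max_shift=0` this falls back to the strict bit-by-bit check, so the same function covers both the exact and the slightly relaxed case.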

This post has been edited by Porcus: Jan 6 2015, 00:23
