Reading Flac tags using C#
pkfox
post May 1 2014, 17:05
Post #1





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



Hi all, this is my first post, so I don't know whether this is the correct forum for it. I would like to read all my FLAC files (10,000+ of them) and extract the tag information into a database. The database side of things is not a problem; understanding the documentation for the FLAC file format is :-(. Does anyone know how to do this, preferably in C#? I can use C/C++ if necessary. I couldn't find much by Googling, so I thought I'd try here. TIA


--------------------
We can't stop here this is bat country - Hunter S Thompson RIP.
ktf
post May 1 2014, 17:33
Post #2





Group: Members
Posts: 384
Joined: 22-March 09
From: The Netherlands
Member No.: 68263



Writing your own program is much too complicated IMO. Why not use metaflac? Depending on what operating system you are on, this is very easy. So, what format do you want exactly (CSV?), how do you want to handle tags, etc.?


--------------------
Music: sounds arranged such that they construct feelings.
saratoga
post May 1 2014, 17:37
Post #3





Group: Members
Posts: 4996
Joined: 2-September 02
Member No.: 3264



Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/
nu774
post May 1 2014, 17:56
Post #4





Group: Developer
Posts: 529
Joined: 22-November 10
From: Japan
Member No.: 85902



This one is not specific to FLAC:
https://github.com/mono/taglib-sharp
ozok
post May 1 2014, 21:19
Post #5





Group: Members
Posts: 308
Joined: 9-December 12
From: Eskişehir
Member No.: 105075



Using mediainfo is another option. I remember using a .NET wrapper a long time ago. I'm not sure if this is it, though: https://code.google.com/p/mediainfo-dot-net/
pkfox
post May 1 2014, 22:49
Post #6





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (ktf @ May 1 2014, 17:33) *
Writing your own program is much too complicated IMO. Why not use metaflac? Depending on what operating system you are on, this is very easy. So, what format do you want exactly (CSV?), how do you want to handle tags, etc.?


Hi there, and thanks for your reply. I'm a software man by profession, so programming is not a problem, but I can't seem to find a definitive explanation of the structure of a FLAC file. The documentation states that the first 4 bytes of the header are supposed to read "fLaC" and then goes on to define all the "possible" other information that might or might not be there afterwards. All I need to know is where the "tags" begin and end but it doesn't seem to be written down anywhere


pkfox
post May 1 2014, 22:53
Post #7





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (saratoga @ May 1 2014, 17:37) *
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi and thanks but I need this to be hand rolled - I am completely willing to use tools but I absolutely "need" to understand the file structure to achieve what I'm thinking of doing


lvqcl
post May 1 2014, 22:59
Post #8





Group: Developer
Posts: 3397
Joined: 2-December 07
Member No.: 49183



QUOTE (pkfox @ May 2 2014, 01:49) *
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview
https://xiph.org/flac/format.html#metadata_..._vorbis_comment
saratoga
post May 1 2014, 23:32
Post #9





Group: Members
Posts: 4996
Joined: 2-September 02
Member No.: 3264



QUOTE (pkfox @ May 1 2014, 17:53) *
QUOTE (saratoga @ May 1 2014, 17:37) *
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi and thanks but I need this to be hand rolled


This is about the dumbest thing you can ever do with a media format, but the specs are on the flac website, so feel free to hang yourself :)
pkfox
post May 2 2014, 06:50
Post #10





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (saratoga @ May 1 2014, 23:32) *
QUOTE (pkfox @ May 1 2014, 17:53) *
QUOTE (saratoga @ May 1 2014, 17:37) *
Never tried it in C#, but someone once ported a FLAC encoder to C# directly (see these forums), so perhaps it included tagging. Alternatively, use the official tools from C# via P/Invoke:

http://stoyanov.in/2010/01/08/encoding-unc...with-flac-in-c/

Hi and thanks but I need this to be hand rolled


This is about the dumbest thing you can ever do with a media format, but the specs are on the flac website, so feel free to hang yourself :)


What is the dumbest thing ?


pkfox
post May 2 2014, 07:24
Post #11





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (nu774 @ May 1 2014, 17:56) *
This one is not specific to FLAC:
https://github.com/mono/taglib-sharp

Looks the best so far thank you


pkfox
post May 2 2014, 07:27
Post #12





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (lvqcl @ May 1 2014, 22:59) *
QUOTE (pkfox @ May 2 2014, 01:49) *
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview
https://xiph.org/flac/format.html#metadata_..._vorbis_comment


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.


pkfox
post May 2 2014, 07:29
Post #13





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (ozok @ May 1 2014, 21:19) *
Using mediainfo is another option. I remember using a .NET wrapper a long time ago. I'm not sure if this is it, though: https://code.google.com/p/mediainfo-dot-net/

I will definitely check it out - thanks.


ktf
post May 2 2014, 08:14
Post #14





Group: Members
Posts: 384
Joined: 22-March 09
From: The Netherlands
Member No.: 68263



QUOTE (pkfox @ May 2 2014, 08:27) *
Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Uhhh... yes, they do, you probably aren't looking closely enough.

The stream header starts with fLaC, then various metadata blocks follow. You only need the VORBIS_COMMENT block, which is block type 4, so you can skip the other blocks by reading their length from the metadata block header. So, pseudocode would be

CODE
1. read and verify the 4-byte stream marker "fLaC"
2. loop
   2a. read the 4-byte metadata block header (last-block flag, block type, block length)
   2b. check whether this is a vorbis_comment block (type 4); if yes, keep on going; if no, skip over the block length and restart the loop (or stop if the last-block flag was set: the file has no tags)
   2c. read the vorbis_comment data into your database


But still, as some here pointed out, making use of a library might be more feature complete and future proof.
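
The pseudocode above could be sketched in C# roughly as follows. This is a minimal sketch, not a complete FLAC reader: it assumes a seekable stream, and the names `FlacMetadata` and `FindVorbisCommentBlock` are made up for illustration (they do not come from any library).

```csharp
using System;
using System.IO;

public static class FlacMetadata
{
    // Hypothetical helper: returns the raw VORBIS_COMMENT payload,
    // or null if the file has no such block. Assumes a seekable stream.
    public static byte[] FindVorbisCommentBlock(Stream stream)
    {
        var reader = new BinaryReader(stream);

        // 1. Stream marker: the four ASCII bytes "fLaC".
        byte[] magic = reader.ReadBytes(4);
        if (magic.Length != 4 || magic[0] != (byte)'f' || magic[1] != (byte)'L' ||
            magic[2] != (byte)'a' || magic[3] != (byte)'C')
            throw new InvalidDataException("Not a FLAC stream.");

        // 2. Walk the metadata blocks until the one flagged as last.
        while (true)
        {
            byte[] header = reader.ReadBytes(4);
            if (header.Length != 4)
                throw new EndOfStreamException("Truncated metadata block header.");

            bool isLast = (header[0] & 0x80) != 0;  // top bit of the first byte
            int blockType = header[0] & 0x7F;       // low 7 bits of the first byte
            int length = (header[1] << 16) | (header[2] << 8) | header[3]; // 24-bit big-endian

            if (blockType == 4)                     // VORBIS_COMMENT
            {
                byte[] payload = reader.ReadBytes(length);
                if (payload.Length != length)
                    throw new EndOfStreamException("Truncated VORBIS_COMMENT block.");
                return payload;
            }

            // Not the block we want: skip its data and look at the next header.
            stream.Seek(length, SeekOrigin.Current);
            if (isLast)
                return null;                        // audio frames start here; no tags found
        }
    }
}
```

The returned bytes are still in the binary Vorbis comment layout and need a second decoding step before they can go into a database.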

This post has been edited by ktf: May 2 2014, 08:15


nu774
post May 2 2014, 10:37
Post #15





Group: Developer
Posts: 529
Joined: 22-November 10
From: Japan
Member No.: 85902



That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in a top-down manner.

For example, a table named "STREAM" contains 4 rows ("<32>", "METADATA_BLOCK", "METADATA_BLOCK*", "FRAME+").

This means that a STREAM (= the whole FLAC file) consists of a 32-bit magic number (fLaC), followed by a METADATA_BLOCK, followed by 0 or more METADATA_BLOCKs, followed by 1 or more FRAMEs.
You can read the other tables in the same way.
Each table defines a non-terminal symbol like "STREAM" as a sequence of other symbols like METADATA_BLOCK, in the described order.

* and + are often used to describe repetition (* means 0 or more, + means 1 or more). They should be familiar if you know regular expressions.
lithopsian
post May 2 2014, 13:02
Post #16





Group: Members
Posts: 192
Joined: 27-February 14
Member No.: 114718



QUOTE (pkfox @ May 2 2014, 07:27) *
QUOTE (lvqcl @ May 1 2014, 22:59) *
QUOTE (pkfox @ May 2 2014, 01:49) *
but it doesn't seem to be written down anywhere

https://xiph.org/flac/format.html#format_overview
https://xiph.org/flac/format.html#metadata_..._vorbis_comment


Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Then you didn't read it carefully enough. That page defines exactly what the format of every single bit of data in the headers of a Flac file is and does. The important ones for you are the metadata blocks, but if you want to parse the file yourself then you'll need to understand the context they sit in, that is the format of the rest of the file.

Still no need for you to parse this entirely by hand. There is a perfectly good C FLAC library which will read out metadata for you at the tag level, if existing tools are not sufficient.
lithopsian
post May 2 2014, 14:01
Post #17





Group: Members
Posts: 192
Joined: 27-February 14
Member No.: 114718



If you want a list of tags, there isn't a definitive one, but here are some useful links that cover most cases:
http://age.hobba.nl/audio/tag_frame_reference.html
http://age.hobba.nl/audio/mirroredpages/ogg-tagging.html
http://xiph.org/vorbis/doc/v-comment.html
https://wiki.xiph.org/Field_names
pkfox
post May 3 2014, 09:11
Post #18





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (nu774 @ May 1 2014, 17:56) *
This one is not specific to FLAC:
https://github.com/mono/taglib-sharp

That is what I'm using now, very impressive so far.


pkfox
post May 3 2014, 09:18
Post #19





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (ktf @ May 2 2014, 08:14) *
QUOTE (pkfox @ May 2 2014, 08:27) *
Oh I've read those, nowhere does it state the location of the data I'm looking for, thanks anyway.

Uhhh... yes, they do, you probably aren't looking closely enough.

The stream header starts with fLaC, then various metadata blocks follow. You only need the VORBIS_COMMENT block, which is block type 4, so you can skip the other blocks by reading their length from the metadata block header. So, pseudocode would be

CODE
1. read and verify the 4-byte stream marker "fLaC"
2. loop
   2a. read the 4-byte metadata block header (last-block flag, block type, block length)
   2b. check whether this is a vorbis_comment block (type 4); if yes, keep on going; if no, skip over the block length and restart the loop (or stop if the last-block flag was set: the file has no tags)
   2c. read the vorbis_comment data into your database


But still, as some here pointed out, making use of a library might be more feature complete and future proof.


Hi, I've ended up using taglib-sharp, which seems very good. When I have more time I'll dig into the code and see how they do it. I know the stream header starts with "fLaC" but where is the block type info and block size ?


lvqcl
post May 3 2014, 09:59
Post #20





Group: Developer
Posts: 3397
Joined: 2-December 07
Member No.: 49183



QUOTE (pkfox @ May 3 2014, 12:18) *
I know the stream header starts with "fLaC" but where is the block type info and block size ?


In METADATA_BLOCK_HEADER (and METADATA_BLOCK_HEADER is the first 32 bits of METADATA_BLOCK).
So you read first 32 bits from a file (they are equal to "fLaC"), then next 32 bits are METADATA_BLOCK_HEADER.

Also note that "All numbers are big-endian coded. All numbers are unsigned unless otherwise specified."
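
One exception to the big-endian rule matters for tags: inside the VORBIS_COMMENT block, the 32-bit length fields are little-endian (they follow the Vorbis comment spec), unlike the fixed-length integers in the rest of FLAC. Below is a rough C# sketch of decoding that block's data into name/value pairs. The names `VorbisComments` and `Parse` are hypothetical, and the sketch does no bounds checking on the lengths it reads, so treat it as an illustration rather than a hardened parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class VorbisComments
{
    // Hypothetical helper: payload is the data of a VORBIS_COMMENT metadata
    // block, i.e. everything after its 4-byte METADATA_BLOCK_HEADER.
    public static List<KeyValuePair<string, string>> Parse(byte[] payload)
    {
        var tags = new List<KeyValuePair<string, string>>();

        using (var reader = new BinaryReader(new MemoryStream(payload)))
        {
            // BinaryReader.ReadUInt32 reads little-endian, which is what we need here.
            uint vendorLength = reader.ReadUInt32();
            reader.ReadBytes((int)vendorLength);        // vendor string, not needed for tags

            uint commentCount = reader.ReadUInt32();
            for (uint i = 0; i < commentCount; i++)
            {
                uint length = reader.ReadUInt32();
                string entry = Encoding.UTF8.GetString(reader.ReadBytes((int)length));

                // Each entry has the form NAME=value; names are case-insensitive.
                int eq = entry.IndexOf('=');
                if (eq < 0)
                    continue;                           // malformed entry, skip it

                tags.Add(new KeyValuePair<string, string>(
                    entry.Substring(0, eq).ToUpperInvariant(),
                    entry.Substring(eq + 1)));
            }
        }

        return tags;
    }
}
```

A list of pairs is used rather than a dictionary because the same field name (for example ARTIST) may legitimately appear more than once in one file.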
pkfox
post May 4 2014, 09:25
Post #21





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (lvqcl @ May 3 2014, 09:59) *
QUOTE (pkfox @ May 3 2014, 12:18) *
I know the stream header starts with "fLaC" but where is the block type info and block size ?


In METADATA_BLOCK_HEADER (and METADATA_BLOCK_HEADER is the first 32 bits of METADATA_BLOCK).
So you read first 32 bits from a file (they are equal to "fLaC"), then next 32 bits are METADATA_BLOCK_HEADER.

Also note that "All numbers are big-endian coded. All numbers are unsigned unless otherwise specified."


Hello, and thank you for your patience. I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ? I'm using C# at the moment but can use C++ if needs must. Thanks again for your help.


pkfox
post May 4 2014, 09:40
Post #22





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (nu774 @ May 2 2014, 10:37) *
That spec might be difficult to read unless you know how to read formal grammars.
The syntax is defined in a top-down manner.

For example, a table named "STREAM" contains 4 rows ("<32>", "METADATA_BLOCK", "METADATA_BLOCK*", "FRAME+").

This means that a STREAM (= the whole FLAC file) consists of a 32-bit magic number (fLaC), followed by a METADATA_BLOCK, followed by 0 or more METADATA_BLOCKs, followed by 1 or more FRAMEs.
You can read the other tables in the same way.
Each table defines a non-terminal symbol like "STREAM" as a sequence of other symbols like METADATA_BLOCK, in the described order.

* and + are often used to describe repetition (* means 0 or more, + means 1 or more). They should be familiar if you know regular expressions.

I think you're right, I don't understand the document. Are you saying all METADATA_BLOCKs are 32 bits long ?


nu774
post May 4 2014, 11:52
Post #23





Group: Developer
Posts: 529
Joined: 22-November 10
From: Japan
Member No.: 85902



You have to read that doc recursively, top -> bottom, just as if you were a recursive descent parser.

METADATA_BLOCK, which first appears inside the STREAM definition, is defined in its own table (just after the STREAM table), where METADATA_BLOCK is defined as METADATA_BLOCK_HEADER followed by METADATA_BLOCK_DATA.

Since neither METADATA_BLOCK_HEADER nor METADATA_BLOCK_DATA is defined at that point, you have to look for their definitions elsewhere. That is what "recursively" means: you continue this procedure until every unknown element has been resolved to a definition somewhere.

As for METADATA_BLOCK_HEADER, it is described just after the METADATA_BLOCK table, and its elements are all defined without using other undefined elements (1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata).
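
To make the bit layout concrete, here is a small C# sketch that splits the 4 header bytes into those three fields. The type name `MetadataBlockHeader` and its members are invented for this illustration; only the 1/7/24-bit layout comes from the spec.

```csharp
using System;

public struct MetadataBlockHeader
{
    public bool IsLast;     // 1 bit: set on the last metadata block before the audio frames
    public int BlockType;   // 7 bits: 0 = STREAMINFO, 4 = VORBIS_COMMENT, ...
    public int Length;      // 24 bits, big-endian: size of the block data that follows

    // Hypothetical helper: decodes the 4 bytes of a METADATA_BLOCK_HEADER.
    public static MetadataBlockHeader Decode(byte[] b)
    {
        if (b == null || b.Length < 4)
            throw new ArgumentException("A metadata block header is 4 bytes long.");

        return new MetadataBlockHeader
        {
            IsLast = (b[0] & 0x80) != 0,                // highest bit of the first byte
            BlockType = b[0] & 0x7F,                    // remaining 7 bits of the first byte
            Length = (b[1] << 16) | (b[2] << 8) | b[3]  // most significant byte first
        };
    }
}
```

For example, the bytes 0, 0, 0, 34 decode to IsLast = false, BlockType = 0 (STREAMINFO) and Length = 34, meaning the next header starts 34 bytes further on. The bytes 0x84, 0x00, 0x01, 0x2C decode to IsLast = true, BlockType = 4 (VORBIS_COMMENT) and Length = 300.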
lvqcl
post May 4 2014, 12:42
Post #24





Group: Developer
Posts: 3397
Joined: 2-December 07
Member No.: 49183



QUOTE (pkfox @ May 4 2014, 12:25) *
I don't know if I'm reading the files correctly as the data after the "fLaC" marker is very odd looking ( I'm reading 4 bytes at a time into a byte array ) smiley faces and musical notes - do I need to convert these values ?

What did you expect to see - numbers in text format? FLAC is a binary format so you have to interpret these 32 bits correctly:
QUOTE (nu774 @ May 4 2014, 14:52) *
1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata



This post has been edited by lvqcl: May 4 2014, 12:44
pkfox
post May 4 2014, 12:54
Post #25





Group: Members
Posts: 23
Joined: 1-May 14
Member No.: 115884



QUOTE (nu774 @ May 4 2014, 11:52) *
You have to read that doc recursively, top -> bottom, just as if you were a recursive descent parser.

METADATA_BLOCK, which first appears inside the STREAM definition, is defined in its own table (just after the STREAM table), where METADATA_BLOCK is defined as METADATA_BLOCK_HEADER followed by METADATA_BLOCK_DATA.

Since neither METADATA_BLOCK_HEADER nor METADATA_BLOCK_DATA is defined at that point, you have to look for their definitions elsewhere. That is what "recursively" means: you continue this procedure until every unknown element has been resolved to a definition somewhere.

As for METADATA_BLOCK_HEADER, it is described just after the METADATA_BLOCK table, and its elements are all defined without using other undefined elements (1bit flag, followed by 7bit BLOCK_TYPE, followed by 24bit length of the metadata).


Hi there, thought I'd show you what I'm doing in code

CODE
// requires: using System.IO; using System.Text;
public void ProcessTag(string filename)
{
    const int BlockSize = 4;
    int bytesRead = 0;

    UTF8Encoding encoding = new UTF8Encoding();

    // using-blocks close the file even if an exception is thrown.
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    using (BinaryReader br = new BinaryReader(fs))
    {
        byte[] block;

        // ReadBytes never returns null; at end of file it returns a shorter
        // (possibly empty) array, so test the length instead.
        while ((block = br.ReadBytes(BlockSize)).Length == BlockSize)
        {
            bytesRead += BlockSize;

            string s = encoding.GetString(block);
            // on first pass s = "fLaC" as expected
            // I'm only getting the string for debugging purposes.

            // on next pass where I would expect block flag and block type I get these values in the array
            // block[0] = 0, block[1] = 0, block[2] = 0, block[3] = 34
            // Don't know what to do next ?
        }
    }
}


Thanks again for your patience.


