MP3 to OGG
different
post Oct 30 2010, 22:25
Post #1





Group: Members
Posts: 4
Joined: 30-October 10
Member No.: 85081



I spent an hour reading this board before deciding to register an account and post this question. I am an undergrad computer science student working on a project that will convert an MP3 to the OGG format using CUDA C. CUDA C is one of the languages that allows one to run programs on an NVIDIA GPU.

The project that was assigned to me is: given an MP3 file, convert it to the OGG format. I have never had to write an audio program before, but my understanding is that, since both the MP3 and OGG formats are lossy, it is not recommended to convert directly from MP3 to OGG. Anyway, the question I have is how should I start this assignment? I spent some time looking online for algorithms to rewrite to run on the GPU but I couldn't come across any - I just found existing DLL files to use.

Thanks for any input given.
Ouroboros
post Oct 30 2010, 23:16
Post #2





Group: Members
Posts: 291
Joined: 30-May 08
From: UK
Member No.: 53927



QUOTE (different @ Oct 30 2010, 22:25) *
I spent some time looking online for algorithms to rewrite to run on the GPU but I couldn't come across any - I just found existing DLL files to use.

I hope your coding is better than your searching. You are trying to decode mp3 and encode ogg, so why not start by using Google to search for mp3 decoder source code, then ogg encoder source code. I've just done it, it took about 10 seconds, and in both cases the result you want is in the top five.
stephanV
post Oct 30 2010, 23:50
Post #3





Group: Members
Posts: 394
Joined: 6-May 04
Member No.: 13932



Also, other than "just because I can and I get a grade for it" I don't really see the point of this project. If you are going to write something, make it at least somewhat useful. Is there really nothing more interesting to do with CUDA?

This post has been edited by stephanV: Oct 30 2010, 23:51


--------------------
"We cannot win against obsession. They care, we don't. They win."
different
post Oct 31 2010, 01:13
Post #4





Group: Members
Posts: 4
Joined: 30-October 10
Member No.: 85081



@Ouroboros,

Thanks for the search term tip! Actually, I never type in the words "source code" so perhaps that was the key.

@stephanV,

The main point of the project is to learn the advantages/disadvantages of using a GPU to solve problems. The initial part of this project is to work with audio files and, once that is achieved, I am moving to videos. Overall, an open-source program will be written that would allow one to take an MPEG or AVI file and convert it to the OGG format in the least amount of time. To the end user, the decreased time of converting video/audio is the usefulness. If you have any multimedia ideas in mind then I am all ears.
viktor
post Oct 31 2010, 01:29
Post #5





Group: Members
Posts: 297
Joined: 17-November 06
Member No.: 37682



QUOTE (different @ Oct 31 2010, 02:13) *
The main point of the project is to learn the advantages/disadvantages of using a GPU to solve problems. The initial part of this project is to work with audio files and, once that is achieved, I am moving to videos. Overall, an open-source program will be written that would allow one to take an MPEG or AVI file and convert it to the OGG format in the least amount of time. To the end user, the decreased time of converting video/audio is the usefulness. If you have any multimedia ideas in mind then I am all ears.


The main point is that transcoding from one lossy format to another is bad.
db1989
post Oct 31 2010, 09:14
Post #6





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



QUOTE (different @ Oct 30 2010, 22:25) *
my understanding since both the MP3 and OGG formats are lossy that it is not recommended to convert directly from MP3 to OGG.
Just to clarify, since this comes up a lot, there’s no such thing as a “direct” conversion between two different formats. The user may not always be made aware of the process, but the conversion always involves, as Ouroboros said, decoding the source file to uncompressed audio (PCM or a containerised variant thereof such as WAV or AIFF) and sending this to the destination encoder.
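To make "containerised PCM" concrete, here is a minimal sketch in C of the canonical 44-byte RIFF/WAVE header that typically sits in front of raw integer PCM samples. The function name `wav_write_header` and the helper names are made up for illustration, and the sketch assumes plain integer PCM with no extra chunks; real-world WAV files can carry additional chunks that this ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { WAV_HEADER_SIZE = 44 };

/* Store 16-bit and 32-bit values in little-endian byte order. */
static void put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
}

static void put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Hypothetical helper: fill hdr with a canonical RIFF/WAVE header
   describing data_bytes bytes of integer PCM that follow it. */
void wav_write_header(uint8_t hdr[WAV_HEADER_SIZE], uint32_t sample_rate,
                      uint16_t channels, uint16_t bits_per_sample,
                      uint32_t data_bytes)
{
    assert(channels > 0 && bits_per_sample > 0 && bits_per_sample % 8 == 0);

    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));
    uint32_t byte_rate = sample_rate * block_align;

    memcpy(hdr + 0, "RIFF", 4);
    put_u32le(hdr + 4, 36 + data_bytes);  /* bytes remaining after this field */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_u32le(hdr + 16, 16);              /* size of the fmt chunk for plain PCM */
    put_u16le(hdr + 20, 1);               /* format tag 1 = integer PCM */
    put_u16le(hdr + 22, channels);
    put_u32le(hdr + 24, sample_rate);
    put_u32le(hdr + 28, byte_rate);       /* bytes per second of audio */
    put_u16le(hdr + 32, block_align);     /* bytes per sample frame */
    put_u16le(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_u32le(hdr + 40, data_bytes);      /* the raw PCM starts right after */
}
```

Calling `wav_write_header(hdr, 44100, 2, 16, n)` and then writing `n` bytes of interleaved 16-bit samples should give a file most players accept. The point is that the header only describes the samples; the audio itself is untouched PCM, which is what any encoder ultimately consumes.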
[JAZ]
post Oct 31 2010, 10:02
Post #7





Group: Members
Posts: 1787
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



Every time I read about someone who has to do a project for a university degree where the task is comparable to making a commercial program, I wonder if the teacher is (ab)using their students, or if neither the teacher nor the student understands what they are going to do.

So, you want to make an mp3/mp2 audio decoder (MPEG video uses mp2, avi with divx/xvid tend to use mp3, but they can also contain the AC3 track from the original DVD), an MPEG 2 video decoder, an MPEG4 video decoder (Divx/Xvid is basically mpeg4 video), an OGG encoder and then, I guess, a theora or WebM video encoder.

All that in CUDA, and... in how many man hours?

A perceptual encoder (and here we're talking about the good ones) is not like making a .zip compressor. The task you describe implies understanding a lot of concepts, and even then, the task is big. Basically, you want to make FFmpeg (except with not as many encoders/decoders).
Ouroboros
post Oct 31 2010, 13:02
Post #8





Group: Members
Posts: 291
Joined: 30-May 08
From: UK
Member No.: 53927



Perhaps the aim of the exercise is to learn how to use CUDA C, or how to port existing code to CUDA C to run on the GPU. If it is, then the reason for picking what seems to be such a futile application (MP3 to Ogg) is precisely because it is so practically useless that it is unlikely to already exist anywhere on the Internet, so there is little opportunity for just "borrowing" some previously published code, thereby forcing the students to do the porting themselves.

I can remember producing similarly pointless code as an undergraduate, where the purpose wasn't the actual functionality but learning how to use a particular method or set of functions.
[JAZ]
post Oct 31 2010, 13:37
Post #9





Group: Members
Posts: 1787
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



@Ouroboros:
I disagree. There are a lot of transcoders, and more so in video. The MediaCoder frontend is a clear example.
The main use of transcoding is to get a smaller file than the original (MPEG2 2Mbit -> MPEG4 600Kbit, MP3 192 -> AAC 128, etc.). In video, the picture size may also be reduced (example: for use on an iPod Touch or other MP4 players/phones).

The project as it has been described is enormous for a single student as a part of his studies. If he started with existing code, and was more focused, then it might be reasonable.
I don't know if he's studying just computer science, or if the degree is specifically about digital signal theory, where he would have learned the concepts of perceptual codecs, Fourier transforms, windowing and all the DSP and transformation related subjects.

At least when I studied (and that's 7 years ago), I only saw the Fourier transform in an optional course, at the end, and it was never mentioned in relation to digital audio.
(If anyone wonders about uses of the Fourier transform in subjects other than audio, just think of what an ADSL connection is)

This post has been edited by [JAZ]: Oct 31 2010, 13:42
different
post Oct 31 2010, 19:40
Post #10





Group: Members
Posts: 4
Joined: 30-October 10
Member No.: 85081



@db1989:

thanks for clarifying this. I was in the majority that had this misconception.

@[JAZ]:

if this has any value, I really don't know how many man hours this will take me. I have roughly 3 months to do this.

And, to clarify, I am just studying computer science. The only experience I have had with the Fourier transform was in my differential equations classes. Since taking those classes, I have never had to use it.

@Ouroboros:

To [JAZ]'s point, there can actually be some practical value in the project depending on what I'm able to pull off. There is a commercial application called "Badaboom" that claims to "create iPod and PSP video up to 20x faster" via http://badaboomit.com/. As you may have guessed, it gets this performance by using the GPU. So, in other words, with my open source project a user can save "a lot" of time transcoding files.

This post has been edited by different: Oct 31 2010, 20:05
Bullit
post Oct 31 2010, 23:09
Post #11





Group: Members
Posts: 42
Joined: 6-October 10
Member No.: 84390



QUOTE (different @ Oct 31 2010, 20:40) *
So, in other words, with my open source project a user can save "a lot" of time transcoding files.


But who's going to want a mp3 to ogg transcoder?

Regular people (non-techies) just use mp3 as it is. Only techies for the most part like to use ogg (like the people on this forum) and they know better than to transcode files. Regular people also don't use CUDA, and probably have no idea what a GPU is. Also, CUDA is not as popular as it could be, so all this means your target audience is like 12 people...

Have you considered using the new DirectCompute from Microsoft for GPU acceleration? It works on ATi as well as Nvidia cards with at least DX10. And you also have OpenCL which is open platform and works on Mac and Linux too.


If you want to make it useful, I'm going to suggest wav>ogg encoding or another lossless format > ogg. At least people here will have use for it and maybe your project will continue in the community once it hits open source.

If you're going to put in so much effort at least make it something that more people can use.

My $0.02

This post has been edited by Bullit: Oct 31 2010, 23:12
different
post Oct 31 2010, 23:36
Post #12





Group: Members
Posts: 4
Joined: 30-October 10
Member No.: 85081



Hi Bullit,

Thanks for the feedback.

I vaguely remember DirectCompute from when it was announced a while back. I keep my options open so I will look at it again. I remember reading about OpenCL, but unfortunately I need to use CUDA C for the first iteration of my program. However, I will keep my code flexible to allow "quick" support for OpenCL.

Based on what people have said in this thread and additional reading that I have done since, this is a rough-draft of what I will be doing in this order.

- Convert a WAV to OGG using the Vorbis codec where most of the work will be done on the GPU.
- Convert an AVI to OGG using the Theora codec where most of the work will be done on the GPU.

Based on my experience and what I learn implementing these two items, I will then decide what to do next.
Bullit
post Oct 31 2010, 23:48
Post #13





Group: Members
Posts: 42
Joined: 6-October 10
Member No.: 84390



QUOTE (different @ Oct 31 2010, 23:36) *
- Convert a WAV to OGG using the Vorbis codec where most of the work will be done on the GPU.
- Convert an AVI to OGG using the Theora codec where most of the work will be done on the GPU.


Here's the thing about AVI though. It's BIG. Lossless AVI files are huge, and lossless compressed AVI files are still 50% huge.

If I were you, I would consider WAV > OGG simply because the files are smaller and more manageable.

I'm assuming you're using Theora because it's very open source and a lot of documentation is available for it. As a codec, Theora is pretty sucky in terms of quality and compression but it's fast so I guess it's good for streaming content... it's still better than what YouTube is using at least. x264 is miles ahead in quality but I can't comment on whether it's more difficult for you to implement compared to Theora. Both of these options would be more time consuming and more difficult than just audio compression. Video compression is hard stuff.

If you're going with option 2, Doom9.org is a great forum if you're looking for assistance. I know some x264 developers frequent it, but I don't know about the Theora guys. The boys at Doom9 can point you in the right direction whatever video codec you're looking into.


Hope that helps. Best of luck with your project.

This post has been edited by Bullit: Nov 1 2010, 00:07
Kohlrabi
post Nov 1 2010, 15:41
Post #14





Group: Super Moderator
Posts: 1084
Joined: 12-March 05
From: Kiel, Germany
Member No.: 20561



Just a note on the video stuff. There is no such thing as "encoding AVI" or "decoding AVI", since AVI is an old container for multimedia content, just like OGG is the container used prominently for Vorbis audio and/or Theora video. Writing a converter from "AVI" to "OGG Video" would involve developing several decoders for all the stuff that can be in AVI, and a Theora encoder in CUDA C. As mentioned before, this basically means porting FFMPEG to CUDA. I can only agree with the other people here that porting an existing Vorbis encoder to CUDA C is a big enough task, and unless you're extremely good at that you can probably forget about writing your own MP3 decoder in CUDA.

To be frank, the wording of the initial post makes me wonder whether you and especially your advisor know what you're up to here. Still, it's a nice effort to try, and I wish you the best of luck. You should check out the Vorbis reference encoder libvorbis on Xiph's site; it might also be useful to take a look at the Rockbox project for pointers. As people mentioned before, it's probably best to forget about the MP3 decoding step for starters.

This post has been edited by Kohlrabi: Nov 1 2010, 15:54


--------------------
Ceterum censeo Masterdiskem esse delendam.