Ogg Vorbis acceleration project, Is it dead?
RazorBoy143
post Aug 6 2010, 22:25
Post #101





Group: Members
Posts: 28
Joined: 17-July 10
Member No.: 82340



Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's WinXP/SSE/SSE2/foobar2000 compatible. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build to use.

This post has been edited by RazorBoy143: Aug 6 2010, 23:24
Fool_on_the_hill
post Aug 7 2010, 05:50
Post #102





Group: Members
Posts: 88
Joined: 30-October 05
From: Russia, Tomsk
Member No.: 25459



Is it possible to make a universal encoder that could detect which SSE instructions your processor supports?
forart.eu
post Aug 12 2010, 12:28
Post #103





Group: Members
Posts: 74
Joined: 10-December 09
From: italy
Member No.: 75798



Not only that!

DarkWave Studio automatically selects x86/x64 and the correct SSE* instructions to use...
[JAZ]
post Aug 12 2010, 19:27
Post #104





Group: Members
Posts: 1793
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



@forart.eu: I guess you are aware that you cannot use an "if" statement every time you have to decide whether to use one technology or another, right?

By this I mean that for an application to switch properly and automatically (or even manually, via a setting) between using SSE or not, without losing all the gains, the programmer has to write fully independent code paths for those operations ("if" statements in hot code really do slow things down).

Yet this also means a lot of work for the programmer, since he has to do manually what a compiler does automatically for him.
I guess the best way to do this for a multi-file program would be to have different DLLs, each one with the same code but compiled for a different processor (by the compiler). The main program could choose to load one DLL or another depending on where it is run. Anyway, this is not a perfect solution.
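
A minimal sketch of the "decide once, not on every operation" idea discussed above, assuming GCC or Clang on x86 (where `__builtin_cpu_init` and `__builtin_cpu_supports` are available). Everything else here is hypothetical and only for illustration: the names `scale_fn`, `scale_generic`, `scale_sse3`, `cpu_has_sse3` and `select_scale` do not come from oggenc2 or Lancer, and the "SSE3" routine is just a stand-in with the same body as the generic one. On other compilers the sketch simply falls back to the generic path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature: scale n floats in place by gain. */
typedef void (*scale_fn)(float *data, size_t n, float gain);

/* Portable fallback path. */
static void scale_generic(float *data, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= gain;
}

/* Stand-in for an SSE3-specific path. A real build would put hand-written
 * or separately compiled SSE3 code here; this body is only a placeholder. */
static void scale_sse3(float *data, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= gain;
}

/* Returns 1 if the running CPU reports SSE3, 0 otherwise.
 * Assumption: GCC/Clang builtins; other compilers get the fallback. */
static int cpu_has_sse3(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse3") ? 1 : 0;
#else
    return 0;
#endif
}

/* Pick the code path once; callers keep the pointer and call through it,
 * so no feature test is repeated inside the hot loop. */
scale_fn select_scale(void)
{
    scale_fn fn = cpu_has_sse3() ? scale_sse3 : scale_generic;
    assert(fn != NULL);
    return fn;
}
```

The caller would run `select_scale()` once at start-up and reuse the returned pointer for every block of samples. Loading one of several DLLs, as suggested above, is the same idea applied to a whole library instead of a single function.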


Also, when you throw in x86/x64, are you talking about an installer or an application? If it is an installer, the point is moot, since here we were talking about an executable program.

The only program that I've seen doing a sort of "universal binary" for x86 and x64 is Microsoft's (or Mark Russinovich's) Process Explorer. Version 11 of this software, when run on an x64 machine, extracts an x64 binary from itself and then executes that file (the downloaded file is 3.7 MB; the x64 file is 950 KB). So if you add it up, there's clearly more space used than with just two separate files. It just makes things handier when you're talking about small files.
galacticninja
post Aug 14 2010, 15:21
Post #105





Group: Members
Posts: 6
Joined: 31-August 09
Member No.: 72778



QUOTE (RazorBoy143 @ Aug 7 2010, 05:25) *
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's WinXP/SSE/SSE2/foobar2000 compatible. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build to use.

Try the SSE2 compile here: (oggenc2.7z - http://www.hydrogenaudio.org/forums/index....mp;#entry668288 ; aoTuV beta 5.7 vorbis encoder with some parts of the Lancer project). I have been able to use this with dBpoweramp on a Windows XP computer whose processor supports only up to SSE2.

Also, on checking dBpoweramp's website, it seems that they've updated the available Ogg Vorbis Lancer encoder installer for dBpoweramp to the b5.7 2009-03-03 version, from the previous b5 2006-10-24 version they had on the site a few months ago ( http://forum.dbpoweramp.com/showthread.php?t=18713 ). I do not know the source or compiler of this one, though.

This post has been edited by galacticninja: Aug 14 2010, 15:22
AshenTech
post Aug 17 2010, 17:55
Post #106





Group: Members
Posts: 78
Joined: 11-November 08
Member No.: 62144



QUOTE (RazorBoy143 @ Aug 6 2010, 15:06) *
QUOTE (AshenTech @ Aug 4 2010, 15:10) *
QUOTE (RazorBoy143 @ Aug 3 2010, 12:55) *
QUOTE (john33 @ Jul 24 2010, 00:28) *
QUOTE (Steve Forte Rio @ Jul 24 2010, 11:18) *
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip :)


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong with it" message when I tried to use it in foobar2000 :-(


try compat mode and set it to vista, see if that helps.


I don't understand. Could you be more specific?

There are two ways to do this; they're described here:

http://www.sevenforums.com/tutorials/316-c...ility-mode.html

http://lifehacker.com/5466628/learn-to-use...with-older-apps

hope this helps.
demi
post Aug 19 2010, 04:46
Post #107





Group: Members
Posts: 1
Joined: 18-August 10
Member No.: 83164



I tried your build.

TEST SETUP:
CPU: AMD Athlon II X4 (208*14)
OS: Win7 64bit
Encoder: BS; (LancerMod [20100720](SSE3) based on aoTuV b5d [20090301])

I converted FLAC to Ogg at q5. It seems to generate correct files, but CPU usage is abnormal.
I ran 4 encoders simultaneously, and each process consumed around 5% of CPU time.
So the 4 processes consumed just 20% of CPU time; 80% was free.
It looks like a lack of I/O performance, but I don't think that's the cause, because I tested on an otherwise idle 7200 rpm HDD.
The SSE2 version also has this problem.

In addition, I tested another encoder, 'BS; (LancerMod [20091214](SSE3) based on aoTuV b5d [20090301])'.
It works great and is faster than john's earlier build. Peak speed is up to 150x, fantastic!

I hope this helps john's work :)

QUOTE (john33 @ Jul 20 2010, 20:32) *
Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip

I have to say that for standard length song tracks, ie., approx. 4 mins, there seems to be negligible speed difference between them on a q6600 @ 3.2GHz and 8GB DDR2 although any difference will no doubt be more apparent on a longer encoding exercise.

Feedback and experience with these would be welcome.

TIA.


This post has been edited by demi: Aug 19 2010, 05:10
Steve Forte Rio
post Sep 16 2010, 20:45
Post #108





Group: Members
Posts: 456
Joined: 4-October 08
From: Ukraine
Member No.: 59301



I've just tried the accelerated oggenc on my new Core i3. Here are some brief results:

Oggenc2.85 using aoTuVb5.7 P4 version - 36.79x
oggenc2.85-aoTuVb5.7-Lancer - 58.14x

Windows 7 x32, Core i3 530 @ 2.94GHz, 2x2 GB DDR3-1333

Great speedup, thanks for your work :)


P.S. Maybe this is a stupid question, but is it possible to use the SSE4.1/4.2 optimizations that are available on the latest Intel CPUs?

This post has been edited by Steve Forte Rio: Sep 16 2010, 20:46
AlexDDR
post Oct 7 2010, 23:29
Post #109





Group: Members
Posts: 16
Joined: 11-February 10
Member No.: 78064



Is there a version of aoTuV b5.7 (oggenc or vorbis.dll) with SSE3 and MT (multithreading)? I can only seem to find the normal version.
IgorC
post Jan 2 2011, 21:39
Post #110





Group: Members
Posts: 1577
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933

lvqcl builds have no issues.
punkrockdude
post Jan 22 2011, 22:14
Post #111





Group: Members
Posts: 256
Joined: 21-February 05
Member No.: 20022



I would love an updated, enhanced Ogg encoder too, with the latest libogg and all that, plus SSE3 and SSE4. Even better would be a multicore SSE4 version. Regards
lvqcl
post Mar 5 2011, 15:52
Post #112





Group: Developer
Posts: 3418
Joined: 2-December 07
Member No.: 49183



QUOTE (IgorC @ Jan 2 2011, 23:39) *
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933

lvqcl builds have no issues.

Note: this issue can be fixed if the optimization level for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

(I wonder why algorithms in this file are so sensitive to optimizations made by ICC)

This post has been edited by lvqcl: Mar 5 2011, 15:55
lvqcl
post Mar 7 2011, 14:47
Post #113





Group: Developer
Posts: 3418
Joined: 2-December 07
Member No.: 49183



QUOTE (lvqcl @ Mar 5 2011, 17:52) *
Note: this issue can be fixed if the optimization level for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

Note2: the problem was in the code
CODE
    e->mdct_win[i]=sin(i/(n-1.)*M_PI);
    e->mdct_win[i]*=e->mdct_win[i];

At its highest optimization level, ICC doesn't generate code for the second line. Replacing it with the following code solves the problem:

CODE
    float t = sin(i/(n-1.)*M_PI);
    e->mdct_win[i] = t*t;


This post has been edited by lvqcl: Mar 7 2011, 14:52
robert
post Mar 8 2011, 10:19
Post #114


LAME developer


Group: Developer
Posts: 788
Joined: 22-September 01
Member No.: 5



Is this a known bug of the Intel compiler? Did it print some warnings about unsafe optimizations used, so that one has the chance to see the problem coming? I guess, I'll have to do some code review of LAME, looking out for similar potential problems.
lvqcl
post Mar 8 2011, 11:05
Post #115





Group: Developer
Posts: 3418
Joined: 2-December 07
Member No.: 49183



QUOTE
Did it print some warnings about unsafe optimizations used

Don't see any.

But I also noticed that the "Interprocedural Optimization" option was set to Multi-File (/Qipo). Changing this option for envelope.c to Single-File (/Qip) solves this problem, too.

icl.exe: Version 12.0.2.154 Build 20110112

Added: http://software.intel.com/en-us/forums/sho...ead.php?t=62095 -- "Bug in Intel C++ compiler when using option /Qipo ... Intel C++ v11.0.066"


Added [20110505]: The bug still exists in Intel® C++ Composer XE 2011 Update 3 (icl.exe Version 12.0.3.175 Build 20110309)

This post has been edited by lvqcl: May 7 2011, 18:17
Isayama
post Feb 5 2012, 20:05
Post #116





Group: Members
Posts: 7
Joined: 5-February 12
Member No.: 96948



Hi everyone,

Sorry for bumping this nearly one-year-old thread, but I was wondering, in relation to this thread (aoTuV beta 6.02): where can I download the fastest (and latest) SSE3 (or SSE2, or even SSE4.1) 64-bit version of accelerated oggenc2? lvqcl's one is not online anymore (from this post). Has any progress been made in that area?

Thanks anyway for all those interesting discussions.
lvqcl
post Feb 5 2012, 21:53
Post #117





Group: Developer
Posts: 3418
Joined: 2-December 07
Member No.: 49183



AoTuV b6.03 compiled with ICC 12.1: Attached File  oggenc2_ICC12.1.7z ( 689.38K ) Number of downloads: 536

32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

Attached File  sources_.7z ( 356.05K ) Number of downloads: 226


This post has been edited by lvqcl: Feb 6 2012, 19:00
skamp
post Feb 6 2012, 01:10
Post #118





Group: Developer
Posts: 1450
Joined: 4-May 04
From: France
Member No.: 13875



Thank you very much for the updated binaries. With the Win64 SSE3 binary under Linux with Wine, I get 59x, versus 37x with my natively compiled aoTuV binary.

This post has been edited by skamp: Feb 6 2012, 01:20


--------------------
See my profile for measurements, tools and recommendations.
forart.eu
post Feb 7 2012, 10:09
Post #119





Group: Members
Posts: 74
Joined: 10-December 09
From: italy
Member No.: 75798



I don't know if this helps in any way, but here's a reply from Eric Gur (Processor Client Application Engineer @ Intel Corp.) to my message about an MT library:

QUOTE
For threading I recommend using Intel's free TBB library. It's very fast, cross platform, simple to use and has an important feature - malloc replacement.
I used it in a previous project - 1M lines of code, multithreaded application on Linux x64. Just the malloc replacement boosted performance by 3x without changing any code (1 line in the makefile).


BTW, there are a number of malloc replacements available, including this and one from Google...

This post has been edited by forart.eu: Feb 7 2012, 10:54
Isayama
post Feb 8 2012, 16:08
Post #120





Group: Members
Posts: 7
Joined: 5-February 12
Member No.: 96948



QUOTE (lvqcl @ Feb 5 2012, 21:53) *
AoTuV b6.03 compiled with ICC 12.1

Many thanks, lvqcl, for your quick and effective answer! I don't know anything about compiling yet, but I think I'll give it a shot... I've seen that you gave your optimization options in another thread, so I'll start with that :-)
Sorry to ask, but is there any storage site or FTP server where you upload your compiles, or do you do it on an on-demand basis? :-)

Thanks again anyway; now I can encode an album to Ogg in no time, which was kind of a problem so far.
All the best, and cheers for the help!
lvqcl
post Feb 20 2012, 17:37
Post #121





Group: Developer
Posts: 3418
Joined: 2-December 07
Member No.: 49183



TWIMC -- aoTuV b5.7 compiled with ICC 12.1. Attached File  oggenc2_ICC12.1_aotuv57.7z ( 674.91K ) Number of downloads: 250
vinnie97
post Mar 28 2012, 22:06
Post #122





Group: Members
Posts: 472
Joined: 6-March 03
Member No.: 5360



Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit). I receive an "Unsupported Compression Method" error when attempting to decompress. Any clues?
saratoga
post Mar 28 2012, 23:23
Post #123





Group: Members
Posts: 5054
Joined: 2-September 02
Member No.: 3264



QUOTE (vinnie97 @ Mar 28 2012, 17:06) *
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit). I receive an "Unsupported Compression Method" error when attempting to decompress. Any clues?


Are you running a version of 7-Zip older than 9.04? If so, update, as that's when LZMA2 support was added.
OggY68
post Apr 5 2012, 20:51
Post #124





Group: Members
Posts: 8
Joined: 8-July 09
From: Brussels
Member No.: 71300



QUOTE (lvqcl @ Feb 5 2012, 21:53) *
AoTuV b6.03 compiled with ICC 12.1: Attached File  oggenc2_ICC12.1.7z ( 689.38K ) Number of downloads: 536

32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

Attached File  sources_.7z ( 356.05K ) Number of downloads: 226


Thanks. I managed to compile your sources with GCC, with SSE3 acceleration, as shared libraries (libvorbis, libvorbisenc and libvorbisfile) natively on my Linux box. Encoding a CD to Ogg now takes 30 seconds less on my old Athlon X2 4600XP. Unfortunately, oggdec and ogg123 can no longer decode with the new libvorbisfile library. After ov_raw_seek is called, the programs exit with a "Segmentation Fault". After this function is called for the first time, the data members of vorbis_info such as mode, rate, ... contain only junk numbers like "-1223863434", hence the crash...

This post has been edited by OggY68: Apr 5 2012, 20:51
LigH
post Apr 28 2012, 12:55
Post #125





Group: Members
Posts: 162
Joined: 20-November 01
Member No.: 503



I am simply amazed by the speed gain!

I compared different encoders and, for Ogg Vorbis specifically, several builds, encoding a whole CD image (697 MB) on an AMD Phenom II X6 1045T at 2700 MHz. Times were taken with ptime; best of 3 consecutive runs.



Germany uses a decimal comma. Basic oggenc2 builds are from RareWares.

That shouldn't leave much doubt that Ogg Vorbis, fine-tuned by Lancer, is now probably the most efficient audio encoder in practice, weighing quality efficiency against speed efficiency. The FhG AAC encoder is close, but lacks bitrate-tuning flexibility (there is quite a large gap between VBR presets 4 and 5, which target 128 and 192 kbps).

This post has been edited by LigH: Apr 28 2012, 13:08


--------------------
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum