Are there any x64 builds of LAME?
PsychicHigh
post Aug 9 2006, 05:26
Post #1





Group: Members
Posts: 1
Joined: 9-August 06
Member No.: 33829



Just wondering if there are any x64 builds of LAME around, preferably of the recommended beta or else of the previous recommended version.

Thanks in advance.
Gabriel
post Aug 9 2006, 08:18
Post #2


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



Win64 support was added last week to the VC8 projects in CVS, but only for the 3.98 alpha versions.
twist3d
post Aug 9 2006, 09:52
Post #3





Group: Members
Posts: 11
Joined: 1-November 05
Member No.: 25488



Actually, someone has built a 64-bit version (I don't know if it works, as I'm still using regular Windows XP).
http://okejl.dk/dunstan/ has 64-bit builds of various audio and video tools.
Maurits
post Aug 9 2006, 10:10
Post #4





Group: Members
Posts: 392
Joined: 30-September 05
From: London, Europe
Member No.: 24805



Are there any advantages of using 64bits over 32bits? Just wondering...
cabbagerat
post Aug 9 2006, 14:10
Post #5





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE (Maurits @ Aug 9 2006, 01:10) *
Are there any advantages of using 64bits over 32bits? Just wondering...
Not that I can see - on Linux (with both GCC 3.3 and 4.1) the 32 bit version without MMX is about 10% faster than the 64 bit version without MMX. I would put this performance difference down to a higher number of cache misses with the 64bit version (some loops stop fitting in cache because the code is bigger).

The picture might be different on Windows, but I don't expect it to change much. I would stick to the 32bit version for the time being.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
Gabriel
post Aug 9 2006, 14:25
Post #6


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



When comparing those two VC8 builds:

*32bits, Nasm and intrinsics optims
*64bits, intrinsics optims

then the 64bits one is about 10% faster (AMD processor).
robert
post Aug 9 2006, 14:27
Post #7


LAME developer


Group: Developer
Posts: 788
Joined: 22-September 01
Member No.: 5



do both compiles use the same SSE extensions?
Gabriel
post Aug 9 2006, 15:30
Post #8


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



No, the x86 build is using Nasm optims (mmx and 3dnow) AND intrinsics (sse), while the x64 build is only using intrinsics (sse).

So in this case, the x64 version is using fewer "hand-made" optims but is still faster.
Of course, it is not magic. It is because of optimizations made by the compiler (additional registers, and perhaps "heavy" SSE/SSE2 use).

This post has been edited by Gabriel: Aug 9 2006, 15:32
robert
post Aug 9 2006, 15:39
Post #9


LAME developer


Group: Developer
Posts: 788
Joined: 22-September 01
Member No.: 5



QUOTE (Gabriel @ Aug 9 2006, 16:30) *
So in this case, the x64 version is using fewer "hand-made" optims but is still faster.
Of course, it is not magic. It is because of optimizations made by the compiler (additional registers, and perhaps "heavy" SSE/SSE2 use).

If VC8 compiles for the x64 architecture, does that imply SSE2? If so, you could compare it with an x86 build compiled with SSE2 enabled.
bubka
post Aug 9 2006, 16:02
Post #10





Group: Members
Posts: 239
Joined: 21-July 02
Member No.: 2692



is there a 32bit build with say SSE2?


--------------------
Chaintech AV-710
Gabriel
post Aug 9 2006, 16:05
Post #11


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



I think that x64 implies availability of SSE/SSE2.
It is likely that VC8 used some SSE/SSE2 code, but this is only a supposition, as both compiles are using default settings of VC8.

Of course, more tests are needed to reach a real conclusion.
Garf
post Aug 9 2006, 16:42
Post #12


Server Admin


Group: Admin
Posts: 4886
Joined: 24-September 01
Member No.: 13



32 bit builds will not use SSE/SSE2 unless specifically indicated in the settings

64 bit builds *need* to use SSE/SSE2 since it is the only supported mode of floating point operation
Gabriel
post Aug 9 2006, 16:49
Post #13


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



QUOTE
32 bit builds will not use SSE/SSE2 unless specifically indicated in the settings

Sure

QUOTE
64 bit builds *need* to use SSE/SSE2 since it is the only supported mode of floating point operation

This is a programmer's urban legend. The fpu is still there, and still usable in x64.
Is VC8 using SSE/SSE2 by default under x64? Very likely. But it might also be using a mix of SSE and x87.
Gabriel
post Aug 9 2006, 22:17
Post #14


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



I uploaded VC8 32 and 64 builds there:
http://gabriel.mp3-tech.org/lame/x64/

Strange thing is that enabling sse2 decreases speed of the x86 compilation.
bubka
post Aug 9 2006, 22:50
Post #15





Group: Members
Posts: 239
Joined: 21-July 02
Member No.: 2692



damn, but thanks, not that speed is that big of a deal


--------------------
Chaintech AV-710
chelgrian
post Aug 10 2006, 00:20
Post #16





Group: Members
Posts: 316
Joined: 27-April 03
Member No.: 6228



QUOTE (Gabriel @ Aug 9 2006, 16:49) *
This is a programmer's urban legend. The fpu is still there, and still usable in x64.
Is VC8 using SSE/SSE2 by default under x64? Very likely. But it might also be using a mix of SSE and x87.


The MS documentation explicitly states that the only FP instructions available in long mode are SSE[1|2|3]. If the compiler is generating x87 in long mode then that is a bug.

The MS compiler has been using 64-bit IEEE floating point by default for years rather than 80-bit. However, it doesn't generate SSE for 32-bit code unless told to explicitly, since code targeted for "blend" must still run on PII-class processors, which don't have SSE.

The FPU may be there, and you may even be able to execute x87 instructions in long mode on K8, P4 or Conroe processors. However, this is not the same as being specified as part of the instruction set. The x87 opcode space, along with MMX and 3DNow, is UNDEFINED when in long mode. Future AMD64 processors may redefine this opcode space for new instructions.

Using x87, MMX or 3DNow in long mode is therefore hideously bad, as your binaries may not work correctly under future processors which implement AMD64.

Having said that MS did change the NT kernel to save and restore the x87 registers across context switches before they released the AMD64 version of NT. Presumably they have user space code which breaks the specification and uses x87 in long mode so they had to bodge things. NT does not preserve them for kernel only context switches so using x87 in a device driver under NT for AMD64 would be a quick road to a blue screen of death.
Gabriel
post Aug 10 2006, 08:33
Post #17


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



x87, mmx and even 3dnow are still there.
(see "AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions")

x87 context is preserved in user space (sure), but I am unsure about kernel space. Anyway, the OS would have to blank the x87 registers when switching context if it did not preserve them, so it doesn't cost much more to save and restore them properly. (Neither saving nor blanking them would lead to a security issue.)

Regarding VC8, I found the answer:
QUOTE
The feature to see the legacy x87 registers for x64 applications is not available in VS2005. The main reason for this is that originally the x87 registers were not available for 64bit applications at all. We later did make them available, but only accessible through MASM.

Due to the late addition of the support for x87 and limited use cases, we weren't able to get it into the VS2005 product.

So to answer your question, you can't see them in the VS2005 debugger. You can use the 64bit version of WinDbg though, if that helps.

Thanks for your feedback.

Kang Su Gatlin
Visual C++ Program Manager

So VC8 does not generate x87 code on its own.

This post has been edited by Gabriel: Aug 10 2006, 08:34
cabbagerat
post Aug 10 2006, 12:39
Post #18





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



Interesting results on the performance. I wonder why gcc does so badly with 64 bit code?
QUOTE (Gabriel @ Aug 9 2006, 23:33) *
x87, mmx and even 3dnow are still there.
(see "AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions")
From Volume 1:
QUOTE
x87 floating-point instructions can be executed in any of the architecture's operating modes. Existing x87 binary programs run in legacy and compatibility modes without modification.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
kjoonlee
post Aug 10 2006, 12:46
Post #19





Group: Members
Posts: 2526
Joined: 25-July 02
From: South Korea
Member No.: 2782



I wanted to post about this, but I didn't because I didn't have any benchmark numbers to back up my claims.

One of the first things I did when I got an AMD64 box was to test ~/bin/lame, /usr/bin/oggenc, etc etc.

For me, both lame and oggenc were faster as a 64bit build than a 32bit build, IIRC.

edit: And yes, I used gcc. Can't remember which version.

This post has been edited by kjoonlee: Aug 10 2006, 12:48


--------------------
http://blacksun.ivyro.net/vorbis/vorbisfaq.htm
Agitator
post Sep 17 2006, 13:34
Post #20





Group: Members
Posts: 61
Joined: 11-December 03
From: Seattle, WA
Member No.: 10359



QUOTE (Gabriel @ Aug 9 2006, 13:17) *
I uploaded VC8 32 and 64 builds there:
http://gabriel.mp3-tech.org/lame/x64/

Strange thing is that enabling sse2 decreases speed of the x86 compilation.

Hi.

I just tested lame32 and lame64 on my Athlon64 3700+, Windows XP x64 SP1.
LAME 32bits version 3.98 (alpha 6, Jul 30 2006 11:33:12)
LAME 64bits version 3.98 (alpha 6, Aug 9 2006 22:37:22)

Track tested: David Bowie - The Man Who Sold The World (4:00)
32-bit, -V 0 --vbr-new: 22 secs, 8 745 963 bytes
64-bit, -V 0 --vbr-new: 19 secs, 8 746 172 bytes
32-bit, -V 6 --vbr-new: 20 secs, 4 541 454 bytes
64-bit, -V 6 --vbr-new: 16 secs, 4 541 243 bytes
32-bit, -b 320: 23 secs, 9 602 975 bytes
64-bit, -b 320: 20 secs, 9 602 975 bytes

The 64-bit build is about 16% faster than the 32-bit one, but how come there is a difference in file sizes when using VBR? I haven't ABX'ed the resulting files, so I don't know if the difference is audible.
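
For anyone who wants to redo the arithmetic, here is a small Python sketch. The timings and file sizes are copied from the list above; the names `runs`, `time_saved` and `throughput_gain` are made up for illustration. Note that "faster" can be read two ways (less time taken, or more audio encoded per second), and the two readings give different percentages.

```python
# Timings (seconds) and file sizes (bytes) copied from the post above.
# The names used here are illustrative only.
runs = {
    "-V 0 --vbr-new": {"t32": 22, "t64": 19, "size32": 8745963, "size64": 8746172},
    "-V 6 --vbr-new": {"t32": 20, "t64": 16, "size32": 4541454, "size64": 4541243},
    "-b 320":         {"t32": 23, "t64": 20, "size32": 9602975, "size64": 9602975},
}


def time_saved(t32, t64):
    """Fraction of the 32-bit encoding time saved by the 64-bit build."""
    return (t32 - t64) / t32


def throughput_gain(t32, t64):
    """Relative increase in encoding throughput of the 64-bit build."""
    return t32 / t64 - 1.0


for name, r in runs.items():
    print(f"{name:15s} "
          f"time saved {time_saved(r['t32'], r['t64']):6.1%}  "
          f"throughput gain {throughput_gain(r['t32'], r['t64']):6.1%}  "
          f"size difference {r['size64'] - r['size32']:+d} bytes")

# Average of the per-setting time savings.
mean_saved = sum(time_saved(r["t32"], r["t64"]) for r in runs.values()) / len(runs)
print(f"mean time saved: {mean_saved:.1%}")
```

With whole-second timings the per-setting figures are coarse: the time saved ranges from roughly 13% to 20% and averages a little under 16%. The file sizes differ only for the two VBR settings.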


--------------------
/Agitator
cabbagerat
post Sep 17 2006, 14:00
Post #21





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE (Agitator @ Sep 17 2006, 04:34) *
The 64-bit build is about 16% faster than the 32-bit one, but how come there is a difference in file sizes when using VBR? I haven't ABX'ed the resulting files, so I don't know if the difference is audible.
That is fairly interesting. The reason is that the different compilers seem to emit floating point/SSE calculations in different orders. You might know this already, but with floating point the order of the calculations can change the result. For example, a*b+a*c is often not exactly equal to a*(b+c).
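
A quick way to see this effect is the Python sketch below. It is only an illustration of rounding in double precision, not a reproduction of what LAME or either compiler does; the function name `distributive_mismatches` and the choice of random test values are assumptions made for the example.

```python
import random


def distributive_mismatches(trials=10000, seed=0):
    """Count random triples where a*b + a*c != a*(b + c) in double precision."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        if a * b + a * c != a * (b + c):
            mismatches += 1
    return mismatches


# Regrouping a plain sum is already enough to change the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)

# How often the two mathematically equal expressions disagree.
print(distributive_mismatches(), "of 10000 random triples differ")
```

The differences are in the last bits of each value, but an encoder makes many threshold decisions on such values, so a few of them can fall the other way and change the output size slightly.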

An ABX test would be interesting. I got a null result from a test that I did with lame compiled with different versions of GCC earlier this year.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
punkrockdude
post Mar 7 2010, 23:17
Post #22





Group: Members
Posts: 256
Joined: 21-February 05
Member No.: 20022



Does anyone have the time and interest to make x64 builds of LAME v3.98.3, both the .exe and the DLL? I ask because some x64 programs I use need lame.dll in their folder, and an x86 build won't work there. Reaper, for example. Regards
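
To check whether a given DLL is an x86 or an x64 build, one option is to read the machine field from its PE header. The Python sketch below assumes a well-formed PE file; the function names and the DLL name in the comment are only examples, while the header layout (offset 0x3C, the "PE\0\0" signature, and the machine values 0x014C and 0x8664) comes from the PE/COFF format.

```python
import struct

# Machine values from the PE/COFF header: i386 and AMD64.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def pe_machine_from_bytes(data):
    """Return 'x86', 'x64', or a hex string for other machine types."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file (missing MZ header)")
    # Offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 16-bit Machine field directly follows the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def pe_machine(path):
    """Read a file from disk and report its machine type."""
    with open(path, "rb") as f:
        return pe_machine_from_bytes(f.read())


# Example call (the file name is hypothetical):
# print(pe_machine("lame_enc.dll"))
```

A 64-bit host process can only load a DLL whose machine type is x64, which is why an x86 lame.dll is rejected.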
punkrockdude
post Mar 7 2010, 23:20
Post #23





Group: Members
Posts: 256
Joined: 21-February 05
Member No.: 20022



This is in the change log of Reaper 3.35:
x64: now requires libmp3lame.dll or lame_enc64.dll (old x64 lame_enc.dll was broken)
