Computer Chess Club Archives



Subject: Re: assembly--not really that fast

Author: Ed Schröder

Date: 13:54:41 01/14/02


On January 14, 2002 at 13:38:06, Eugene Nalimov wrote:

>On January 14, 2002 at 04:16:54, Ed Schröder wrote:
>
>>On January 13, 2002 at 23:36:19, Eugene Nalimov wrote:
>>
>>>Can you please send me the function that was so badly compiled (probably via
>>>e-mail)? I'd like to find where VC screwed up. It's too late to fix it for VC7,
>>>but probably we can do it for VC7.x.
>>>
>>>Eugene
>>
>>
>>"Screwed up" is a big word; ASM being just 30% faster than C is a very good
>>performance, I would say. From memory I recall the following cases:
>>
>>#1. a=b; c=d;
>>
>>The compiler will output something like:
>>
>>mov  EAX,b
>>mov  a,EAX
>>mov  EAX,d
>>mov  c,EAX
>>
>>Whereas it should generate:
>>
>>mov  EAX,b
>>mov  EBX,d
>>mov  a,EAX
>>mov  c,EBX
>
>---- File c1.c:
>
>int a, b, c, d;
>
>void foo (void)
>{
>	a = b; c = d;
>}
>
>---- File c1.asm (compiled with "cl /Ox /Fa c1.c")
>
>[Some assembly stuff deleted]
>
>_foo	PROC NEAR
>; File c:\repro\c1.c
>; Line 5
>	mov	eax, DWORD PTR _b
>	mov	ecx, DWORD PTR _d
>	mov	DWORD PTR _a, eax
>	mov	DWORD PTR _c, ecx
>; Line 6
>	ret	0
>_foo	ENDP


That's good.

Very well, but can the compiler for instance recognize:

a = b; x=1; c = d;

and still schedule the instructions well for the pipeline?

The combinations are endless of course.
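
A minimal test file for this case, modeled on c1.c above, might look like the sketch below. The file name c2.c and the variable x are assumptions made for illustration. Compiling it the same way ("cl /Ox /Fa c2.c") and reading the listing would show how the three statements are scheduled; no particular output is claimed here.

```c
/* c2.c: hypothetical follow-up to c1.c; the variable x is an assumption. */
int a, b, c, d, x;

void foo(void)
{
    /* Three independent assignments to distinct globals. Nothing here
       aliases, so a compiler is free to issue both loads (b and d)
       before any of the three stores. */
    a = b;
    x = 1;
    c = d;
}
```

The interesting question is whether the constant store to x ends up between the two load/store pairs or whether both loads are still hoisted to the top.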



>>#2. Always these unavoidable MOVSX and MOVZX instructions. No compiler can
>>optimize them away because it is impossible; only the ASM programmer knows what
>>is allowed under the circumstances.
>
>Sometimes you can use C casts to avoid those... But yes, here assembly
>programmer is definitely better.
>
>>#3. Register use, same story as (2). I for instance use EBP and even ESP when I
>>am short on registers.
>
>VC, of course, uses EBP when it decides it's beneficial.
>
>>#4. "char" use in MSVC, for instance: char a1,a2,a3,a4,a5,a6,a7,a8;
>>
>>Will NOT produce the 8 characters as a sequential memory block. So in case I
>>want to zero the 8 bytes I will be forced to write 8 instructions. Some other
>>compilers do generate a sequential memory block so you can redefine a1 and a5 as
>>32-bit and with 2 instructions zero them. This is pretty crucial in a chess
>>program, at least in mine, also because I have to "stack" a lot of data when going
>>one ply deeper in the tree or when climbing back.
>
>Never, never, do that on PIII and especially on P4. For the detailed explanation
>look, for example, at "Intel Pentium 4 and Intel Xeon Processor Optimization
>Reference Manual", Section 1-22 "Store Forwarding".

Sounds alarming; my program is polluted with this kind of juicy ASM trick. Do
you think it is a problem in ASM code too? And do you perhaps have a URL for the
documentation at hand?
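
For the C side of case #4, a minimal sketch (the names ply_flags and clear_flags are hypothetical): C gives no guarantee about how separately declared variables are laid out, but the elements of an array are always contiguous. Grouping the eight bytes in an array lets them be cleared in one call, without redefining a1 and a5 as 32-bit values. Which store instructions the compiler emits for the memset is up to the compiler.

```c
#include <string.h>

/* Hypothetical container for the eight per-ply byte flags from case #4.
   Array elements are contiguous by definition, unlike eight separately
   declared char variables. */
struct ply_flags {
    unsigned char v[8];
};

/* Zero all eight bytes in one call; the compiler chooses the store width. */
void clear_flags(struct ply_flags *f)
{
    memset(f->v, 0, sizeof f->v);
}
```

The individual flags are then read and written as f.v[0] .. f.v[7] with byte-sized accesses.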

Ed


>Eugene
>
>>#5. Special stuff that no compiler is able to recognize, as only the ASM
>>programmer knows it. I recently posted an example of how to use the "indirect
>>jump" the processor offers, for instance when generating moves.
>>
>>So it is not about bugs; it is more about why no compiler will ever be able to
>>beat an experienced ASM programmer. However, I do think that there is room for
>>improvement in cases (1) and (4), maybe even in (3).
>>
>>Ed
>>
>>
>>
>>
>>>On January 13, 2002 at 18:51:02, Ed Schröder wrote:
>>>
>>>>On January 13, 2002 at 16:29:21, Tom Kerrigan wrote:
>>>>
>>>>>On January 13, 2002 at 07:05:02, Ed Schröder wrote:
>>>>>
>>>>>>I have to disagree; I have an MSVC6 version of Rebel and it runs 30% slower than
>>>>>>the ASM version.
>>>>>
>>>>>What do you attribute this difference to? Is it simply not possible to write C
>>>>>that produces the same assembly as your hand-written code? Or do you take
>>>>>certain liberties in the C code (perhaps in the name of readability?) that are
>>>>>slowing things down?
>>>>>
>>>>>-Tom
>>>>
>>>>Just have a look at the ASM code MSVC6 produces; it is often bad stuff. By
>>>>re-writing (optimizing) this "bad ASM stuff" I got my +30%.
>>>>
>>>>One ambiguous remark: don't believe everything commercials are telling you :)
>>>>
>>>>Ed


