Computer Chess Club Archives



Subject: Re: Pawn hashing without Zobrist keys

Author: Gerd Isenberg

Date: 06:19:18 09/14/03



On September 14, 2003 at 07:06:19, Vincent Diepeveen wrote:

>On September 14, 2003 at 04:48:48, Gerd Isenberg wrote:
>
>><snip>
>>>>>I seriously doubt Gerd is aware how slow bitboards are and how difficult they
>>>>>are to use in complex software to improve for example ones evaluation function.
>>>>>
>>>>>Also i seriously doubt Gerd knows the penalty for % at the opteron.
>>>>>
>>>>>Regards,
>>>>>Vincent
>>>>
>>>>Vincent! You are playing games with me :-)
>>>>I seriously doubt that you read my initial post.
>>>>
>>>>About opterons %:
>>>>
>>>>Athlon
>>>>DIV  mreg16/32    VectorPath 24/40
>>>>MUL  mreg32       VectorPath 6
>>>>
>>>>Opteron
>>>>DIV  mreg16/32/64 VectorPath 23/39/71
>>>>MUL  mreg64       Double!    5!
>>>>MUL  mreg32       Double!    3!
>>>
>>>Yes i saw your post on this. No i am not ignoring you.
>>>I just post less to nonsense threads and i don't have time
>>>to read all postings at CCC, as a person needs to work in life too
>>>as you might very well know, that's all.
>>>
>>>Read the opteron manual again.
>>>
>>>It's 78 cycles for DIV if you measure at opteron. That's what opposing hardware
>>>engineers post as being the penalty for DIV at opteron.
>>>
>>>You can so to speak XOR the entire board for that.
>>>
>>>I am amazed that you are toying with this then.
>>>
>>>All discussions closed about the topic. Period.
>>>
>>
>>Hmm... my post was about avoiding mod, but doing four 32-bit muls instead.
>>On opteron one may use two 64-bit muls for that purpose, like this:
>>
>>    mov  rax, 0x00014FAC00053EB1  ; (((2^64)+49981-1)/49981)
>>    mov  rcx, [rbx.m_PawnBB + 0]  ;  white pawns
>>    sub  rcx, [rbx.m_PawnBB + 8]  ; -black pawns
>>    mov  rbx, 49981
>>    mul  rcx
>>    mov  rax, rbx
>>    mul  rdx
>
>What the hell is in RDX and why just 1 instruction between the 2 multiplies
>instead of 2?

rax * rcx = rdx:rax, so rdx holds the upper 64 bits of the 64*64-bit multiplication.
That's the reason i used these assembly instructions.
With MSC one has to use a special 64*64=128-bit intrinsic.

>
>>    sub  rcx, rax
>>    mov  rax, rcx
>>    sar  rcx, 63    ; extend sign bit to mask
>>    and  rcx, rbx   ; if ( modulus < 0 )
>>    add  rax, rcx   ;    modulus += 49981;
>
>Never program in assembly unless you get paid fulltime to do so; it's a waste of
>your time otherwise, since you're busy your whole life porting to the different
>processors then.
>
>Note that your assembly programming is still too x86 oriented. You're just using
>4 registers here.


I used assembly here for didactical reasons ;-),
maybe it is better to use some pseudo code.
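
As a rough sketch of what such pseudo code could look like, here is the same routine in portable C. The high half of the 64*64-bit product is assembled from four 32-bit multiplies, which matches the 32-bit variant mentioned above. The names mulhi64, pawn_hash_index, PAWN_HASH_SIZE and PAWN_HASH_MAGIC are made up for illustration; only the two constants come from the assembly listing.

```c
#include <assert.h>
#include <stdint.h>

#define PAWN_HASH_SIZE  49981ULL               /* modulus from the listing */
#define PAWN_HASH_MAGIC 0x00014FAC00053EB1ULL  /* ((2^64)+49981-1)/49981   */

/* Upper 64 bits of a 64*64-bit product, built from four 32*32-bit multiplies
   (what "mul" leaves in rdx on a 64-bit machine). */
uint64_t mulhi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    /* carry out of the middle 32-bit column */
    uint64_t mid = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;
    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/* (white - black) mod 49981 without a DIV instruction. */
uint64_t pawn_hash_index(uint64_t white_pawns, uint64_t black_pawns)
{
    uint64_t x = white_pawns - black_pawns;    /* wraps modulo 2^64 */
    uint64_t q = mulhi64(x, PAWN_HASH_MAGIC);  /* x / 49981, at most one too large */
    uint64_t r = x - q * PAWN_HASH_SIZE;       /* "negative" shows up as a huge value */
    if (r >= PAWN_HASH_SIZE)                   /* same job as the sar/and/add mask */
        r += PAWN_HASH_SIZE;                   /* wraps back into 0..49980 */
    assert(r < PAWN_HASH_SIZE);
    return r;
}
```

Because the magic constant is rounded up, the estimated quotient can be one too large, which is why the final correction step is needed. Where the compiler offers a 64*64=128-bit multiply (for example the __umulh intrinsic in 64-bit MSC, or unsigned __int128 in GCC), mulhi64 can be replaced by that single operation.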


>
>Also i'm sure you don't want to write itanium assembly. Writing in assembly for
>non-OS programs is near to impossible for that thing.
>
>>Not sure whether zobrist key hashing outperforms this routine,
>
>all the compiler gotta do internal is for opteron:
>
> register RIX = load quadword _PawnHash
>
>And i got my pawn hash at once, you need 2 references. Thanks to having a small
>program, yours is in L1 perhaps and mine might not be (i assume it is though).
>
>>specially if it is embedded inside some independent other stuff.
>>Anyway, it is negligible. As i already mentioned, i don't probe pawn hash,
>>if i got a hit from eval hash or from main transposition table.
>
>i get 17.7% out of pawnhash and near the leaves the transpositiontable is like a
>few%, so the chance is like 80% it has to eval a leaf so to speak.
>
>>>In makemove you have to XOR the pawns with the hashkey anyway with the position
>>>hashkey in order to use that for transposition.
>>>
>>>Instead of XORing that key you can of course use 2 keys.


I understand it. But my node structure with hashkey and all the incrementally
updated bitboards is currently very well aligned. Adding an additional 32- or
64-bit value for an additional pawnhashkey blows it up over some 32- or 64-byte
boundary...

Anyway, for captures, doesn't one need a condition to check whether a pawn is the capture target?
Or do you use a second zobrist key table where all pieces have zero values?
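
To make the second alternative concrete, here is a minimal C sketch of a branchless update with a second table whose non-pawn entries are zero. It is not taken from DIEP or any other engine; the table layout, the names (zobrist_all, zobrist_pawn, Keys, toggle_piece) and the xorshift generator are all assumptions for illustration.

```c
#include <stdint.h>

enum { PAWN = 0, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPES };
enum { WHITE = 0, BLACK, COLORS };

uint64_t zobrist_all[COLORS][PIECE_TYPES][64];   /* keys for every piece      */
uint64_t zobrist_pawn[COLORS][PIECE_TYPES][64];  /* zero unless piece == PAWN */

typedef struct {
    uint64_t key;       /* full position key (pieces and pawns) */
    uint64_t pawn_key;  /* pawns only, for the pawn hash table  */
} Keys;

/* Small xorshift generator, enough for a sketch; the state must be nonzero. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

void init_zobrist(void)
{
    uint64_t seed = 0x9E3779B97F4A7C15ULL;  /* arbitrary nonzero seed */
    for (int c = 0; c < COLORS; c++)
        for (int p = 0; p < PIECE_TYPES; p++)
            for (int sq = 0; sq < 64; sq++) {
                uint64_t r = next_random(&seed);
                zobrist_all[c][p][sq] = r;
                zobrist_pawn[c][p][sq] = (p == PAWN) ? r : 0;
            }
}

/* Put a piece on a square or take it off. No test whether it is a pawn:
   for other pieces the pawn table simply contributes zero. */
void toggle_piece(Keys *k, int color, int piece, int sq)
{
    k->key      ^= zobrist_all[color][piece][sq];
    k->pawn_key ^= zobrist_pawn[color][piece][sq];
}
```

A capture is then two or three toggle_piece calls (captured piece off, moving piece off and on) with no condition on the victim's type, at the price of a second table lookup per toggle. A pieces-only key, as in the two-key scheme quoted below, would be key ^ pawn_key.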

In my next approach, i'm not quite sure whether i will use pawn hash tables at
all. The hashkey function i mentioned, makes it easier to use it on the fly.


>>>
>>>1 for pieces. 1 for pawns. To combine them for transposition table
>>>
>>>hashkeyhi = hashpawn.hi^hashkey.hi;
>>>hashkeylo = hashpawn.lo^hashkey.lo;
>>>
>>>
>>>I figured that out in 1995 already for DIEP.
>>>
>>
>>Great - i didn't get the idea of using zobrist keys for pawn hashing.
>>In about 93 or 94 iirc, when implementing pawn hash tables the first time,
>>it was "obvious" to me to generate an index by some calculations from pawn
>>bitboards.
>
>Don't tell me that you're doing:
>  (whitepawns-blackpawns)^piecehashing
>
>Better start measuring at a few billion nodes how many bad scores you get back
>and how many collisions in total.
>
>Note that i'm going to do that test too for diep at the supercomputer real soon
>again. I'm not 100% sure that 64 bits is enough considering the NPS i get and a
>50+ GB hashtable.
>
>>
>>>One or 2 years ago i also tried to measure the difference at K7 and P3 between
>>>storing the hashkey in 1 data element of 64 bits versus
>>>  struct {
>>>    unsigned int lo;
>>>    unsigned int hi;
>>>  }
>>>
>>>It was *significantly* faster to store it in 2 x 32 bits.
>>>
>>
>>Aha, but isn't a __int64 handled in that way internally on 32-bit machines?
>
>So you didn't even try the test.
>
>Better start testing *now*.


I do inspect assembler output and found nothing strange with MSC __int64.
Very similar to your lo-hi-struct.
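
For reference, a small C sketch of the two representations being compared. It only shows that XORing the two 32-bit halves gives the same bits as XORing one 64-bit value; it says nothing about which is faster on a given compiler, which would have to be measured. The type and function names (Key2x32, key2x32_xor, key2x32_split, key2x32_pack) are invented for this example.

```c
#include <stdint.h>

/* Hash key stored as two 32-bit halves, as in the quoted struct. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} Key2x32;

/* XOR done half by half, as a 32-bit machine has to do it anyway. */
Key2x32 key2x32_xor(Key2x32 a, Key2x32 b)
{
    Key2x32 r = { a.lo ^ b.lo, a.hi ^ b.hi };
    return r;
}

/* Conversions to and from a single 64-bit key. */
Key2x32 key2x32_split(uint64_t k)
{
    Key2x32 r = { (uint32_t)k, (uint32_t)(k >> 32) };
    return r;
}

uint64_t key2x32_pack(Key2x32 k)
{
    return ((uint64_t)k.hi << 32) | k.lo;
}
```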


>
>From my comments you can deduce of course it isn't. Don't ask me why, i didn't
>even look to the assembly. It simply was *a lot* slower.
>
>>>Perhaps time for a new experiment?
>>>
>>>Just measure with grown up compilers like visual c++, gcc.
>>>
>>>And when it releases intel c++ 8.0 i will give another shot, the
>>>current 7.1 they find bug after bug at itanium2. Internal compiler errors in
>>>fact even.
>>>
>>>"Aster experiences
>>>   Some users have come across a number of compiler bugs. In those case the
>>>compilation terminates with a message reporting an internal compiler error.
>>>Currently version 7.1 of the intel compilers is installed on Aster. We were
>>>informed by SGI..."
>>>
>>>From: Newsletter SARA #3 september 2003
>>>
>>>Perhaps also visible at sara.nl, not sure those stuff they only got on paper.
>>>"don't disturb the outside world with the bad news" i bet is the idea.
>>>
>>>Well intel c++ doesn't even run parallel, so diep won't crash at world champs
>>>thanks to intel c++ compiler bugs, don't worry.
>>>
>>>Anyway, i am no longer amazed after the last few months that your projects do
>>>not have any objective that is similar to mine.
>>>
>>>See you at the world champs!
>>>
>>
>>No - i do not attend there.
>
>You write in assembly for opteron even before having one, and you don't even
>show up there?
>

It's not about opteron.
It's about an affront.

>Oh well i assume you have other obligations. Catch you in Paderborn 2004 then!
>

Ok, see you

Gerd

<snip>




Last modified: Thu, 15 Apr 21 08:11:13 -0700
