Computer Chess Club Archives


Subject: Re: bsf in amd64

Author: Gerd Isenberg

Date: 03:56:09 12/02/05


On December 01, 2005 at 17:18:41, Mridul Muralidharan wrote:

>On December 01, 2005 at 16:46:01, Zappa wrote:
>
>>On December 01, 2005 at 16:45:22, Mridul Muralidharan wrote:
>>
>>>Hi,
>>>
>>>  Just got the Express version of VC++ 2005, and it barfs on my inline assembly code
>>>(which uses bsf, etc.) when I try to compile it for amd64.
>>>
>>>Might be asking for too much, but just wondering if anyone has already ported
>>>leadz/trailz for it in an efficient manner (I do have some fallback non-asm code,
>>>but I suspect it is way too slow).
>>>
>>>Thanks,
>>>Mridul
>>
>>There is an intrinsic for it, but I forget what it's called.
>>
>>anthony
>
>
>Thanks, totally forgot about intrinsics :)
>Will revisit them later if required - for now, port.
>
>- Mridul

Hi Mridul,

Yes, you'll find the x64 intrinsics here:
http://msdn2.microsoft.com/library/azcs88h2(en-US,VS.80).aspx

#include <intrin.h>

extern "C"
unsigned char _BitScanForward64(
   unsigned long * Index,
   unsigned __int64 Mask
);

#pragma intrinsic(_BitScanForward64)
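
As a rough sketch of how the intrinsic could stand in for the inline-asm trailz: the wrapper name trailz is taken from the question above, but its signature here is only an assumption. The non-MSVC branch uses the GCC/Clang builtin __builtin_ctzll purely so the sketch also compiles elsewhere. It is not part of the MSDN interface.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#pragma intrinsic(_BitScanForward64)
#endif

// Hypothetical wrapper: index (0..63) of the least significant 1 bit.
// Precondition: bb != 0, since bsf leaves its result undefined for 0.
inline int trailz(std::uint64_t bb) {
    assert(bb != 0);
#if defined(_MSC_VER) && defined(_M_X64)
    unsigned long index;
    _BitScanForward64(&index, bb);   // compiles to bsf reg64,reg64
    return static_cast<int>(index);
#else
    // Assumption: GCC/Clang fallback, not covered by the MSDN page.
    return __builtin_ctzll(bb);
#endif
}
```

The return value of _BitScanForward64 (zero if the mask was 0) is ignored here because the precondition already rules that case out.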

Whether this is faster than

"Using de Bruijn Sequences to index a 1 in a Computer Word"
http://supertech.csail.mit.edu/papers/debruijn.pdf

or Matt Taylor's folded DeBruijn trick with 32-bit mul may depend on a lot of
circumstances in a concrete chess program.

Clearly bsf reg64,reg64 is the shortest code, but it is a 9-cycle vector-path
instruction. The multiplication with a DeBruijn constant takes only 3/4 cycles
and is direct path, but it requires a memory lookup...
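
For comparison, a minimal sketch of the DeBruijn multiplication approach with a plain 64-bit multiply. This is not Matt Taylor's folded 32-bit variant. The constant is one commonly used 64-bit DeBruijn sequence, and the names kDeBruijn64, DeBruijnTable and bitScanForwardDeBruijn are made up for illustration. The lookup table is built at first use rather than hard-coded, so it matches the constant by construction.

```cpp
#include <cassert>
#include <cstdint>

// Assumed constant: a 64-bit DeBruijn sequence, so the top 6 bits of
// (1 << i) * kDeBruijn64 are distinct for every i in 0..63.
constexpr std::uint64_t kDeBruijn64 = 0x03f79d71b4cb0a89ULL;

// Maps the 6-bit hash back to the bit index; filled once at first use.
struct DeBruijnTable {
    int index[64];
    DeBruijnTable() {
        for (int i = 0; i < 64; ++i)
            index[((std::uint64_t{1} << i) * kDeBruijn64) >> 58] = i;
    }
};

// Index (0..63) of the least significant 1 bit. Precondition: bb != 0.
inline int bitScanForwardDeBruijn(std::uint64_t bb) {
    assert(bb != 0);
    static const DeBruijnTable table;
    std::uint64_t lsb = bb & (~bb + 1);              // isolate lowest set bit
    return table.index[(lsb * kDeBruijn64) >> 58];   // multiply, shift, look up
}
```

The table read is the memory lookup mentioned above. Whether it beats bsf would have to be measured inside the actual program.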

Cheers,
Gerd



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.