Computer Chess Club Archives



Subject: Re: planning a SSE-optimized chess engine

Author: Aart J.C. Bik

Date: 14:06:17 01/13/05




>I guess your main focus with SSE/2/3 is more on float or double arithmetic
>rather than on integers, signed and unsigned chars, shorts, ints and even
>__int64 with rather unorthogonal instruction sets and a lot of special cases.


Hi Gerd,

Well, in the vectorizer’s defense, although the initial focus was indeed on
floating-point codes, a lot of effort has also been put into optimizing integer
codes. For instance, the following loop, which is not directly chess related,

  unsigned char x[100], y[100];
  …
  int sum = 0;
  for (i = 0; i < 100; i++) {
    int temp = x[i]-y[i];
    if (temp < 0) temp = - temp;
    sum += temp;
  }

will automatically vectorize into code that exploits the “psadbw” idiom.

[C:/cmplr/temp] icl -nologo -Fa -QxP -Qunroll0 -c joho.c
joho.c
joho.c(14) : (col. 3) remark: LOOP WAS VECTORIZED.
[C:/cmplr/temp] cat joho.asm
....
L:      movdqa    xmm1, XMMWORD PTR _x[eax]
        psadbw    xmm1, XMMWORD PTR _y[eax]
        paddd     xmm0, xmm1
        add       eax, 16
        cmp       eax, 96
        jb        L
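
For readers who prefer intrinsics to assembly, the loop above corresponds
roughly to the hand-written SSE2 sketch below. This is only an illustration,
not the compiler's output: the function names sad_scalar and sad_sse2 are made
up for this example, unaligned loads are used instead of the aligned movdqa in
the listing, and the scalar tail loop is an assumption about how leftover
elements (4 of the 100 here) could be handled.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Scalar reference: the same computation as the loop in the post.
   (Hypothetical helper name, used here only for comparison.) */
static int sad_scalar(const unsigned char *x, const unsigned char *y, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int temp = x[i] - y[i];
        if (temp < 0) temp = -temp;
        sum += temp;
    }
    return sum;
}

/* Hand-written sketch of the psadbw idiom (hypothetical helper name). */
static int sad_sse2(const unsigned char *x, const unsigned char *y, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;

    /* 16 bytes per iteration: psadbw, then paddd into the accumulator. */
    for (; i + 16 <= n; i += 16) {
        __m128i vx = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i vy = _mm_loadu_si128((const __m128i *)(y + i));
        acc = _mm_add_epi32(acc, _mm_sad_epu8(vx, vy));
    }

    /* psadbw leaves one partial sum in each 64-bit half; add the two. */
    int sum = _mm_cvtsi128_si32(acc)
            + _mm_cvtsi128_si32(_mm_srli_si128(acc, 8));

    /* Scalar tail for the elements that do not fill a full vector. */
    for (; i < n; i++) {
        int temp = x[i] - y[i];
        if (temp < 0) temp = -temp;
        sum += temp;
    }
    return sum;
}
```

Calling sad_sse2(x, y, 100) on the arrays from the example should give the same
result as the scalar loop. The 32-bit accumulation is safe for inputs of this
size, since each psadbw partial sum is at most 8 * 255.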

But you are absolutely right that compiler-generated code still has a long way
to go before it can get even close to the "crafty" implementations you have
shown me so far.

Thanks for the feedback.
Aart






Last modified: Thu, 15 Apr 21 08:11:13 -0700
