Author: Aart J.C. Bik
Date: 14:06:17 01/13/05
>I guess your main focus with SSE/2/3 is more on float or double arithmetic
>rather than on integers, signed and unsigned chars, shorts, ints and even
>__int64 with rather unorthogonal instruction sets and a lot of special cases.

Hi Gerd,

Well, in the vectorizer’s defense, although the initial focus was indeed on floating-point codes, a lot of effort has also been put into optimizing integer codes. For instance, the following loop, which is not directly chess related,

  unsigned char x[100], y[100];
  …
  int sum = 0;
  for (i = 0; i < 100; i++) {
    int temp = x[i]-y[i];
    if (temp < 0) temp = - temp;
    sum += temp;
  }

will automatically vectorize into code that exploits the “psadbw” idiom.

  [C:/cmplr/temp] icl -nologo -Fa -QxP -Qunroll0 -c joho.c
  joho.c
  joho.c(14) : (col. 3) remark: LOOP WAS VECTORIZED.
  [C:/cmplr/temp] cat joho.asm
  ....
  L:
    movdqa xmm1, XMMWORD PTR _x[eax]
    psadbw xmm1, XMMWORD PTR _y[eax]
    paddd  xmm0, xmm1
    add    eax, 16
    cmp    eax, 96
    jb     L

But you are absolutely right that compiler-generated code still has a long way to go before it can get even close to the "crafty" implementations you have shown me so far.

Thanks for the feedback.
Aart
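For readers unfamiliar with the idiom, here is a rough portable C sketch of what the vectorized loop appears to compute. It is a model, not the compiler's output: psadbw on a 16-byte operand produces one sum of absolute differences per 8-byte half, and paddd accumulates those partial sums. The function names (sad_scalar, psadbw_model, sad_blocked) are made up for illustration, and the final reduction and the scalar cleanup of the last 4 elements are assumptions, since the listing above elides everything outside the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: the loop from the post, wrapped in a function. */
int sad_scalar(const unsigned char *x, const unsigned char *y, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int temp = x[i] - y[i];
        if (temp < 0) temp = -temp;
        sum += temp;
    }
    return sum;
}

/* Model of one psadbw on 16 bytes: each 8-byte half yields its own
   sum of absolute differences (hypothetical helper, not an intrinsic). */
static void psadbw_model(const unsigned char *x, const unsigned char *y,
                         unsigned lanes[2])
{
    for (int half = 0; half < 2; half++) {
        unsigned s = 0;
        for (int j = 0; j < 8; j++) {
            int d = x[8 * half + j] - y[8 * half + j];
            s += (unsigned)(d < 0 ? -d : d);
        }
        lanes[half] = s;
    }
}

/* Blocked version mirroring the shape of the vector loop. */
int sad_blocked(const unsigned char *x, const unsigned char *y, size_t n)
{
    assert(x != NULL && y != NULL);

    unsigned acc[2] = {0, 0};   /* plays the role of xmm0 */
    size_t i = 0;

    /* Main loop: 16 bytes per iteration. For n = 100 this covers
       elements 0..95, matching the "cmp eax, 96" bound. */
    for (; i + 16 <= n; i += 16) {
        unsigned lanes[2];
        psadbw_model(x + i, y + i, lanes);
        acc[0] += lanes[0];     /* paddd xmm0, xmm1 */
        acc[1] += lanes[1];
    }

    /* Assumed epilogue: add the two partial sums together. */
    int sum = (int)(acc[0] + acc[1]);

    /* Assumed cleanup: remaining elements are handled one at a time. */
    for (; i < n; i++) {
        int temp = x[i] - y[i];
        if (temp < 0) temp = -temp;
        sum += temp;
    }
    return sum;
}
```

Calling sad_blocked(x, y, 100) should give the same result as sad_scalar(x, y, 100) for any input; the blocked form only regroups the additions.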
Last modified: Thu, 15 Apr 21 08:11:13 -0700