Computer Chess Club Archives


Subject: Re: Interesting statement from Ossi Weiner about Nunn test

Author: Bruce Moreland

Date: 10:21:49 09/18/98



On September 17, 1998 at 15:39:45, Moritz Berger wrote:

>Weiner: Nevertheless you must concede that not everything went smoothly at the
>beginning. I would have preferred if the CSS editors had sent the positions
>already early in 1997 to all important programmers. This would have killed this
>whole discussion right from the start. Meanwhile I now had an opportunity to
>talk about this with Frans Morsch. He assured me that Fritz 5 has not been tuned
>on the Nunn test. That ends this story for me in a satisfactory way.

I think that in this context, the Nunn test is a very bad idea.

My understanding of how it works is as follows.  There is a series of ten
positions, and you play twenty games between A and B, so that both A and B get
to be white from each position once.
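To make the structure concrete, here is a rough sketch of the schedule as I understand it from the description above. The function name, the position labels, and the engine names "A" and "B" are all made up for illustration; this is not anyone's actual test harness.

```python
def nunn_schedule(positions, engine_a, engine_b):
    """Build (position, white, black) tuples so that each engine
    plays each starting position once with each colour."""
    games = []
    for pos in positions:
        games.append((pos, engine_a, engine_b))  # A has white
        games.append((pos, engine_b, engine_a))  # colours reversed
    return games


# Hypothetical labels standing in for the ten test positions.
positions = [f"position_{i}" for i in range(1, 11)]
schedule = nunn_schedule(positions, "A", "B")
print(len(schedule))  # ten positions, two colours each
```

The point of the sketch is that the schedule is a fixed, finite list: once every entry has been played, there is nothing left to play.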

So you have a twenty-game match that discounts the opening book.  I disagree
that programs should be tested without an opening book, but I am willing to
ignore this for purposes of discussion.

My objection to this test is that it is a very limited, closed test.  You get
twenty games.  It is impossible to get more than twenty games.  If someone else
does the test on the same hardware, they should get the same results exactly,
rather than getting results that would tend to augment the significance of tests
done by someone else.  What I mean by this is that if you do the test twice,
rather than creating additional evidence that A is indeed stronger than B, all
you do is verify that the person who conducted the test the first time didn't
mess up.

I think that a 20-game match will usually be too short to achieve a valid
comparison between two programs.
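As a back-of-the-envelope illustration of why twenty games is thin, the sketch below puts an approximate 95% interval around a match score and converts the ends to rating differences. It assumes independent games, a normal approximation, and the usual logistic rating formula; the 9 wins, 6 draws, 5 losses result is invented for the example, and the function names are mine.

```python
import math


def elo_from_score(p):
    """Rating difference implied by an expected score p (0 < p < 1)
    under the standard logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def match_interval(wins, draws, losses, z=1.96):
    """Return (score, elo_low, elo_high) for a match result, using a
    normal approximation on the per-game score. Assumes independent games."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Variance of a single game's score (1, 0.5, or 0) around the mean.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)
    # Clamp so the Elo conversion stays finite at extreme scores.
    low = max(mean - half_width, 1e-6)
    high = min(mean + half_width, 1.0 - 1e-6)
    return mean, elo_from_score(low), elo_from_score(high)


# Invented 20-game result: A scores 12 out of 20 against B.
score, elo_low, elo_high = match_interval(wins=9, draws=6, losses=5)
print(f"score={score:.2f}  elo range: {elo_low:+.0f} to {elo_high:+.0f}")
```

Even with A scoring 60% in this made-up match, the interval still includes a rating difference of zero, so a result like that would not by itself show that A is stronger than B.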

I think that the inability to run a 50-, 100-, or 1000-game Nunn match is a
disadvantage inherent in the test.

And publication of the positions, so that future programs can be tuned for them,
is insanity, if people are really going to take this stuff seriously.  We will
have people tuning for the tests.

The idea of these guys sitting around trying to figure out how to jimmy up
essentially random numbers so that a program reaches an ending it doesn't
understand but still wins against another program, all in order to increase a
Nunn match score, is disgusting.

bruce



Last modified: Thu, 15 Apr 21 08:11:13 -0700
