Author: Rémi Coulom
Date: 02:23:53 01/07/01
On January 06, 2001 at 21:11:04, Bas Hamstra wrote:

>On January 06, 2001 at 15:08:11, Rémi Coulom wrote:
>
>>I have no experience in using TD(lambda) for chess, but I know a little
>>about reinforcement learning and neural networks, and what you describe
>>looks like a typical ill-conditioning problem. (I am currently using
>>TD(lambda) to solve control problems, but it works rather similarly.)
>>You can take a look at:
>>
>>ftp://ftp.sas.com/pub/neural/illcond/illcond.html
>>ftp://ftp.sas.com/pub/neural/FAQ2.html#A_illcond
>>
>>The lazy solution to this problem consists of tweaking coefficients for
>>weights in order to improve the condition number. The harder way is to
>>use more advanced learning algorithms than vanilla gradient descent
>>(conjugate gradient, for instance), as explained in the links above.
>>
>>A good reference for the theory of this kind of algorithm is
>>http://www.athenasc.com/ndpbook.html
>>
>>I hope this helps.
>>
>>Remi
>
>Thanks Remi. Not easy though, if you're not too familiar with the
>terminology.
>
>Bas.

That is true, it is not very easy. If you are not familiar with the theory
of reinforcement learning, I would suggest "Reinforcement Learning: An
Introduction":

http://www-anw.cs.umass.edu/~rich/book/the-book.html

(A great introductory book, readable online. One of the authors is Richard
Sutton, who first published the TD(lambda) algorithm.)

In particular, chapter 8 deals with function approximation:

http://www-anw.cs.umass.edu/~rich/book/8/node1.html

The neural network FAQ also has plenty of pointers to introductory
material:

ftp://ftp.sas.com/pub/neural/FAQ.html

In particular, "An Introduction to the Conjugate Gradient Method Without
the Agonizing Pain":

http://www.cs.cmu.edu/~jrs/jrspapers.html

It might not be obvious to you, but what you are doing is exactly
equivalent to using a neural network. Indeed, from an abstract point of
view, a neural network is just a function parameterized by a number of
weights. Usually, when someone speaks about neural networks, it implies
some form of feed-forward architecture, multi-layer perceptron, etc., but
the theory developed also applies to any kind of function, including your
evaluation function with the parameters you tune.
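To make that equivalence concrete, here is a minimal sketch in C of a
TD(lambda) update for a linear evaluation function. It is my own
illustration, not code from any of the references above, and all names
(td_lambda_update, scale[], etc.) are hypothetical. The point is that if
eval(position) = sum_i w[i] * f[i](position), then d(eval)/dw[i] is just
the feature value f[i], so the "network" update reduces to the loop below.
The scale[] array is the "lazy" fix for ill-conditioning I mentioned: one
step-size coefficient per weight.

#include <stddef.h>

#define N_WEIGHTS 64

static double w[N_WEIGHTS];      /* evaluation parameters ("weights")   */
static double e[N_WEIGHTS];      /* eligibility traces                  */
static double scale[N_WEIGHTS];  /* per-weight step scaling             */

/* Call before training; also reset e[] at the start of each game. */
void td_init(void)
{
    for (size_t i = 0; i < N_WEIGHTS; i++) {
        w[i] = 0.0;
        e[i] = 0.0;
        scale[i] = 1.0;  /* shrink for features with large magnitudes */
    }
}

/* Called once per position along a game, after computing the feature
   vector f[] and the temporal-difference error
   td_err = r + gamma * eval(next) - eval(current). */
void td_lambda_update(const double f[N_WEIGHTS], double td_err,
                      double alpha, double gamma, double lambda)
{
    for (size_t i = 0; i < N_WEIGHTS; i++) {
        /* decay the trace, then add the gradient d(eval)/dw[i] = f[i] */
        e[i] = gamma * lambda * e[i] + f[i];
        /* vanilla gradient step, with a per-weight scale factor */
        w[i] += alpha * scale[i] * td_err * e[i];
    }
}

If your features differ wildly in magnitude (say, a material count versus
a mobility count), the error surface has a bad condition number and no
single alpha fits all weights; choosing scale[i] roughly inversely
proportional to the typical squared magnitude of f[i] is the crude fix,
while conjugate gradient avoids the per-weight tuning altogether.

Remi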