Author: Rémi Coulom
Date: 02:23:53 01/07/01
On January 06, 2001 at 21:11:04, Bas Hamstra wrote:

>On January 06, 2001 at 15:08:11, Rémi Coulom wrote:
>
>>I have no experience in using TD(lambda) for chess, but I know a little
>>about reinforcement learning and neural networks, and what you describe
>>looks like a typical ill-conditioning problem. (I am currently using
>>TD(lambda) to solve control problems, but it works rather similarly.)
>>You can take a look at:
>>
>>ftp://ftp.sas.com/pub/neural/illcond/illcond.html
>>ftp://ftp.sas.com/pub/neural/FAQ2.html#A_illcond
>>
>>The lazy solution to this problem consists of tweaking coefficients for
>>weights in order to improve the condition number. The harder way is to
>>use more advanced learning algorithms than vanilla gradient descent
>>(conjugate gradient, for instance), as explained in the links above.
>>
>>A good reference for the theory of this kind of algorithm is
>>http://www.athenasc.com/ndpbook.html
>>
>>I hope this helps.
>>
>>Remi
>
>Thanks Remi. Not easy though, if you're not too familiar with the
>terminology.
>
>Bas.

That is true, it is not very easy. If you are not familiar with the theory
of reinforcement learning, I would suggest "Reinforcement Learning: An
Introduction":

http://www-anw.cs.umass.edu/~rich/book/the-book.html

(A great introductory book, readable online. One of the authors is Richard
Sutton, who first published the TD(lambda) algorithm.)

In particular, chapter 8 deals with function approximation:

http://www-anw.cs.umass.edu/~rich/book/8/node1.html

The neural network FAQ also has plenty of pointers to introductory
material:

ftp://ftp.sas.com/pub/neural/FAQ.html

In particular, "An Introduction to the Conjugate Gradient Method Without
the Agonizing Pain":

http://www.cs.cmu.edu/~jrs/jrspapers.html

It might not be obvious to you, but what you are doing is exactly
equivalent to using a neural network. Indeed, from an abstract point of
view, a neural network is just a function parameterized by a number of
weights. Usually, when someone speaks about neural networks, it implies
some form of feed-forward architecture, multi-layer perceptron, etc., but
the theory developed also applies to any kind of function, including your
evaluation function with the parameters you tune.
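To make that equivalence concrete, here is a minimal sketch in C of a
TD(lambda) update for a linear evaluation function. It is my own
illustration, not code from any of the references above, and all names
(td_lambda_update, scale[], etc.) are hypothetical. The point is that if
eval(position) = sum_i w[i] * f[i](position), then d(eval)/dw[i] is just
the feature value f[i], so the "network" update reduces to the loop below.
The scale[] array is the "lazy" fix for ill-conditioning I mentioned: one
step-size coefficient per weight.

#include <stddef.h>

#define N_WEIGHTS 64

static double w[N_WEIGHTS];      /* evaluation parameters ("weights")   */
static double e[N_WEIGHTS];      /* eligibility traces                  */
static double scale[N_WEIGHTS];  /* per-weight step scaling             */

/* Call before training; also reset e[] at the start of each game. */
void td_init(void)
{
    for (size_t i = 0; i < N_WEIGHTS; i++) {
        w[i] = 0.0;
        e[i] = 0.0;
        scale[i] = 1.0;  /* shrink for features with large magnitudes */
    }
}

/* Called once per position along a game, after computing the feature
   vector f[] and the temporal-difference error
   td_err = r + gamma * eval(next) - eval(current). */
void td_lambda_update(const double f[N_WEIGHTS], double td_err,
                      double alpha, double gamma, double lambda)
{
    for (size_t i = 0; i < N_WEIGHTS; i++) {
        /* decay the trace, then add the gradient d(eval)/dw[i] = f[i] */
        e[i] = gamma * lambda * e[i] + f[i];
        /* vanilla gradient step, with a per-weight scale factor */
        w[i] += alpha * scale[i] * td_err * e[i];
    }
}

If your features differ wildly in magnitude (say, a material count versus
a mobility count), the error surface has a bad condition number and no
single alpha fits all weights; choosing scale[i] roughly inversely
proportional to the typical squared magnitude of f[i] is the crude fix,
while conjugate gradient avoids the per-weight tuning altogether.

Remi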