Author: Gareth McCaughan
Date: 13:11:09 07/07/00
Yesterday I wrote:

> To understand what lambda's for, you need to understand
> the point of the TD business. The idea is that you adjust
> your parameters not to make the evaluation close to some
> pre-defined target, but to make the evaluation not change
> much from one move to the next. The idea is that if the
> game is played perfectly, each position is exactly as good
> as the one that follows it. (This is badly oversimplified,
> by the way.) If you have an evaluation function that (1)
> gives the right answer in "terminal" positions, and (2)
> evaluates each position the same way as it evaluates the
> position after a Very Good Player has made a move, then
> you have a good evaluation function.

I should have mentioned something important, which is that the same thing is true if (2') it evaluates each position the same way as it evaluates the position after *it* has made a move (using its own eval to choose what to do). So you can, in principle, use TD to learn by self-play.

-- g
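To make (1) and (2') concrete, here is a minimal sketch of self-play TD on a toy game rather than chess. Everything in it is an assumption made for illustration: the game (a pile of stones, each side removes 1 or 2, whoever takes the last stone wins), the lookup-table "evaluation", the names `successors` and `train`, the step size `alpha`, and the choice to start games from every pile size instead of adding random exploration. It only covers the simplest case, lambda = 0, where each position is nudged toward the evaluation of the very next position.

```python
def successors(stones, player):
    """Positions reachable by taking 1 or 2 stones; the turn passes to the other side."""
    return [(stones - k, 1 - player) for k in (1, 2) if k <= stones]


def train(max_stones=10, sweeps=500, alpha=0.5):
    # value[(stones, player_to_move)] is the evaluation from player 0's point of view.
    value = {(n, p): 0.0 for n in range(1, max_stones + 1) for p in (0, 1)}

    # Condition (1): terminal positions get the right answer.
    # With no stones left, the side to move has lost (the other side took the last stone).
    value[(0, 0)] = -1.0
    value[(0, 1)] = 1.0

    for _ in range(sweeps):
        # Hypothetical training schedule: one self-play game from every start position.
        for start in range(1, max_stones + 1):
            for first in (0, 1):
                pos = (start, first)
                while pos[0] > 0:
                    # Both sides use the same eval to choose a move:
                    # player 0 maximizes it, player 1 minimizes it.
                    pick = max if pos[1] == 0 else min
                    nxt = pick(successors(*pos), key=lambda s: value[s])
                    # Condition (2'): pull this position's evaluation toward the
                    # evaluation of the position after the program's own move.
                    value[pos] += alpha * (value[nxt] - value[pos])
                    pos = nxt
    return value


values = train()
for n in range(1, 11):
    print(n, round(values[(n, 0)], 3))
```

In this toy game a pile that is a multiple of three is lost for the side to move, so the printed evaluations should settle near -1 for 3, 6 and 9 and near +1 elsewhere. A real program would replace the table with a parameterized evaluation and adjust its weights instead of table entries.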
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.