Computer Chess Club Archives


Subject: Re: Hello from Edmonton (and on Temporal Differences)

Author: Will Singleton

Date: 05:12:43 07/31/02

On July 31, 2002 at 04:08:09, Rémi Coulom wrote:

>I have just completed a PhD thesis on temporal difference learning (applied to
>motor control, not games), and I also believe that this technique has not yet
>been used to its full potential in computer chess.
>
>KnightCap/TDChess was an interesting experiment, but the strength of those
>programs was not high enough. I talked to some authors of stronger chess programs
>that tried reinforcement learning. They told me it did not work well for them
>(Franck Zibi, Pascal Tang, and maybe also Sylvain Renard, if I recall
>correctly). I also remember Christophe Théron saying he does not believe it
>could help to improve his program. So, this is not very encouraging.
>
>Nevertheless, I still believe that reinforcement learning can be applied
>efficiently to computer chess. The key issue is that it requires more effort
>than just implementing the simple algorithms that Baxter et al. describe, playing
>a few hundred training games, and observing the result. Applying reinforcement
>learning efficiently requires a deep understanding of theory, creativity in
>selecting the right algorithm, a well-adapted evaluation-function architecture,
>and _lots_ of training data. Tesauro's backgammon player took months of CPU time
>to learn to play. I have been running motor-control experiments for months of
>CPU time as well, and my learners are still making new interesting discoveries.
>
>Also note that all book learning algorithms are reinforcement learning
>algorithms, whether their authors know it or not. So, reinforcement learning has
>already been applied successfully to high-level chess programs!
>
>Trying reinforcement learning in The Crazy Bishop is the first item in my list
>of ideas to try. It is not likely to be very soon, though. For some years now,
>other activities have had higher priority for me.
>
>If you are curious, you can take a look at my thesis:
>http://remi.coulom.free.fr/Thesis/
>This web page contains interactive demos of swimmers that learn to swim, and a
>car driver that learns to drive.
>
>Rémi

I love those swimmers, Dr. Coulom. :)  I recall seeing a television show some
time ago about research into self-propelled "fish" that learned to swim.  Might
have been a tuna.  Seems like they were having some success, but I don't recall
whether the motor-control program was the result of TD learning or not.

In backgammon, as you know, a learning program would play hundreds of thousands
of games in a short time span to tune the weights, then thousands of real games
against other versions to test for improvements, ad infinitum.  Given the huge
amount of training and game-play needed, how could chess ever effectively
exploit the technique?
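
To make the weight-tuning step concrete, here is a rough Python sketch of the
kind of TD(lambda) update such a self-play loop applies after each game. It
assumes a plain linear evaluation over a feature vector; the names
(td_lambda_update, feature_vectors, alpha, lam) are illustrative only, not
taken from any particular program.

import numpy as np

def td_lambda_update(weights, feature_vectors, result, alpha=0.01, lam=0.7):
    """Update evaluation weights from one training game.

    feature_vectors: one feature array per position the learner saw,
                     from its own point of view.
    result:          final game outcome in [-1, +1] (loss .. win).
    """
    # Linear evaluation squashed to (-1, 1) so it is comparable to the result.
    values = [float(np.tanh(weights @ phi)) for phi in feature_vectors]
    values.append(float(result))  # terminal "evaluation" is the actual outcome

    # Temporal differences between successive evaluations.
    deltas = [values[t + 1] - values[t] for t in range(len(feature_vectors))]

    for t, phi in enumerate(feature_vectors):
        # Lambda-weighted sum of all future prediction errors.
        err = sum(lam ** (j - t) * deltas[j] for j in range(t, len(deltas)))
        # Gradient of tanh(w . phi) with respect to w is (1 - V(s_t)^2) * phi.
        grad = (1.0 - values[t] ** 2) * phi
        weights = weights + alpha * err * grad
    return weights

# Example: three positions with four made-up features each, game won (+1).
w = np.zeros(4)
game = [np.array([1.0, 0.0, 2.0, 0.5]),
        np.array([1.0, 1.0, 2.0, 0.0]),
        np.array([0.0, 1.0, 3.0, 0.5])]
w = td_lambda_update(w, game, result=+1.0)

After each training game you feed in the feature vectors of the positions the
program visited plus the final result, and the weights drift toward
evaluations that better predict outcomes; the open question above is how to
afford enough such games in chess.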

Will


