Author: Will Singleton
Date: 05:12:43 07/31/02
On July 31, 2002 at 04:08:09, Rémi Coulom wrote:

>I have just completed a PhD thesis on temporal difference learning (applied to
>motor control, not games), and I also believe that this technique has not yet
>been used to its full potential in computer chess.
>
>Knightcap/TDChess was an interesting experiment, but the strength of their
>program was not high enough. I talked to some authors of stronger chess
>programs that tried reinforcement learning. They told me it did not work well
>for them (Franck Zibi, Pascal Tang, and maybe also Sylvain Renard, if I recall
>correctly). I also remember Christophe Théron saying he does not believe it
>could help to improve his program. So, this is not very encouraging.
>
>Nevertheless, I still believe that reinforcement learning can be applied
>efficiently to computer chess. The key issue is that it requires more effort
>than just implementing the simple algorithms that Baxter et al. describe,
>playing a few hundred training games, and observing the result. Applying
>reinforcement learning efficiently requires a deep understanding of theory,
>creativity in selecting the right algorithm, a well-adapted evaluation-function
>architecture, and _lots_ of training data. Tesauro's backgammon player took
>months of CPU time to learn to play. I have been running motor-control
>experiments for months of CPU time as well, and my learners are still making
>new interesting discoveries.
>
>Also note that all book learning algorithms are reinforcement learning
>algorithms, whether their authors know it or not. So, reinforcement learning
>has already been applied successfully to high-level chess programs!
>
>Trying reinforcement learning in The Crazy Bishop is the first item on my list
>of ideas to try. It is not likely to be very soon, though. For some years now,
>other activities have had higher priority for me.
>
>If you are curious, you can take a look at my thesis:
>http://remi.coulom.free.fr/Thesis/
>This web page contains interactive demos of swimmers that learn to swim, and a
>car driver that learns to drive.
>
>Rémi

I love those swimmers, Dr. Coulom. :)

I recall seeing a television show some time ago about research into self-propelled "fish" that learned to swim. Might have been a tuna. Seems like they were having some success, but I don't recall whether the motor-control program was the result of TD learning or not.

In backgammon, as you know, a learning program would play hundreds of thousands of games in a short time span to tune the weights, then thousands of real games against other versions to test for improvements, ad infinitum. Given the huge amount of training and game-play needed, how could chess ever effectively exploit the technique?

Will
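
For concreteness, here is a minimal sketch of the backgammon-style self-play weight update discussed above: offline TD(lambda) applied after each game to a linear evaluation squashed through tanh, roughly in the spirit of TD-Gammon and KnightCap's TDLeaf. This is an illustration only; the feature vectors, learning rate, and lambda value below are made up and are not taken from any actual engine.

import numpy as np

def evaluate(features, w):
    # Squashed linear evaluation: +1 looks winning, -1 looks losing.
    return float(np.tanh(features @ w))

def td_lambda_update(game_features, outcome, w, alpha=0.01, lam=0.7):
    """Offline TD(lambda) update computed from one self-play game.

    game_features -- one feature vector per position, from the learner's view
    outcome       -- final game result mapped into [-1, +1]
    """
    w = np.asarray(w, dtype=float).copy()
    # Evaluations along the game, with the true outcome as the final target.
    values = [evaluate(f, w) for f in game_features] + [float(outcome)]
    trace = np.zeros_like(w)      # eligibility trace of evaluation gradients
    delta_w = np.zeros_like(w)
    for t, f in enumerate(game_features):
        grad = (1.0 - values[t] ** 2) * f       # d tanh(f.w) / dw
        trace = lam * trace + grad
        td_error = values[t + 1] - values[t]    # temporal-difference error
        delta_w += alpha * td_error * trace
    return w + delta_w

# Toy usage: random "positions" stand in for real self-play trajectories.
rng = np.random.default_rng(0)
w = np.zeros(8)
for _ in range(1000):             # many cheap self-play games
    game = [rng.normal(size=8) for _ in range(60)]
    w = td_lambda_update(game, outcome=0.0, w=w)

In a real chess setting the feature vectors would come from the root or principal-variation leaf positions of searched moves, and the expensive part is exactly what Will points out: generating enough games for the updates to converge.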