Computer Chess Club Archives



Subject: Re: open/collaborative human chess database?

Author: Michael Yee

Date: 06:59:14 12/03/05



On December 03, 2005 at 00:43:46, Dann Corbit wrote:

>On December 03, 2005 at 00:30:18, Michael Yee wrote:
>
>>I don't think one currently exists. But since all games seem to be free (at
>>least for now), shouldn't it be possible?
>>
>>For starters, there's the old free SCID database (if anyone still has it),
>>Jose's database, TWIC's archives, chesslib's free database, Dann's ftp site,
>>britbase, norbase, etc. Convekta, Les Echecs, and other sites also have or had
>>free games available.
>>
>>Tasks/goals for managing the database might include:
>>
>>(1) have a consistent unambiguous system for player names, place names, etc.
>>(2) ensure accuracy of ratings (or add them if missing)
>>(3) ensure accuracy of moves (by checking multiple sources?)
>>(4) differentiate between classical, rapid, blitz, correspondence, internet,
>>etc.
>>
>>Note: I'm not opposed to buying a commercial database one of these days. But it
>>just seems funny/strange that they all seem to be selling essentially the same
>>free data (aside from analysis and other extras).
>
>I think you underestimate the quality control that goes into a commercial
>system.

You're probably right. I guess the header information is where most of the hard
(human) work comes in. And maybe the headers (not the moves themselves) are
precisely what could be copyrighted?

It still would be interesting (at least to me) to see what level of quality
could be achieved in a collaborative way. The issue of name disambiguation comes
up in other areas too, so maybe some automated tools could help.
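
As a rough illustration of what such a tool might look like, here is a minimal Python sketch that reduces each player name to a crude key (surname plus first initial, accents stripped) and groups the spellings that share a key. The key scheme and the function names normalize_name and group_variants are my own assumptions, not taken from any existing database tool. A key this coarse will also merge genuinely different players, so the groups would be candidates for human review rather than automatic merges.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(raw):
    """Reduce a player name to a crude comparison key (hypothetical scheme)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))

    # Accept "Surname, Given" as well as "Given Surname".
    if "," in text:
        surname, given = text.split(",", 1)
    else:
        parts = text.split()
        if parts:
            surname, given = parts[-1], " ".join(parts[:-1])
        else:
            surname, given = "", ""

    # Keep letters only, lowercased, so punctuation and case do not matter.
    surname = re.sub(r"[^a-z]", "", surname.lower())
    given = re.sub(r"[^a-z]", "", given.lower())

    # Key on surname plus first initial (assumption: coarse on purpose).
    return surname + ("_" + given[0] if given else "")


def group_variants(names):
    """Map each key to the set of raw spellings that produced it."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_name(name)].add(name)
    return dict(groups)
```

Calling group_variants on a list of raw names returns a dict whose values with more than one entry are the suspected duplicates.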

Michael

>Take any freely available database (such as junkbase) and count the distinct
>player names.  There are hundreds of thousands in a 4 million game file.  Of
>course, there aren't really that many players.  It is due to duplicates, bad
>spelling, and a host of other issues.
>
>Besides the features, the commercial systems are selling you quality.
>
>If you want to get a really good database from free sources, you will have to be
>really, really careful about where you get your games.
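
The counting exercise Dann describes could be sketched along these lines in Python, assuming the database is available as PGN text with the standard [White "..."] and [Black "..."] tag pairs. The function name count_player_names is hypothetical, and the sketch only tallies raw spellings; it makes no attempt to decide which of them are the same person.

```python
import re
from collections import Counter

# Matches only the White and Black tags; the required whitespace after the
# tag name keeps tags such as WhiteElo or BlackElo from matching.
TAG_RE = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$')


def count_player_names(pgn_lines):
    """Tally every raw player-name spelling found in an iterable of PGN lines."""
    counts = Counter()
    for line in pgn_lines:
        match = TAG_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts
```

Passing an open file object works, since the function only iterates over lines; len() of the result is the number of distinct spellings, which is the figure that balloons in a poorly curated file.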






Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.