Author: Robert Hyatt
Date: 10:30:45 09/10/03
On September 10, 2003 at 10:11:05, Dave Gomboc wrote:
>On September 09, 2003 at 18:33:22, Steven Edwards wrote:
>
>>Greetings:
>>
>>It was some ten years ago back in 1994 that the CDS (Chess Data Standards: SAN,
>>FEN, EPD, and PGN) took their current form after months of newsgroup discussions
>>and megabytes of email exchange. Prior to the adoption of the CDS, the exchange
>>of chess data and the interoperability of chess software was greatly impeded by
>>the lack of standards and the prevalence of secret and non-free proprietary
>>formats.
>>
>>Although the CDS are now used by (nearly) all archives and all programs, it does
>>not mean that the current form of the standards is the best. Certainly there
>>are places where extensions can be made and where unneeded items can be
>>deprecated and eventually deleted. To make a revision work, there is a
>>requirement for input from the experienced chess programmer community just as
>>there was such a need ten years ago. Therefore, I once again issue the call for
>>online (here in CCC) and email discussion on the CDS topic. (Including "CDS" in
>>the subject line helps in this process.)
>>
>>Here are some of my ideas:
>>
>>SAN: Looks fine; I don't see any need for changes. There is the possibility to
>>extend the piece identifier letter semantics to accommodate heterodox /
>>unorthodox chess.
>
>The use of FAN (figurine algebraic notation) would be more forward-looking. I'm
>not going to use KQRBNP forever, given there are glyphs in Unicode for the chess
>pieces. Switching would also obviate the need for language translations.
>
>>FEN: One deficiency exists for which I take responsibility; the en passant
>>target square semantics should indicate a non-null value only if an active
>>color pawn attacks the passed-over target square. A change here will
>>improve position database operation and can also have a positive effect
>>on internal transposition management.
>>Also, there is the possibility of extending the castling availability
>>semantics to accommodate heterodox / unorthodox chess.
>
>Yes, people regularly ignore the spec when it comes to the e.p. square.
>
>>EPD: There are several items of concern:
>>
>>1. Currently, the first four EPD fields match the first four FEN fields. This
>>was done to save space with the idea that the "hmvc" and "fmvn" EPD operations
>>could provide the extra information if needed. This is rather arbitrary and I
>>suggest that every EPD record have the same first six fields as does a FEN
>>record. Alternatively, EPD opcodes can be defined for the (current) first four
>>EPD fields and so there would be NO required fields at the start of an EPD
>>record.
>
>Either solution is better than the status quo.
>
>>2. Representation of string/symbolic data in operands is inconsistent with
>>respect to the need for quoting. This is also the case with PGN tag values. A
>>uniform rule is needed for both.
>>
>>3. Representation of time and date value operands needs to be formalized along
>>with a provision for sub-second decimal resolution.
>>
>>4. The centipawn evaluation operand type needs a mate score indication
>>correction.
>>
>>5. The centipawn evaluation operand type probably needs to be deprecated and
>>replaced with a pawn evaluation operand type with a provision for sub-pawn
>>decimal resolution.
>
>This should have used a real value to begin with.
>
>>6. The time control operand type needs to be extended and formalized. PGN has a
>>problem with this as well.
>>
>>7. A formal XML schema could be useful. Likewise for PGN.
>
>The current syntax is not compatible with XML. You could consider heading an
>effort to make a replacement standard interchange format that used XML.
>
>>8. Removal of record length limitations.
>>
>>9. Explicit support for 64 bit integer values (decimal and hexadecimal) as
>>operands.
>>
>>10. Inclusion of program-to-program command protocol opcodes.
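The en passant rule proposed in the FEN item above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `normalize_ep` is hypothetical, the input is assumed to be a well-formed FEN (or four-field EPD) string, and "attacks" is taken literally, so pins and other legality questions are not checked.

```python
def normalize_ep(fen: str) -> str:
    """Clear the en passant field unless a pawn of the side to move
    attacks the target square. Hypothetical helper; pins are NOT checked."""
    fields = fen.split()
    placement, active, ep = fields[0], fields[1], fields[3]
    if ep == "-":
        return fen

    # Expand the piece placement field into {(file, rank): piece letter}.
    board = {}
    for rank_idx, row in enumerate(placement.split("/")):
        rank = 8 - rank_idx
        file_idx = 0
        for ch in row:
            if ch.isdigit():
                file_idx += int(ch)      # run of empty squares
            else:
                board[(file_idx, rank)] = ch
                file_idx += 1

    ep_file = ord(ep[0]) - ord("a")
    ep_rank = int(ep[1])
    # A white pawn attacks the target from one rank below it,
    # a black pawn from one rank above it.
    if active == "w":
        pawn, src_rank = "P", ep_rank - 1
    else:
        pawn, src_rank = "p", ep_rank + 1

    attacked = any(board.get((ep_file + d, src_rank)) == pawn for d in (-1, 1))
    if not attacked:
        fields[3] = "-"
    return " ".join(fields)


after_e4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
capturable = "rnbqkbnr/1pp1pppp/p7/3pP3/8/8/PPPP1PPP/RNBQKBNR w KQkq d6 0 3"
print(normalize_ep(after_e4))    # e3 is cleared: no black pawn on d4 or f4
print(normalize_ep(capturable))  # d6 is kept: the white e5 pawn attacks it
```

With this normalization, two positions that differ only in an unusable e.p. square produce the same string, which is the database and transposition benefit the proposal mentions.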
>>
>>
>>PGN: Again, several areas need re-examination.
>>
>>1. Adding the Broket Form to the movetext. A broket form is a single EPD
>>operation delimited by angle brackets ("brokets"). This is a far better
>>approach to embedding metadata than the current use of comments.
>>
>>2. Deprecation of the use of a period after each White move number. The use of
>>a period here has little, if any, need and just consumes space.
>
>Lots of things consume space, but humans expect the period to be there. Don't
>mess with it. If we were looking for an efficient, compact representation, we
>wouldn't be using text to begin with.

Here I agree. Otherwise let's dump SAN as well. We define a simple canonical
move generation order, and then just store each move as a one-byte index into
that list. A 60-move game would require 120 bytes, total. But it won't be
readable by humans. PGN is basically machine _and_ human readable.

>
>>3. Removal of all mention of "canonical representation". It was an attempt to
>>support matching PGN movetexts based on simple string comparisons. Unneeded.
>>
>>4. Formalization of the PGN tag name set, including any PGN tag names that have
>>become popular "in the wild" and deprecating those which are rarely, if ever,
>>used.
>>
>>5. Formalization of PGN tag value semantics. Part of this includes the use of
>>"*" to indicate an unknown value, just as it already does for a game result.
>
>I should point out here that most data uses a semicolon to separate two values,
>though PGN indicates a colon should be used.
>
>>6. Removal of the binary representation standard. This is unneeded as the use
>>of fast and portable text compression tools is now commonplace.
>>
>>7. Adding some kind of formal way of representing attributes for aggregates of
>>PGN game data.
>
>You've missed a category, and the one that is most contentious: NAG (numeric
>annotation glyphs).
>
>I think the design erred in the way that NAGs were assigned meaning.
>This can be evidenced by the large amount of data that uses NAGs in Informant
>style, e.g. "white has a space advantage" is used irrespective of whose turn it
>is. In practice, the code for this is defined as "space advantage", and the
>"black has a space advantage" might as well not exist. The player whom space
>advantage refers to is invariably the player of the last move before the NAG
>occurs.
>
>In the below, "nobody" is of course probably not true, there's likely SOMEONE
>who uses it, but what I mean is that widespread practice is otherwise.
>
>For instance:
>11 Equal chances, quiet position (=)
>12 Equal chances, active position (=)
>
>Nobody uses $12. $11 is invariably used to refer to any position assessed as
>equal.
>
>7 Forced move
>8 Singular move; no reasonable alternatives
>
>Nobody uses $7.
>
>14 White has a slight advantage (+=)
>15 Black has a slight advantage (=+)
>16 White has a moderate advantage (+/-)
>17 Black has a moderate advantage (-/+)
>18 White has a decisive advantage (+-)
>19 Black has a decisive advantage (-+)
>20 White has a crushing advantage (+-)
>21 Black has a crushing advantage (-+)
>
>Nobody uses $20 or $21.
>
>Better would have been to assign codes for: white, black, slight advantage,
>clear advantage, decisive advantage.
>That's not obvious from the above, but here we see:
>
>24 White has a slight space advantage
>25 Black has a slight space advantage
>26 White has a moderate space advantage
>27 Black has a moderate space advantage
>28 White has a decisive space advantage
>29 Black has a decisive space advantage
>30 White has a slight time (development) advantage
>31 Black has a slight time (development) advantage
>32 White has a moderate time (development) advantage
>33 Black has a moderate time (development) advantage
>34 White has a decisive time (development) advantage
>35 Black has a decisive time (development) advantage
>36 White has the initiative
>37 Black has the initiative
>38 White has a lasting initiative
>39 Black has a lasting initiative
>40 White has the attack
>41 Black has the attack
>
>And so on. Nobody uses the black version of these, just the white.
>Furthermore, nobody uses 24, 28, 30, 34, or 38 either, because they don't map to
>Chess Informant codes, which is what people are used to. In practice, $40 means
>"with the attack" whether the NAG spec says so or not.
>
>
>You should also check http://scid.sourceforge.net/help/NAGs.html, which contains
>the following extensions to represent common symbols:
>140 With the idea ...
>141 Aimed against ...
>142 Better move
>143 Worse move
>144 Equivalent move
>145 Editor's Remark ("RR")
>146 Novelty ("N")
>147 Weak point
>148 Endgame
>149 Line
>150 Diagonal
>and so on, see the URL for the complete list.
>
>There was also some extension to PGN made for clock timing:
>http://www.enpassant.dk/chess/palview/enhancedpgn.htm
>
>Frankly, there are too many separate notations for everything. How many
>acronyms does one need? :-) There's room to tweak the standard, and if that's
>all you want to do, fine and well. But there are other things that make sense
>to do that aren't supported at all currently, and would look like a hack with
>our current notation.
>
>For instance, consider the record of a game where moves were missed up to the
>time control. It should be possible to indicate that a series of moves are
>missing, that the next board position is like so, and continue with a new move
>numbering from there (e.g. it might jump by eight or something.) This would
>also have allowed the representation of games where an illegal move occurred
>(arguably the most important case), representation of non-standard chess games
>in PGN by introducing a board after funny-not-real-chess-castling occurred, or
>introducing a board after a piece was dropped onto the board (e.g. "crazy
>house", like bughouse but for 2 people). This is not an ideal solution for
>everything but it at least would have been an acceptable workaround.
>
>Anyway, my point is that tweaks are one thing, but if you want to do some XML
>thing, then start fresh. I don't think the XML will be as
>human-readable as PGN, though. Tweaking PGN, and using XML to create a database
>interchange format are, I think, two separate tasks.
>
>Dave
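
The one-byte move index idea from the reply above can be sketched roughly as follows. This is a minimal illustration, not a proposed format: the names `encode_game`, `decode_game`, `toy_legal_moves` and `toy_apply` are hypothetical, the "canonical order" is assumed to be a plain sort of the legal move list, and a toy game stands in for a real chess move generator, which the caller would have to supply.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # position/state type
M = TypeVar("M")  # move type (must be sortable)


def encode_game(start: S, moves: Sequence[M],
                legal_moves: Callable[[S], List[M]],
                apply_move: Callable[[S, M], S]) -> bytes:
    """Store each move as its index in the sorted legal move list (one byte per ply)."""
    out = bytearray()
    state = start
    for mv in moves:
        ordered = sorted(legal_moves(state))   # assumed canonical order
        out.append(ordered.index(mv))          # ValueError if illegal or index > 255
        state = apply_move(state, mv)
    return bytes(out)


def decode_game(start: S, data: bytes,
                legal_moves: Callable[[S], List[M]],
                apply_move: Callable[[S, M], S]) -> List[M]:
    """Replay the game by regenerating the same ordered list at every ply."""
    moves = []
    state = start
    for idx in data:
        mv = sorted(legal_moves(state))[idx]
        moves.append(mv)
        state = apply_move(state, mv)
    return moves


# Toy stand-in for a chess move generator: the "position" is the tuple of
# moves still available, and playing a move removes it from the tuple.
def toy_legal_moves(state):
    return list(state)


def toy_apply(state, mv):
    return tuple(m for m in state if m != mv)


start = ("e4", "d4", "c4", "Nf3")
game = ["d4", "Nf3", "c4"]
data = encode_game(start, game, toy_legal_moves, toy_apply)
print(len(data), "bytes for", len(game), "plies")
print(decode_game(start, data, toy_legal_moves, toy_apply))
```

The output is one byte per ply, which matches the 120 bytes quoted for a 60-move game, and it is unreadable without the move generator, which is the trade-off against PGN described above.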
Last modified: Thu, 15 Apr 21 08:11:13 -0700