Computer Chess Club Archives


Subject: Ten years later: revising EPD/FEN/PGN

Author: Steven Edwards

Date: 15:33:22 09/09/03


Greetings:

It was some ten years ago, back in 1994, that the CDS (Chess Data Standards:
SAN, FEN, EPD, and PGN) took their current form after months of newsgroup
discussions and megabytes of email exchange.  Prior to the adoption of the CDS,
the exchange of chess data and the interoperability of chess software were
greatly impeded by the lack of standards and the prevalence of secret and
non-free proprietary formats.

Although the CDS are now used by (nearly) all archives and all programs, this
does not mean that the current form of the standards is the best.  Certainly
there are places where extensions can be made and where unneeded items can be
deprecated and eventually deleted.  To make a revision work, input is required
from the experienced chess programmer community, just as it was ten years ago.
Therefore, I once again issue the call for online (here in CCC) and email
discussion on the CDS topic.  (Including "CDS" in the subject line helps in this
process.)

Here are some of my ideas:

SAN: Looks fine; I don't see any need for changes.  There is the possibility of
extending the piece identifier letter semantics to accommodate heterodox /
unorthodox chess.


FEN: One deficiency exists for which I take responsibility: the en passant
target square semantics should indicate a non-null value only if an active color
pawn attacks the passed-over target square.  A change here will improve position
database operation and can also have a positive effect on internal transposition
management.  Also, there is the possibility of extending the castling
availability semantics to accommodate heterodox / unorthodox chess.
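
As an illustration of the proposed en passant rule, here is a minimal Python
sketch.  The function name normalize_en_passant is hypothetical and not part of
any standard.  It assumes a well-formed six-field FEN string and only checks
whether an active color pawn stands next to the pawn that just advanced; it
does not test whether the capture would be fully legal (for example, a pinned
pawn still counts as attacking).

```python
def normalize_en_passant(fen: str) -> str:
    """Clear the en passant field unless an active color pawn attacks it.

    Hypothetical helper illustrating the proposed FEN semantics.
    Assumes a well-formed FEN record; legality (pins, checks) is ignored.
    """
    fields = fen.split()
    placement, active, ep = fields[0], fields[1], fields[3]
    if ep == "-":
        return fen

    # Expand the piece placement into 8 rows of 8 squares, rank 8 first.
    rows = []
    for rank_text in placement.split("/"):
        row = []
        for ch in rank_text:
            if ch.isdigit():
                row.extend([None] * int(ch))
            else:
                row.append(ch)
        rows.append(row)

    file_index = ord(ep[0]) - ord("a")
    target_rank = int(ep[1])  # 3 or 6 in orthodox chess

    # A capturing pawn stands one rank behind the target square, as seen
    # from the active side, on an adjacent file.
    if active == "w":
        pawn, pawn_rank = "P", target_rank - 1
    else:
        pawn, pawn_rank = "p", target_rank + 1

    row = rows[8 - pawn_rank]
    attacked = any(
        0 <= f < 8 and row[f] == pawn
        for f in (file_index - 1, file_index + 1)
    )
    if not attacked:
        fields[3] = "-"
    return " ".join(fields)
```

Under this rule the position after 1. e4 from the initial array would carry
"-" instead of "e3", since no Black pawn can capture there, so two records for
the same position compare equal regardless of how the position was reached.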


EPD: There are several items of concern:

1. Currently, the first four EPD fields match the first four FEN fields.  This
was done to save space with the idea that the "hmvc" and "fmvn" EPD operations
could provide the extra information if needed.  This is rather arbitrary and I
suggest that every EPD record have the same first six fields as does a FEN
record.  Alternatively, EPD opcodes can be defined for the (current) first four
EPD fields and so there would be NO required fields at the start of an EPD
record.

2. The representation of string/symbolic data in operands is inconsistent with
respect to the need for quoting.  This is also the case with PGN tag values.  A
uniform rule is needed for both.

3. The representation of time and date value operands needs to be formalized,
along with a provision for sub-second decimal resolution.

4. The centipawn evaluation operand type needs a mate score indication
correction.

5. The centipawn evaluation operand type probably needs to be deprecated and
replaced with a pawn evaluation operand type with a provision for sub-pawn
decimal resolution.

6. The time control operand type needs to be extended and formalized.  PGN has a
problem with this as well.

7. A formal XML schema could be useful.  Likewise for PGN.

8. Removal of record length limitations.

9. Explicit support for 64 bit integer values (decimal and hexadecimal) as
operands.

10. Inclusion of program-to-program command protocol opcodes.
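
To make item 1 concrete, here is a minimal Python sketch of how a current
four-field EPD record relates to a six-field FEN record through the "hmvc" and
"fmvn" operations.  The function name epd_to_fen is hypothetical, and the
defaults of 0 and 1 for a missing "hmvc" or "fmvn" are assumptions made for
this sketch.  Operations are split on semicolons, so a quoted string operand
that itself contains a semicolon is not handled; that shortcut is one more
reason a uniform quoting rule (item 2) would help.

```python
def epd_to_fen(epd: str) -> str:
    """Build a six-field FEN record from a four-field EPD record.

    Hypothetical helper.  Assumes well-formed input and assumes defaults
    of 0 (halfmove clock) and 1 (fullmove number) when the "hmvc" or
    "fmvn" operations are absent.
    """
    parts = epd.split(None, 4)
    position_fields = parts[:4]
    operations_text = parts[4] if len(parts) > 4 else ""

    # Collect operations as opcode -> list of operand tokens.
    # Simplification: quoted operands containing ';' or spaces are not
    # parsed correctly by this naive split.
    operations = {}
    for operation in operations_text.split(";"):
        tokens = operation.split()
        if tokens:
            operations[tokens[0]] = tokens[1:]

    halfmove_clock = (operations.get("hmvc") or ["0"])[0]
    fullmove_number = (operations.get("fmvn") or ["1"])[0]
    return " ".join(position_fields + [halfmove_clock, fullmove_number])
```

If every EPD record instead began with the same six fields as FEN, this
conversion would reduce to taking the first six fields of the record.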


PGN:  Again, several areas need re-examination.

1. Adding the Broket Form to the movetext.  A broket form is a single EPD
operation delimited by angle brackets ("brokets").  This is a far better
approach to embedding metadata than the current use of comments.

2. Deprecation of the use of a period after each White move number.  The use of
a period here has little, if any, need and just consumes space.

3. Removal of all mention of "canonical representation".  It was an attempt to
support matching PGN movetexts based on simple string comparisons.  Unneeded.

4. Formalization of the PGN tag name set, including any PGN tag names that have
become popular "in the wild" and deprecating those which are rarely, if ever,
used.

5. Formalization of PGN tag value semantics.  Part of this includes the use of
"*" to indicate an unknown value, just as it already does for a game result.

6. Removal of the binary representation standard.  This is unneeded as the use
of fast and portable text compression tools is now commonplace.

7. Adding some kind of formal way of representing attributes for aggregates of
PGN game data.
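
To make the broket form of item 1 concrete, here is a minimal Python sketch
that pulls such forms out of a movetext string.  The function name
split_brokets and the sample movetext are hypothetical; the exact syntax of a
broket form has not been fixed, so this assumes only what is stated above: one
EPD operation (an opcode followed by optional operands) between "<" and ">".
Angle brackets appearing inside brace comments or quoted strings are not
treated specially.

```python
import re

# One broket form: "<", any text without angle brackets, ">".
BROKET = re.compile(r"<([^<>]*)>")


def split_brokets(movetext: str):
    """Separate broket forms from the rest of a PGN movetext.

    Hypothetical helper.  Returns the movetext with broket forms removed
    and a list of (opcode, operand_text) pairs in order of appearance.
    """
    operations = []
    for match in BROKET.finditer(movetext):
        tokens = match.group(1).split(None, 1)
        if tokens:
            opcode = tokens[0]
            operand_text = tokens[1].strip() if len(tokens) > 1 else ""
            operations.append((opcode, operand_text))

    # Drop the broket forms and collapse the leftover whitespace.
    stripped = " ".join(BROKET.sub(" ", movetext).split())
    return stripped, operations
```

For example, split_brokets("1. e4 <ce 25> e5 2. Nf3 <hmvc 1> *") yields the
plain movetext "1. e4 e5 2. Nf3 *" together with the operations ("ce", "25")
and ("hmvc", "1"), which a reader can then hand to its existing EPD operation
code instead of parsing free-form comment text.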


