Friday, August 24, 2012

More on SHORTREF

As I mentioned earlier, I had two "joke" DTDs to demonstrate the use of SHORTREF.  Without further ado, here is the second:

<!-- DTD for UNIX /etc/passwd file: see Notes at end -->

<!ENTITY % acct "user,pwd,uid,gid,gcos,home,shell" >

<!ELEMENT passwd   o o  (account)+ >
<!ELEMENT account  - o  (%acct)    >
<!ELEMENT (%acct)  - o  (#PCDATA)  >

<!-- map colons and line boundaries to appropriate tags  --
  -- cascade of maps to avoid problems with empty fields -->

<!ENTITY   start    "<account><user>"   >
<!SHORTREF passmap  "&#RS;"    start    >
<!USEMAP   passmap             passwd   >

<!ENTITY   s.pwd    STARTTAG   "pwd"    >
<!SHORTREF usermap  ":"        s.pwd    >
<!USEMAP   usermap             user     >

<!ENTITY   s.uid    STARTTAG   "uid"    >
<!SHORTREF pwdmap   ":"        s.uid    >
<!USEMAP   pwdmap              pwd      >

<!ENTITY   s.gid    STARTTAG   "gid"    >
<!SHORTREF uidmap   ":"        s.gid    >
<!USEMAP   uidmap              uid      >

<!ENTITY   s.gcos   STARTTAG   "gcos"   >
<!SHORTREF gidmap   ":"        s.gcos   >
<!USEMAP   gidmap              gid      >

<!ENTITY   s.home   STARTTAG   "home"   >
<!SHORTREF gcosmap  ":"        s.home   >
<!USEMAP   gcosmap             gcos     >

<!ENTITY   s.shell  STARTTAG   "shell"  >
<!SHORTREF homemap  ":"        s.shell  >
<!USEMAP   homemap             home     >

<!ENTITY   end      ENDTAG    "account" >
<!SHORTREF shellmap "&#RE;"   end       >
<!USEMAP   shellmap           shell     >

<!-- NOTES
  -- --
  My first attempt tried to get away with a generic mapping of ":"
  to minimized end tags like this:

        <!ENTITY   start    STARTTAG  "account" >
        <!SHORTREF passmap  "&#RS;"    start    >
        <!USEMAP   passmap             passwd   >

        <!ENTITY   end      ENDTAG    "account" >
        <!ENTITY   delim    ENDTAG    ""        >
        <!SHORTREF acctmap  "&#RE;"   end
                            ":"       delim     >
        <!USEMAP   acctmap            account   >

  This works only so long as account lines don't have empty fields.
  However, a number of system accounts (bin, sys, etc.) don't have
  SHELLs assigned. The problem here is Clause 7.3.1.1:

     The start-tag can be omitted if the element is a contextually
     required element and any other elements that could occur are
     contextually optional elements, except if
       a) the element has a required attribute or declared content;
       or
       b) the content of the instance of the element is empty.

  Hence the series of shortrefs mapping ":" to various start-tags.
  -- --
  Copyright 1994-8  Arjun Ray
-->

This was also an attempt to make an SGML instance out of /etc/passwd, using a sequence of shortref maps, each triggered by the current element context. The DTD also uses another feature of entity declarations: the STARTTAG and ENDTAG keywords identify the replacement text directly as markup, so the parser doesn't even have to go looking for tags.
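
To make the cascade concrete, here is a rough Python emulation of what the maps do to a single line. It is only a sketch and not an SGML parser: the names COLON_MAP and expand_line are made up for illustration, and it assumes each line has all seven colon-separated fields, so that the record end is seen in the shell context.

```python
# Hypothetical emulation of the shortref cascade in the DTD above.
# This is NOT an SGML parser; it only mimics the map switching.

# In each element context, ":" expands to the start-tag of the next field
# (usermap, pwdmap, uidmap, gidmap, gcosmap, homemap in the DTD).
COLON_MAP = {
    "user": "pwd",
    "pwd": "uid",
    "uid": "gid",
    "gid": "gcos",
    "gcos": "home",
    "home": "shell",
}


def expand_line(line: str) -> str:
    """Expand one /etc/passwd line roughly as the shortref maps would."""
    out = ["<account><user>"]  # record start in passwd context -> &start;
    context = "user"
    for ch in line:
        if ch == ":" and context in COLON_MAP:
            # The current map turns ":" into the next start-tag,
            # which in turn makes the next map current.
            context = COLON_MAP[context]
            out.append(f"<{context}>")
        else:
            # shellmap has no mapping for ":", so it stays as data there.
            out.append(ch)
    if context != "shell":
        # Assumption of this sketch: only complete seven-field lines.
        raise ValueError("expected seven colon-separated fields")
    out.append("</account>")  # record end in shell context -> &end;
    return "".join(out)


if __name__ == "__main__":
    print(expand_line("root:x:0:0:root:/root:/bin/sh"))
    print(expand_line("bin:x:2:2:bin:/bin:"))
```

The second call has an empty shell field, yet every field still gets an explicit start-tag, which is the point of using one map per element instead of the generic mapping described in the Notes.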

But SHORTREF isn't only for jokes.  As shorthand for otherwise verbose markup, it helps authors, though only those already familiar with the DTD their document is expected to conform to.  The approach taken in this article goes one step further: it considers the possibility of short references constituting an alternate syntax, and demonstrates the viability of this through RELAX NG (RNG) validation.

Unfortunately, this will not go down well with tag-heads, those who can't sleep at night without their daily diet of pointy brackets.  Anything that threatens to take their beloved tags away is anathema.
