LingML Overview

The PSML project is part of a general project devoted to developing logical encodings of theoretical objects in linguistic theory in XML. These encodings can then be rendered with standard formatting tools (HTML, LaTeX, PDF, etc.) or be subject to further computational processing.

The idea is to liberate the linguist from the tedium of formatting while at the same time tying our linguistic diagrams and formalisms to clear, testable claims.

If you are interested in participating in this enterprise or would like to contribute resources for any theoretical framework within linguistic theory, see the LingML homepage.

PSML Overview

The XML resources on this page are intended to demonstrate the LingML approach with respect to phrase-structure grammars.

Encoding Phrase-Structure Grammars

Phrase-structure grammars are encoded in XML using the psgrammar.dtd. There is also a sample XML file using this DTD.

The format of the grammar file is fairly simple. First comes a list of rules, where the first rule designates the root element, e.g. S. There can be only one rule for each element, but the right-hand-side expansions can include optional elements.

The grammar file also includes a lexicon of words. So far, the lexicon only includes part-of-speech information.
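
As a rough illustration of this layout, the sketch below builds a small grammar and reads it with Python's standard library. The element and attribute names used here (psgrammar, rule, left, item, optional, lexicon, word, pos) are assumptions made for the example; the actual vocabulary is defined in psgrammar.dtd and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar file; the element names are illustrative only
# and are not taken from psgrammar.dtd.
GRAMMAR_XML = """
<psgrammar>
  <rule left="S"><item>NP</item><item>VP</item></rule>
  <rule left="NP"><item optional="yes">Det</item><item>N</item></rule>
  <rule left="VP"><item>V</item><item optional="yes">NP</item></rule>
  <lexicon>
    <word pos="Det">the</word>
    <word pos="N">dog</word>
    <word pos="N">cat</word>
    <word pos="V">chased</word>
  </lexicon>
</psgrammar>
"""

def load_grammar(xml_text):
    """Return (root_symbol, rules, lexicon) read from the grammar text."""
    doc = ET.fromstring(xml_text)
    rules = {}
    root_symbol = None
    for rule in doc.findall("rule"):
        left = rule.get("left")
        if left in rules:
            # Only one rule is allowed per element.
            raise ValueError(f"more than one rule for {left}")
        if root_symbol is None:
            root_symbol = left  # the first rule designates the root
        # Each right-hand-side item is stored as (category, is_optional).
        rules[left] = [(item.text, item.get("optional") == "yes")
                       for item in rule.findall("item")]
    # The lexicon pairs each word with its part of speech.
    lexicon = {(w.text, w.get("pos")) for w in doc.findall("lexicon/word")}
    return root_symbol, rules, lexicon

root_symbol, rules, lexicon = load_grammar(GRAMMAR_XML)
```

Here the optional Det in the NP rule lets a single rule cover both "the dog" and "dog", which is how one rule per element can still describe several expansions.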

Displaying a Grammar

Grammars can be displayed with the psgrammarhtml.xsl stylesheet.

Checking/Validating a Sentence

One can display and check a sentence parsed with the grammar. This is done in two steps. First, one creates a custom DTD file from the XML grammar file with the psgdtd.xsl stylesheet. A sample DTD file generated from the mygrammar.xml sample XML file is given in the gtest.dtd file. (Remember, you do not write this file yourself; it is generated by the stylesheet.)
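
To make the idea of deriving a DTD from a grammar concrete, here is a minimal sketch in Python. It is not a port of psgdtd.xsl, whose output is not shown here: it simply assumes that each rule becomes one element declaration, that optional right-hand-side items are marked with "?", and that each part of speech holds a word as character data. The function name rules_to_dtd and the data layout are hypothetical.

```python
def rules_to_dtd(rules, parts_of_speech):
    """Build DTD element declarations from phrase-structure rules.

    rules: dict mapping a category to a list of (child, optional) pairs.
    parts_of_speech: categories whose content is a single word.
    """
    lines = []
    for left, right in rules.items():
        # Optional children get the DTD "?" occurrence marker.
        children = ", ".join(child + ("?" if optional else "")
                             for child, optional in right)
        lines.append(f"<!ELEMENT {left} ({children})>")
    for pos in parts_of_speech:
        # Terminals contain only text (the word itself).
        lines.append(f"<!ELEMENT {pos} (#PCDATA)>")
    return "\n".join(lines)

# Hypothetical toy grammar, not the contents of mygrammar.xml.
rules = {
    "S": [("NP", False), ("VP", False)],
    "NP": [("Det", True), ("N", False)],
    "VP": [("V", False), ("NP", True)],
}
dtd_text = rules_to_dtd(rules, ["Det", "N", "V"])
print(dtd_text)
```

Because a DTD content model fixes the order and optionality of child elements, a validating parser can then check a parsed sentence against the grammar without any grammar-specific code.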

Next, one creates an XML file with the sentence parsed according to the grammar. This file should invoke the DTD file you created and refer back to the XML grammar file. This file should also invoke the sentencehtml.xsl stylesheet. A sample file doing all this with respect to the mygrammar.xml file and the gtest.dtd file is given as mysentence.xml.

The sentencehtml.xsl stylesheet automatically checks that each terminal is in the lexicon with the proper part of speech. To check the validity of the tree structure, the document must be explicitly validated with respect to the DTD.
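
The lexicon check can be pictured with the following Python sketch. It assumes, purely for illustration, that each element of the sentence file is named after its category and that each leaf element contains one word; the actual format of mysentence.xml and the logic inside sentencehtml.xsl may differ. The name lexicon_errors is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical parsed sentence: element names are categories and each
# leaf holds one word. This is not the actual mysentence.xml format.
SENTENCE_XML = (
    "<S><NP><Det>the</Det><N>dog</N></NP>"
    "<VP><V>chased</V><NP><N>cat</N></NP></VP></S>"
)

# (word, part of speech) pairs; "cat" is deliberately missing.
LEXICON = {("the", "Det"), ("dog", "N"), ("chased", "V")}

def lexicon_errors(sentence_xml, lexicon):
    """Return (word, tag) pairs for terminals the lexicon does not license."""
    tree = ET.fromstring(sentence_xml)
    errors = []
    for node in tree.iter():
        if len(node) == 0:  # a leaf element is a terminal
            word = (node.text or "").strip()
            if (word, node.tag) not in lexicon:
                errors.append((word, node.tag))
    return errors

errors = lexicon_errors(SENTENCE_XML, LEXICON)
```

This covers only the lexical check. Python's xml.etree module does not validate against a DTD, so the structural check still calls for a validating XML parser.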

Printing a Tree

A tree structure can be printed out using the sentencelatex.xsl stylesheet. This converts the XML sentence file into a LaTeX file, which can then be processed into DVI or PDF. (Note that your LaTeX installation must include the optional free qtree package.)
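
For readers unfamiliar with qtree, the sketch below shows the kind of bracket notation such a conversion produces. It is written in Python rather than XSLT and is not a reproduction of sentencelatex.xsl; it assumes a sentence file whose elements are named after their categories, and the function name to_qtree is hypothetical. Words containing LaTeX special characters are not escaped.

```python
import xml.etree.ElementTree as ET

def to_qtree(node):
    """Render an element and its descendants in qtree bracket notation."""
    if len(node) == 0:
        # Leaf: the category label followed by the word itself.
        return f"[.{node.tag} {(node.text or '').strip()} ]"
    children = " ".join(to_qtree(child) for child in node)
    return f"[.{node.tag} {children} ]"

# Hypothetical sentence file contents.
sentence = ET.fromstring("<S><NP><N>dogs</N></NP><VP><V>bark</V></VP></S>")
latex = "\\Tree " + to_qtree(sentence)
print(latex)  # \Tree [.S [.NP [.N dogs ] ] [.VP [.V bark ] ] ]
```

The resulting line can be placed in a LaTeX document that loads qtree with \usepackage{qtree}.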

Mike Hammond