Teaching Natural Language Generation in an XML Framework
--------------------------------------------------------
Graham Wilcock
University of Helsinki
00014 Helsinki, Finland
graham.wilcock@helsinki.fi
1. Introduction
XML-based techniques for natural language generation are described by
Wilcock (2001), based on practical experience in developing an
XML-based generation component for a spoken dialogue system (Jokinen
and Wilcock, 2001). The basic approach is to construct a pipeline of
XSLT transformations corresponding to the different NLG processing
tasks. Wilcock (2002) gives a tutorial introduction to this approach
and to the basic NLG tasks. A related software demo showing NLG
integrated with XML web technology is described by Wilcock (2003).
Ongoing work aims at adapting this framework for use in teaching
natural language generation courses for language technology students.
The description here is largely taken from Wilcock (2003).
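The pipeline idea can be illustrated with a minimal sketch. It is written in Python rather than XSLT, and the stage names and the add_stage_marker helper are hypothetical stand-ins for real transformations, not part of the system described here. Each stage takes an XML tree and returns a tree, so the stages can simply be chained.

```python
from functools import reduce
import xml.etree.ElementTree as ET


def add_stage_marker(name):
    """Return a toy transformation that records its stage name on the root.

    Hypothetical stand-in for a real XSLT transformation.
    """
    def transform(tree):
        done = tree.get("stages", "")
        tree.set("stages", (done + " " + name).strip())
        return tree
    return transform


def run_pipeline(tree, stages):
    """Apply each transformation in order, feeding each output to the next."""
    return reduce(lambda current, stage: stage(current), stages, tree)


stage_names = ("text-planning", "microplanning", "realization")
stages = [add_stage_marker(name) for name in stage_names]
result = run_pipeline(ET.Element("agenda"), stages)
print(result.get("stages"))
```

Because every stage has the same tree-in, tree-out shape, a stage can be added, removed or traced without touching the others.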
2. The Demonstration System
The demonstration system performs bilingual generation of responses,
in Finnish and English, as part of a Helsinki bus timetable enquiry
system. The responses depend on the dialogue context and can vary
from full sentences to short elliptical phrases. The system
demonstrates only generation, without speech recognition,
language understanding or dialogue management.
2.1 Input Agenda
The starting point is an agenda, a set of concepts marked with Topic
and NewInfo tags (Jokinen and Wilcock, 2001). A number of different
starting agendas are provided, and their contents can be changed as
desired.
[Figure 1 content: route (new-info) 81; time (new-info) 11:37; place depart (topic) herttoniemenranta]
Figure 1: An Agenda
The agenda is represented as an annotation graph. Figure 1 shows an
agenda for a response following the enquiry
When does the next bus leave from Herttoniemenranta?
The departure-place is marked as topic, and the route-number and
departure-time are marked as new information. The response (generated
step-by-step in the next sections) will be
Number 81 leaves from there at 11:37.
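As a rough illustration, the agenda for this example could be encoded and inspected as follows. The element and attribute names (agenda, concept, name, type, status) are assumptions made for this sketch. The actual annotation graph format is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of the agenda of Figure 1.
AGENDA_XML = """
<agenda>
  <concept name="route" status="new-info">81</concept>
  <concept name="time" status="new-info">11:37</concept>
  <concept name="place" type="depart" status="topic">herttoniemenranta</concept>
</agenda>
"""


def concepts_by_status(agenda):
    """Group (name, value) pairs by their topic / new-info marking."""
    groups = {}
    for concept in agenda.iter("concept"):
        pair = (concept.get("name"), concept.text)
        groups.setdefault(concept.get("status"), []).append(pair)
    return groups


agenda = ET.fromstring(AGENDA_XML)
groups = concepts_by_status(agenda)
print(groups["topic"])
print(groups["new-info"])
```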
2.2 Text Planning
In text planning, the content determination stage extracts the
concepts from the annotation graph. The discourse structuring stage
creates a text plan tree (here called a response plan) using the form
of template-based generation described by Wilcock (2001).
[Figure 2 content: message NumFromDepMsg; bus-number 81; departure-time 11:37; departure-place herttoniemenranta]
Figure 2: A Text Plan
Text plans are XML tree structures containing variable slots, filled
in later by the microplanning stages. In this example there is only
one message, typical in spoken dialogue responses. In multi-paragraph
text generation there are large numbers of messages. Note that
departure-place is Topic, while bus-number and departure-time are NewInfo.
In the teaching system, tracing can be switched on so the text plan is
displayed.
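A minimal sketch of template-based plan construction is given below, in Python rather than XSLT. The message type and slot names follow Figure 2. The response-plan and message element names, the status attribute and the build_response_plan function are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Slot order of the (assumed) template for a NumFromDepMsg message.
TEMPLATE_SLOTS = ("bus-number", "departure-time", "departure-place")


def build_response_plan(concepts):
    """Fill the template slots from a dict of name -> (value, status)."""
    plan = ET.Element("response-plan")
    message = ET.SubElement(plan, "message", {"type": "NumFromDepMsg"})
    for slot in TEMPLATE_SLOTS:
        value, status = concepts[slot]
        node = ET.SubElement(message, slot, {"status": status})
        node.text = value
    return plan


concepts = {
    "bus-number": ("81", "NewInfo"),
    "departure-time": ("11:37", "NewInfo"),
    "departure-place": ("herttoniemenranta", "Topic"),
}
plan = build_response_plan(concepts)
print(ET.tostring(plan, encoding="unicode"))
```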
2.3 Microplanning
The microplanning stages are a sequence of XSLT transformations
(Wilcock, 2001). The text plan tree is replaced by a text
specification tree, here called a response specification.
At later stages of the pipeline, further information is added to the
tree or nodes in the tree are replaced by new nodes. In the referring
expression stage of microplanning, domain concepts are replaced with
linguistic referring expressions.
[Figure 3 content: leave; number 81; from-place, from; at-time, at]
Figure 3: A Text Specification
In Figure 3 the concepts of Figure 2 have been replaced by linguistic
specifications. In the lexicalization stage, the words are inserted
with their dependents, using a form of head-dependency structure.
In the referring expressions stage, the departure-place concept which
was marked as Topic in Figure 2 has been pronominalized as there. If
the same departure-place concept were marked as NewInfo, it would be
realized by the actual text value of the departure-place.
In the teaching system, tracing can be switched on so the text
specification is displayed.
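The Topic/NewInfo rule for the departure-place can be sketched as follows. This is a Python illustration under assumed names (the status attribute and the refer_to_place function are hypothetical), not the XSLT used in the system.

```python
import xml.etree.ElementTree as ET


def refer_to_place(place_node):
    """Choose a referring expression for a departure-place concept.

    A Topic concept is already known to the user, so it is pronominalized.
    A NewInfo concept is realized by its actual text value.
    """
    if place_node.get("status") == "Topic":
        return "there"
    return place_node.text


topic_place = ET.fromstring(
    '<departure-place status="Topic">herttoniemenranta</departure-place>')
new_place = ET.fromstring(
    '<departure-place status="NewInfo">herttoniemenranta</departure-place>')

print(refer_to_place(topic_place))
print(refer_to_place(new_place))
```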
2.4 Realization
The realization stage produces Java Speech Markup Language (JSML).
[Figure 4 content, markup tags lost: number 81 leaves from there at 11:37]
Figure 4: Speech Markup
The words of Figure 3 provide the main content. The JSML markup
marks sentence boundaries and tells the speech synthesizer that "81"
should be pronounced "eighty-one", not "eight one". The JSML is
passed to the FreeTTS speech synthesizer
which produces the spoken response, in this case
Number 81 leaves from there at 11:37.
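The shape of the realization output can be sketched as below. The element names sentence and say-as and the class values are placeholders chosen for this illustration and should not be read as the actual JSML tag set. The inflected form "leaves" is hard-coded instead of being produced by a morphology stage.

```python
import xml.etree.ElementTree as ET


def realize(bus_number, place_phrase, time):
    """Linearize the words and wrap the number and time in markup.

    Element and attribute names are placeholders, not confirmed JSML.
    """
    sentence = ET.Element("sentence")
    sentence.text = "number "
    number = ET.SubElement(sentence, "say-as", {"class": "number"})
    number.text = bus_number
    number.tail = " leaves from " + place_phrase + " at "
    clock = ET.SubElement(sentence, "say-as", {"class": "time"})
    clock.text = time
    return sentence


markup = realize("81", "there", "11:37")
spoken_text = "".join(markup.itertext())
print(ET.tostring(markup, encoding="unicode"))
print(spoken_text)
```

Marking up the number and the time, rather than the whole sentence, leaves the choice of pronunciation for those items to the synthesizer.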
References
K Jokinen and G Wilcock. 2001. Confidence-based adaptivity in response
generation for a spoken dialogue system. SIGdial-2001, Aalborg, Denmark.
G Wilcock. 2001. Pipelines, templates and transformations: XML for
Natural Language Generation. 1st NLP and XML Workshop, Tokyo.
G Wilcock. 2002. XML-based Natural Language Generation. XML Finland
2002, Helsinki.
G Wilcock. 2003. Integrating Natural Language Generation with XML
Web Technology. EACL-2003, Budapest.