ag2at.txt

The task is, essentially, to take a file in AG format, e.g. /web/ling/kieliteknologia/tutkimus/interact/AT/dialogi80ag.xml, and transform it first to a file in AT format, e.g. dialogi80at.xml, and then further to DiAT format, e.g. dialogi80.xml.

A file in AG format is basically a chart: a list of edges of the form "from anchor a to anchor b there is an edge with label c". The task is to massage the list back into a form where the edge text runs in the order determined by the anchors and the tags are nested in some suitable way. For instance, if the chart says:

  "from anchor 0 to anchor 1 there is a FOO edge with text foo".
  "from anchor 1 to anchor 2 there is a BAR edge with text bar".
  "from anchor 2 to anchor 3 there is a BAZ edge with text baz".
  "from anchor 0 to anchor 2 there is a FOOBAR edge with text foo bar".
  "from anchor 1 to anchor 3 there is a BARBAZ edge with text bar baz".

this should go into something like

  foo bar baz

where the overlap of the FOOBAR and BARBAZ edges is handled by splitting one of them into pieces which are coindexed with the "set" feature.

The conversion is not going to be unique, in that there are many ways to do the splits. The conversions should be conservative in the sense that iterating them does not lose information: the AG files generated in each back-and-forth conversion should be equivalent. (Whether they should be identical bears considering.)

There are complications: what if there are alternative texts going on at the same time (e.g. speaker overlap)? But that only makes the problem interesting :).
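
The FOO/BAR/BAZ example above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the actual converter: the tuple encoding of edges, the function names (split_crossing, render, merge_sets) and the output tag syntax are all hypothetical, anchors are assumed to be consecutive integers, and the text is assumed to come from the edges that span exactly one anchor step. Only the idea of coindexing the split pieces with a "set" feature comes from the description above. Speaker overlap is not handled.

```python
from itertools import count

# Hypothetical chart encoding: (from_anchor, to_anchor, label, text).
CHART = [
    (0, 1, "FOO", "foo"),
    (1, 2, "BAR", "bar"),
    (2, 3, "BAZ", "baz"),
    (0, 2, "FOOBAR", "foo bar"),
    (1, 3, "BARBAZ", "bar baz"),
]


def crosses(a, b):
    """True if spans a and b overlap without one containing the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def split_crossing(chart):
    """Return (start, end, label, set_id) pieces that nest properly."""
    set_ids = count(1)
    accepted = []
    # At equal start, longer edges come first, so containment never conflicts.
    for start, end, label, _text in sorted(chart, key=lambda e: (e[0], -e[1])):
        pending, pieces = [(start, end)], []
        while pending:
            s, e = pending.pop()
            clash = next((p for p in accepted if crosses((s, e), p[:2])), None)
            if clash is None:
                pieces.append((s, e))
            else:
                # Cut at the boundary of the clashing span that lies inside (s, e).
                cut = clash[1] if s < clash[1] < e else clash[0]
                pending += [(s, cut), (cut, e)]
        # Only edges that were actually split get a shared "set" index.
        set_id = next(set_ids) if len(pieces) > 1 else None
        accepted += [(s, e, label, set_id) for s, e in sorted(pieces)]
    return accepted


def render(pieces, chart):
    """Write properly nested pieces as tags (hypothetical output syntax)."""
    tokens = {s: text for s, e, _label, text in chart if e - s == 1}
    # Outermost first; the stable sort keeps equal spans in accepted order.
    order = sorted(pieces, key=lambda p: (p[0], -p[1]))
    first = min(p[0] for p in pieces)
    last = max(p[1] for p in pieces)
    out, stack = [], []
    for anchor in range(first, last + 1):
        while stack and stack[-1][1] == anchor:
            out.append(f"</{stack.pop()[2]}>")
        for piece in order:
            if piece[0] == anchor:
                attr = f' set="{piece[3]}"' if piece[3] is not None else ""
                out.append(f"<{piece[2]}{attr}>")
                stack.append(piece)
        if anchor in tokens:
            out.append(tokens[anchor])
    return "".join(out)


def merge_sets(pieces):
    """Undo the split: rejoin coindexed pieces into (start, end, label) edges."""
    merged = {}
    for s, e, label, set_id in pieces:
        key = (label, set_id) if set_id is not None else (label, s, e)
        lo, hi = merged.get(key, (s, e))
        merged[key] = (min(lo, s), max(hi, e))
    return sorted((s, e, key[0]) for key, (s, e) in merged.items())
```

With the chart above, render(split_crossing(CHART), CHART) keeps FOOBAR whole and splits BARBAZ at anchor 2 into two pieces that share set="1", and merge_sets recovers the original five spans, which is the kind of equivalence the back-and-forth conversion is supposed to preserve. Splitting FOOBAR instead would be equally valid; this sketch simply splits whichever edge starts later.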