This is the timetable of the general research seminar in language technology. Presentations will be given primarily by doctoral students, but also by other researchers in language technology, including external guests.
Abstract: There are good reasons to study probabilistic parsing in the more general case where trees are discontinuous, containing long-distance dependencies and dislocations that break the constraints of projective constituents or dependencies. However, probabilistic models for discontinuous parsing are more challenging to design than those for continuous parsing: shift-reduce parsing, for example, has no principled way to compute the probability of a parse.
We give the first model that defines a plausible probability distribution over the space of all discontinuous parses. A latent-variable probabilistic grammar is generalized directly to discontinuous parsing. The chosen formalization covers both constituent parsing and dependency parsing, and can be used for practical probabilistic reranking of parses.
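Probabilistic reranking of this kind can be sketched in a few lines: each candidate parse produced by a base parser is scored under the probability model, and the highest-scoring candidate wins. The candidate names and log-probabilities below are illustrative assumptions, not values from the talk.

```python
# Hypothetical candidate parses for one sentence, each paired with a
# log-probability assigned by an (assumed) latent-variable grammar model.
candidates = [
    ("parse_projective", -12.4),
    ("parse_discontinuous", -10.1),  # long-distance dependency attached directly
    ("parse_alternative", -11.7),
]

def rerank(parses):
    """Return the candidate parses sorted best-first by model log-probability."""
    return sorted(parses, key=lambda p: p[1], reverse=True)

best_parse, best_logprob = rerank(candidates)[0]
```

The same scheme works whether the candidates are constituent trees or dependency graphs; only the scoring model changes.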
The Information Dynamics of Thinking (IDyOT) model brings together symbolic and non-symbolic cognitive architectures by combining sequential modelling with hierarchical symbolic memory, in which symbols are grounded by reference to their perceptual correlates. This is achieved by a process of chunking, based on boundary entropy, in which an input signal is segmented into chunks, each of which corresponds with a single symbol in a higher-level model. Each chunk corresponds with a temporal trajectory in the complex Hilbert space given by a spectral transformation of its signal; each symbol above each chunk corresponds with a point in a higher space, which is in turn a spectral representation of the lower space. Norms in these spaces admit measures of similarity, which allow symbols to be grouped into categories, so that similar chunks are associated with the same symbol. This chunking process recurses “up” IDyOT’s memory, so that representations become more and more abstract.
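The idea of segmenting by boundary entropy can be sketched with a toy first-order model: estimate the entropy of the next-symbol distribution after each symbol, and close a chunk where that uncertainty peaks. This is only a minimal illustration of the principle; IDyOT's actual memory and spectral machinery are far richer.

```python
from collections import Counter, defaultdict
from math import log2

def next_symbol_entropy(sequence):
    """Entropy of the next-symbol distribution conditioned on each symbol,
    estimated from bigram counts over the sequence itself."""
    follow = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        follow[a][b] += 1
    entropy = {}
    for sym, counts in follow.items():
        total = sum(counts.values())
        entropy[sym] = -sum((c / total) * log2(c / total) for c in counts.values())
    return entropy

def chunk(sequence, entropy):
    """Close a chunk after a symbol whose boundary entropy exceeds the next
    symbol's, i.e. where uncertainty about the continuation is locally maximal."""
    chunks, current = [], []
    for i, sym in enumerate(sequence):
        current.append(sym)
        h_here = entropy.get(sym, 0.0)
        h_next = entropy.get(sequence[i + 1], 0.0) if i + 1 < len(sequence) else 0.0
        if h_here > h_next:  # entropy falls after this point: place a boundary
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks
```

On a repetitive sequence such as `"abcabcabcabd"`, uncertainty is highest after `b` (which can continue with `c` or `d`), so boundaries fall there and the recurring pattern is chunked out.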
It is possible to construct a Markov model along a layer of this memory, or up or down between layers. Thus, predictions may be made from any part of the structure, at any level of abstraction, and it is in this capacity that IDyOT is claimed to model creativity, at multiple levels, from the construction of sentences in everyday speech to the improvisation of musical melodies.
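A Markov model along one layer amounts to transition probabilities over that layer's symbol sequence. The sketch below, a deliberately simple first-order version, shows how such a model can be trained and used for prediction; the data and function names are illustrative, not from the talk.

```python
from collections import Counter, defaultdict

def train_markov(sequence):
    """First-order Markov model over one layer's symbols:
    bigram counts normalised into transition probabilities."""
    counts = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    return {a: {b: c / sum(cs.values()) for b, c in cs.items()}
            for a, cs in counts.items()}

def predict(model, symbol):
    """Most probable continuation from the current symbol (None if unseen)."""
    dist = model.get(symbol, {})
    return max(dist, key=dist.get) if dist else None
```

In the full architecture one such model per layer, plus links between layers, would let prediction start from whichever level of abstraction is most informative.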
IDyOT’s learning process is a kind of deep learning, but it differs from the more familiar neural network formulation because it includes symbols that are explicitly grounded in the learned input, and its answers will therefore be explicable in these terms.
In this talk, I will explain and motivate the design of IDyOT with reference to various different aspects of music, language and speech processing, and to animal behaviour.