In his invited lecture on ``Future directions of machine translation'' at COLING-86, Jun-ichi Tsujii discussed the need for machine translation (MT) to take a wider range of factors into account in order to improve translation quality.
``...We have to extract explicitly more kinds of information from source texts than deep case structures and utilize these to compute descriptions of the target sentences.'' [Tsujii 1986]
He described a framework for future MT systems, in which source texts are analysed into ``a set of monolingual factors which collectively determine surface structures of source texts'', and target texts are generated from ``a set of monolingual factors which collectively determine surface structures of target texts''. These factors need to include not only syntactic and semantic factors, but also discourse factors, pragmatic factors, and further ``factors of certain aspects of understanding''. The transfer stage between analysis and generation would need to invoke understanding processes which are required for the specific language pair.
However, he also pointed out that generation of the target texts needs to use other information, which is frequently impossible to obtain from the source text.
``It often happens that to determine target surface expressions requires a set of factors which are not expressed at all in the source language...'' [Tsujii 1986]
Such factors occur particularly frequently in translation from Japanese to English and other European languages, as Japanese does not express definiteness or number of noun phrases, and regularly omits subjects and objects of verbs. I encountered the difficulties caused by these missing factors directly in my own work on English generation in Japanese-English MT systems (Sections 4.1.1 and 4.2.1).
Any approach that includes this wider range of essential factors in the generation stage, following Tsujii's proposed framework, must solve the following three problems: