Aspects of Multilingual News Summarisation

Abstract: 

In this book chapter, we discuss several pertinent aspects of an automatic system that generates summaries in multiple languages for sets of topic-related news articles gathered by news aggregation systems (multilingual multi-document summarisation). The discussion follows a framework based on Latent Semantic Analysis (LSA) because LSA has been shown to be a high-performing method across many different languages. Starting from a sentence-extractive approach, we show how domain-specific aspects can be used and how a compression and paraphrasing method can be plugged in. We also discuss the challenging problem of evaluating summarisation in different languages. In particular, we describe two approaches: the first uses a parallel corpus and the second uses statistical machine translation.

Authors: 
STEINBERGER Ralf, TANEV Hristo, ZAVARELLA Vanni, STEINBERGER Josef, TURCHI Marco
Publication Year: 
2014
Appears in Collections: 
Institute for the Protection and Security of the Citizen
Publisher: 
IGI Global
ISBN: 
978-1-4666-5019-0 (print), 978-1-4666-5020-6 (online)
ISSN: 
2327-1981 (print), 2327-199X (online)
Citation: 
Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding, pp. 277-294