Research Area:  Machine Learning
The rewriting method for text summarization combines the advantages of extractive and abstractive approaches, improving the conciseness and readability of extractive summaries. Existing rewriting systems take extractive sentences as the only input and rewrite each sentence independently, which may lose critical background knowledge and break the cross-sentence coherence of the summary. To this end, we propose contextualized rewriting, which consumes the entire document and maintains summary coherence by representing extractive sentences as part of the document encoding and introducing group-tags to align the extractive sentences to the summary. We further propose a general framework for rewriting with an external extractor and a joint internal extractor, representing sentence selection as a special token prediction. We demonstrate the framework's effectiveness by implementing three rewriter instances on various pre-trained models. Experiments show that contextualized rewriting significantly outperforms previous non-contextualized rewriting, achieving strong improvements in ROUGE scores over multiple extractors. Empirical results further suggest that joint modeling of sentence selection and rewriting can largely enhance performance.
Keywords:  
Author(s) Name:  Guangsheng Bao, Yue Zhang
Journal name:  IEEE/ACM Transactions on Audio, Speech, and Language Processing
Conference name:  
Publisher name:  IEEE
DOI:  10.1109/TASLP.2023.3268569
Volume Information:  Volume 31, Pages 1624-1635 (2023)
Paper Link:   https://ieeexplore.ieee.org/document/10109120/authors#authors