Multi-View Source Ablation for Faithful Summarization

Shuyang Cao, Liang Ma, Di Lu, Robert L. Logan IV, Joel Tetreault, and Alejandro Jaimes (Findings of EACL 2023)

Abstract:

We present MuFaSSA (Multi-view Faithfulness Scoring via Source Ablation), a metric for evaluating the faithfulness of abstractive summaries and for guiding the training of more faithful summarizers. For evaluation, MuFaSSA first applies different strategies (e.g., masking entity mentions) to remove information from the source document, forming multiple ablated views. The faithfulness of each token in a generated summary is then measured by the difference between its generation probabilities when trained summarizers are given the original document and the ablated document as input. For training, MuFaSSA uses a novel word truncation objective that drops the unfaithful tokens it locates from both the decoder input and output. Agreement with human-annotated faithfulness labels on AggreFact shows that MuFaSSA is comparable to or better than existing metrics built on classifiers or QA models pre-trained on other tasks. In summarization experiments on XSum and CNN/DailyMail, models trained with word truncation using MuFaSSA outperform competitive methods according to both automatic faithfulness metrics and human assessments.
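
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the max-over-views aggregation, the threshold, and the reading that a small probability drop marks a token as unsupported by the source are all assumptions made for illustration. The probabilities are invented; in practice they would come from a trained summarizer run on the original and on each ablated document.

```python
from typing import List, Sequence


def token_faithfulness(
    p_original: Sequence[float],
    p_ablated_views: Sequence[Sequence[float]],
) -> List[float]:
    """Score each summary token by its largest probability drop across views.

    ASSUMPTION: taking the max over views is illustrative only; the abstract
    does not say how the ablated views are combined.
    """
    if not p_ablated_views:
        raise ValueError("need at least one ablated view")
    for view in p_ablated_views:
        if len(view) != len(p_original):
            raise ValueError("each view needs one probability per summary token")
    return [
        max(p_orig - view[i] for view in p_ablated_views)
        for i, p_orig in enumerate(p_original)
    ]


def truncate_unfaithful(
    tokens: Sequence[str], scores: Sequence[float], threshold: float
) -> List[str]:
    """Keep tokens whose score reaches the (hypothetical) threshold.

    ASSUMPTION: a token whose probability barely changes when source
    information is removed is treated as not grounded in the source.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    return [tok for tok, score in zip(tokens, scores) if score >= threshold]


# Invented numbers, for illustration only.
tokens = ["Paris", "hosted", "the", "2020", "summit"]
p_original = [0.9, 0.6, 0.8, 0.3, 0.7]
p_ablated_views = [
    [0.2, 0.5, 0.8, 0.3, 0.4],    # hypothetical view: entity mentions masked
    [0.7, 0.2, 0.75, 0.28, 0.3],  # hypothetical second ablation strategy
]

scores = token_faithfulness(p_original, p_ablated_views)
kept = truncate_unfaithful(tokens, scores, threshold=0.03)
print(kept)
```

With these invented numbers, "2020" keeps almost the same probability under both ablations, so it falls below the threshold and is removed. Building the teacher-forcing decoder input and the training labels from the truncated sequence is one way to drop such tokens from both sides, as the abstract describes.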

Code:

[PDF] [Code]