Semi-Supervised Extractive Question Summarization Using Question-Answer Pairs
Kazuya Machida (Tokyo Tech), Tatsuya Ishigaki (Tokyo Tech), Hayato Kobayashi, Hiroya Takamura (AIST/Tokyo Tech), Manabu Okumura (Tokyo Tech)
The 42nd European Conference on Information Retrieval (ECIR 2020), 2020/4
Natural Language Processing, Information Retrieval, Machine Learning
- Neural extractive summarization methods often require much labeled training data, for which headlines or lead summaries of news articles can sometimes be used. Such directly useful summaries are not always available, however, especially for user-generated content, such as questions posted on community question answering services. In this paper, we address an extractive summarization (i.e., headline extraction) task for such questions as a case study and consider how to alleviate the problem by using question-answer pairs, instead of missing-headline pairs. To this end, we propose a framework to examine how to use such unlabeled paired data from the viewpoint of training methods. Experimental results show that multi-task training performs well with undersampling and distant supervision.