J4 ›› 2012, Vol. 34 ›› Issue (4): 162-166.
• Articles •
NI Weijian, LIU Tong, ZENG Qingtian, ZHAO Hua, TANG Jianyu
Abstract:
Machine learning based approaches to automatic document summarization have drawn increasing attention in the natural language processing literature. However, none of them takes into account the imbalanced class distribution in automatic document summarization, i.e., the number of sentences in the summary is far smaller than the number of sentences in the whole document. A data distribution this imbalanced degrades the effectiveness of conventional machine learning algorithms. This paper addresses automatic document summarization from the perspective of imbalanced classification and proposes two learning strategies that deal effectively with the highly imbalanced data in automatic single-document summarization. Experimental results on the DUC 2001 data set show significant performance improvements for our approaches in terms of F1 and ROUGE-2.
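To illustrate why class imbalance matters for sentence-level summarization, the following minimal Python sketch compares accuracy and F1 on a hypothetical document. The 5-in-100 ratio of summary sentences, the two toy predictors, and the function names are assumptions made only for illustration; they are not taken from the paper or from the DUC 2001 data.

```python
def accuracy(y_true, y_pred):
    """Fraction of sentences whose label is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive (summary) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical document: 100 sentences, only 5 belong to the summary (label 1).
y_true = [1] * 5 + [0] * 95

# Predictor A: labels every sentence as non-summary.
pred_all_negative = [0] * 100

# Predictor B: finds 3 of the 5 summary sentences, with 4 false positives.
pred_selective = [1] * 3 + [0] * 2 + [1] * 4 + [0] * 91

acc_a = accuracy(y_true, pred_all_negative)
f1_a = precision_recall_f1(y_true, pred_all_negative)[2]
acc_b = accuracy(y_true, pred_selective)
f1_b = precision_recall_f1(y_true, pred_selective)[2]

print(f"A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

In this toy setting the predictor that never selects a summary sentence has the higher accuracy but an F1 of zero, which is one reason a learner that optimizes plain classification error can perform poorly on the minority (summary) class.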
Key words: imbalanced classification; automatic document summarization; SVM; margin; bagging
NI Weijian, LIU Tong, ZENG Qingtian, ZHAO Hua, TANG Jianyu. Imbalanced Classification Approaches to Automatic Single-Document Summarization[J]. J4, 2012, 34(4): 162-166.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2012/V34/I4/162