
 

Background Introduction

 

The Conference on Natural Language Processing and Chinese Computing (NLPCC) is the annual conference of CCF TCCI (Technical Committee of Chinese Information, China Computer Federation). NLPCC has been held successfully in Beijing (2012), Chongqing (2013), Shenzhen (2014), Nanchang (2015), Kunming (2016) and Dalian (2017). This year's NLPCC conference will be held in Hohhot on Aug 26 - 30, 2018.

 

NLPCC 2017 will follow the NLPCC tradition of holding several shared tasks in natural language processing and Chinese computing. This year’s shared tasks focus on both classic problems and newly emerging problems, including Chinese Word Semantic Relation Classification, News Headline Categorization, Single Document Summarization, Emotional Conversation Generation, Open Domain Question Answering, and Social Media User Modeling.

 

Participants from both academia and industry are welcome. Each group can participate in one or more tasks, and members of each group can attend the NLPCC conference to present their techniques and results. Participants will be invited to submit papers to the main conference, and accepted papers will appear in the conference proceedings published by Springer LNCS.

 

Overview

 

Traditional news document summarization techniques have been widely explored at the DUC and TAC conferences, but existing document summarization datasets focus mainly on Western languages, and Chinese news summarization has seldom been explored. In this evaluation task, we aim to investigate single document summarization techniques for automatically generating short summaries of Chinese news articles. We will provide a large dataset for evaluating and comparing different document summarization techniques.

 

The Task

 

The single document summarization task is to automatically generate a short summary for a given Chinese news article; the short summary is used for news browsing and propagation on Toutiao.com. The summary must be shorter than 60 Chinese characters. We will provide a sample/training dataset consisting of a large number of Chinese news articles with reference summaries, together with a large number of news articles without reference summaries (for semi-supervised methods). The test dataset will be provided to the participants later.
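As an illustration of the length constraint only, here is a minimal sketch of a lead-sentence baseline in Python. It is not the organizers' method or an official baseline. The function name `lead_summary`, the choice of sentence-ending punctuation, and the decision to count every character (punctuation included) toward the 60-character limit are all assumptions made for this example.

```python
import re

# From the task description: a summary must be shorter than 60 characters.
# Assumption: every character, punctuation included, counts toward the limit.
MAX_CHARS = 60


def lead_summary(article: str, max_chars: int = MAX_CHARS) -> str:
    """Hypothetical baseline: keep leading sentences while length < max_chars."""
    text = article.strip()
    # Split after Chinese sentence-ending punctuation, keeping the punctuation
    # attached to its sentence (zero-width split, Python 3.7+).
    sentences = [s for s in re.split(r"(?<=[。！？])", text) if s]
    summary = ""
    for sentence in sentences:
        if len(summary) + len(sentence) >= max_chars:
            break
        summary += sentence
    # If even the first sentence is too long, fall back to hard truncation.
    return summary or text[: max_chars - 1]
```

A call such as `lead_summary(article_text)` returns a string that satisfies the length limit, which can serve as a sanity check before trying stronger extractive or abstractive methods.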

 

More Datasets and Competitions

 

For more datasets and competitions, please visit CCF TCCI's dataset and competition page.

NLPCC2018 | Knowledge | 34 teams

Start: 2018-05-07

Final Submissions: 2018-05-17