

Submission Files


Each team may submit forecast results for the validation set at any time during the contest. A submission is a zip file containing multiple txt files. Each txt file should be named after one article's ID and contain the generated title for that article. A sample submission file is available on the data webpage.


Important Note:

Files must not contain any extra spaces.

Please add an empty line at the end of each file.

All files must be encoded in UTF-8 without a BOM.
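The packaging rules above can be sketched in Python as follows. This is a minimal sketch, not an official tool: the helper names `normalize_title` and `build_submission` are hypothetical, "extra spaces" is assumed to mean leading, trailing, or repeated whitespace, and "an empty line at the end" is assumed to mean a single trailing newline. Check the sample submission file if the organizers' interpretation differs.

```python
import zipfile


def normalize_title(title: str) -> str:
    """Collapse whitespace runs, strip both ends, and end with one newline.

    Assumption: this is what "no extra spaces" plus "an empty line at the
    end of the file" means; the contest page does not spell it out.
    """
    return " ".join(title.split()) + "\n"


def build_submission(titles: dict, zip_path: str) -> None:
    """Write one '<article_id>.txt' per article into a zip archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for article_id, title in titles.items():
            # "utf-8" (unlike "utf-8-sig") does not emit a byte-order mark.
            data = normalize_title(title).encode("utf-8")
            archive.writestr(f"{article_id}.txt", data)
```

For example, `build_submission({"1": "proved to be fake , made-up"}, "submission.zip")` would produce an archive holding a single file, `1.txt`. The article ID `1` and the archive name are placeholders.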


Tokenization Before Submission


Please tokenize your results before making your submissions. We use Stanford CoreNLP tokenization.

For more details on Stanford CoreNLP tokenization, please go to: https://stanfordnlp.github.io/CoreNLP/download.html


1. Download CoreNLP and start the server

```
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-01-31.zip
unzip stanford-corenlp-full-2018-01-31.zip && cd stanford-corenlp-full-2018-01-31
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -preload tokenize,ssplit,pos,lemma,parse,depparse \
    -status_port 9000 -port 9000 -timeout 15000
```


2. In Python

```
from nltk.parse.corenlp import CoreNLPParser

# Connects to a running CoreNLP server (default URL: http://localhost:9000).
stanford = CoreNLPParser()

text = 'proved to be fake, made-up'
tokens = list(stanford.tokenize(text))
print(tokens)
# ['proved', 'to', 'be', 'fake', ',', 'made-up']
```




Evaluation Metric


The competition adopts ROUGE (Recall-Oriented Understudy for Gisting Evaluation) as the evaluation metric. ROUGE is a common method for evaluating the performance of machine translation and summarization models. It compares the generated text with references (text labeled by humans) and calculates scores based on their similarity.



\[
\begin{aligned}
Recall_{lcs} &= \frac{LCS(X, Y)}{len(Y)} \\
Precision_{lcs} &= \frac{LCS(X, Y)}{len(X)} \\
\beta &= \frac{Precision_{lcs}}{Recall_{lcs} + e^{-12}}
\end{aligned}
\]



X is the text generated by the model, and Y is the reference, that is, the title labeled by human editors. When multiple references exist, the best ROUGE-L score is used. LCS(X, Y) is the length of the longest common subsequence of X and Y.
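To make the definitions above concrete, here is a minimal Python sketch of the computation over token lists. It is not the organizers' scoring script. Recall, precision, and β follow the formulas above, with e^{-12} taken as written. The way they are combined into a single score is an assumption: the sketch uses the standard ROUGE-L F-measure, F = (1 + β²)·R·P / (R + β²·P), which this page does not state. The function names are hypothetical.

```python
import math


def lcs_length(x, y):
    """Length of the longest common subsequence of sequences x and y."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Return (recall, precision, f_score) for two token lists."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        # Also covers empty inputs, avoiding division by zero.
        return 0.0, 0.0, 0.0
    recall = lcs / len(reference)
    precision = lcs / len(candidate)
    beta = precision / (recall + math.exp(-12))
    # Assumption: standard ROUGE-L F-measure, not stated on this page.
    f_score = ((1 + beta ** 2) * recall * precision
               / (recall + beta ** 2 * precision))
    return recall, precision, f_score


def best_rouge_l(candidate, references):
    """Best F-score over several references for one candidate."""
    return max(rouge_l(candidate, ref)[2] for ref in references)
```

With the candidate `police kill the gunman` and the reference `police killed the gunman`, the longest common subsequence is `police the gunman` (length 3), so recall and precision are both 3/4.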



Byte Cup 2018 International Machine Learning Contest


1069 Teams

