If you have any questions or comments, please send an email to support@kddcup2015.com. 


Update, August 4:

Here is the schedule of the KDD Cup 2015 Workshop.

KDD Cup 2015 Workshop Schedule

9:00 – 9:30 Opening: Information about the competition and our sponsor, XuetangX.

9:30 – 10:30 Invited Talk: Jacob Spoelstra and Hang Zhang (Microsoft). Solving the KDD Cup 2015 Challenge Using Azure ML.

10:30 – 11:00 Coffee Break

11:00 – 11:25 10th Prize: Ikki Tanaka and Shunnosuke Ikeda. Ensemble of Diverse Gradient Boosting Decision Trees for MOOCs Dropout Prediction.

11:25 – 11:50 9th Prize: Chih-Ming Chen, Man-Kwan Shan, Ming-Feng Tsai, Yi-Hsuan Yang, Hsin-Ping Chen, Pei-Wen Yeh, and Sin-Ya Peng. A Linear Ensemble of Classification Models with Novel Backward Cumulative Features for MOOC Dropout Prediction.

11:50 – 12:15 7th Prize: Nguyen Minh Luan. Combining Intention and Engagement Features with Ensemble of Models for MOOC Dropout Prediction.

12:15 – 13:45 Lunch (on your own)

13:45 – 14:10 6th Prize: Aakansh Gupta, Nuo Zhang, Kei Yonekawa, Kazunori Matsumoto, Shigeki Muramatsu, Rui Kimura, Nobuyuki Maita, Yujin Tang, Keiichi Kuroyanagi, Takafumi Watanabe, Akihiro Kobayashi, and Takuya Akiyama. Approach to Generate a Vast Variety of Features for Predicting Dropouts in MOOC.

14:10 – 14:35 5th Prize: Jingming Liu. A Time Series Feature Extractor for Predicting Dropouts in MOOC.

14:35 – 15:00 4th Prize: Ming-Lun Cai, Chih-Wei Chang, Liang-Wei Chen, Si-An Chen, Hsien-Chun Chiu, Hong-Min Chu, Yu-Jheng Fang, Yi Huang, Kuan-Hao Huang, Chih-Te Lai, Yi-An Lin, Chieh-En Tsai, Yeh-Wen Tsao, Yu-Lin Tsou, Wei-Cheng Wang, Yu-Ping Wu, Yao-Yuan Yang, Sheng-Chi You, Sz-Han Yu, Hsuan-Tien Lin, and Shou-De Lin. NearUniform Aggregation of Gradient Boosting Machines for KDD Cup 2015.

15:00 – 15:15 Coffee Break

15:15 – 15:40 3rd Prize: Kenny Chua, Xavier Conort, Sergey Yurgenson, and Owen Zhang. Featurizing Sequential Data - our Solution with XGboost.

15:40 – 16:05 2nd Prize: Yuichi Sugiyama, Kei Harada, Sayaka Yabu, Kazuki Onodera, Yuta Hino, Ryotaro Sano, Natsumi Kokubo, Daisuke Nishikawa, Sampei Nakabayashi, Masaaki Takada, Yasushi Iwata, Shinya Yazawa, Ryo Kato, and Tomomitsu Motohashi. Feature Extraction for Predicting Dropouts and Feature Merging Experience with Data Veraci.

16:05 – 16:30 1st Prize: Jeong-Yoon Lee, Andreas Toescher, Michael Jahrer, Kohei Ozaki, Mert Bay, Peng Yan, Song Chen, Tam T. Nguyen, and Xiaocong Zhou. Three-Stage Ensemble and Feature Engineering for MOOC Dropout Prediction.

16:30 – 17:30 The Serial Winner Panel (stay tuned!)

Update, July 19:

The 10 winners are listed here. One of the KDD Cup chairs will contact you soon.


Update, July 14:

The scores on the private leaderboard are now correct.


Update, July 13:

The ranking on the private leaderboard is correct, but the scores contain some errors. We are fixing them.

The top 11 teams are as follows. The final result is nearly the same as the private ranking. We will announce the final winners today.

Rank Team (private score)

1 Intercontinental Ensemble (0.9074429656630387)

2 FEG&NSSOL@DataVeraci (0.9071299191403106)

3 DataRobot.com (0.9068038441658484)

4 CLMS (0.906652903315283)

5 ttllbb (0.9060343458177709)

6 KDDILABS&Keiku (0.9059718605039866)

7 FirstTimeEver (0.9058898193220947)

8 xiaochuan (0.9055778412303954)

9 Donquote (0.9052017384333861)

10 NCCU (0.9050289277229964)

11 kyazuki&DT@Keio univ. Ohmori Lab (0.9047145809931961)



Update, July 12:

Submission is now closed. The final results with private scores will be published in several hours.


Update, July 11:

Submission will close at 11:59 PM, July 12, 2015 (UTC).



1) Many people have asked about the definition of "dropout". To better explain the definition, we extracted some information from the log file into a new file named "date.csv". None of the data used for this competition has changed, so you do not need to change your existing code. You can find the details of the file on the "data" web page: https://kddcup2015.com/submission-data.html .

The sole purpose of this new file is to help you understand the definition of dropout. In brief, the timespan of each course varies, so the period used for calculating dropouts depends on the course. We provide the timespan of each course in date.csv. The period used for calculating dropout from a course is the 10 days after the last day of that course. Please refer to the description of date.csv for more information about dropouts.

2) Because of this, we have decided to extend the deadline of the competition to July 12, so that everyone can refer to date.csv. Accordingly, we have also shortened the time between the end of the competition and the announcement of results, which is now also July 12.

3) Microsoft generously provides its Azure Machine Learning services for KDD Cup participants. Please find the blog here on “Solving the KDD Cup 2015 Challenge Using Azure ML”. This blog provides detailed instructions on how to solve the KDD Cup challenge using Azure ML and achieve an AUC of 0.87. It gives participants a starting point for competing in the KDD Cup and for building even more accurate solutions on top.

For participants who are new to Azure ML, it is a cloud platform for developing machine-learning-based predictive solutions. It provides a rich UI and battle-tested algorithms from Bing and Microsoft Research, along with support for R and Python. It also allows users to quickly operationalize their solutions as a web service.
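
As an illustration of the dropout period described in item 1), here is a minimal Python sketch that derives the period for each course from a date.csv-style file. The column names (`course_id`, `from`, `to`), the ISO date format, the sample rows, and the choice to start the period on the day after the course ends are all assumptions made for this example; please check the description of date.csv for the actual layout.

```python
import csv
import io
from datetime import date, timedelta

DROPOUT_WINDOW_DAYS = 10  # "10 days after the last day of that course"


def dropout_windows(date_csv_text):
    """Map each course_id to the (first_day, last_day) of its dropout period.

    Assumes columns course_id, from, to with ISO dates (hypothetical layout).
    Both ends of the returned period are inclusive.
    """
    windows = {}
    for row in csv.DictReader(io.StringIO(date_csv_text)):
        course_end = date.fromisoformat(row["to"])
        first_day = course_end + timedelta(days=1)
        last_day = course_end + timedelta(days=DROPOUT_WINDOW_DAYS)
        windows[row["course_id"]] = (first_day, last_day)
    return windows


# Made-up sample rows, not taken from the real date.csv.
sample = """course_id,from,to
course_A,2014-05-01,2014-05-30
course_B,2014-06-10,2014-07-09
"""

windows = dropout_windows(sample)
for course_id, (first_day, last_day) in windows.items():
    print(course_id, first_day, last_day)
```

Because each course has its own end date, the period is computed per course rather than from a single global date.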




The high dropout rate of students on MOOC platforms has been heavily criticized, and predicting the likelihood of dropout would help maintain and encourage students' learning activities. Therefore, the task in KDD Cup 2015 is to predict dropout on XuetangX, one of the largest MOOC platforms in China.



Competition participants need to predict whether a user will drop a course within the next 10 days based on his or her prior activities. If a user leaves no records for course C in the log during the next 10 days, we define this as a dropout from course C. For more details about the log, please refer to the Data Descriptions.
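
The definition above can be expressed as a small labeling routine. The following Python sketch is only an illustration: the input shapes (a mapping from enrollment to course, a list of dated log records, and a per-course 10-day period given as inclusive dates), the sample values, and the convention that 1 means dropout are assumptions for this example, not the official data format.

```python
from datetime import date


def label_dropouts(enrollments, log_events, windows):
    """Return {enrollment_id: 1 if dropout else 0}.

    enrollments: {enrollment_id: course_id}            (hypothetical shape)
    log_events:  iterable of (enrollment_id, event_date)
    windows:     {course_id: (first_day, last_day)}, both days inclusive
    """
    active = set()
    for enrollment_id, event_date in log_events:
        course_id = enrollments.get(enrollment_id)
        if course_id is None:
            continue  # record for an enrollment we are not labeling
        first_day, last_day = windows[course_id]
        if first_day <= event_date <= last_day:
            active.add(enrollment_id)
    # No log record inside the 10-day period means dropout.
    return {eid: 0 if eid in active else 1 for eid in enrollments}


# Made-up example data.
enrollments = {1: "course_A", 2: "course_A", 3: "course_B"}
windows = {
    "course_A": (date(2014, 5, 31), date(2014, 6, 9)),
    "course_B": (date(2014, 7, 10), date(2014, 7, 19)),
}
log_events = [
    (1, date(2014, 6, 2)),   # inside the period: not a dropout
    (2, date(2014, 5, 20)),  # before the period: does not count
    (3, date(2014, 7, 25)),  # after the period: does not count
]

labels = label_dropouts(enrollments, log_events, windows)
print(labels)
```

Only records that fall inside the 10-day period matter for the label; activity before or after it does not change the outcome.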


About XuetangX: 

XuetangX, a Chinese MOOC learning platform initiated by Tsinghua University, was officially launched online on October 10, 2013. In April 2014, XuetangX signed a contract with edX, one of the biggest global MOOC learning platforms, co-founded by Harvard University and MIT, to acquire exclusive authorization for edX’s high-quality international courses. In December 2014, XuetangX signed a Memorandum of Cooperation with FUN, the national MOOC platform in France, to collaborate on course construction, platform development and other aspects. So far, there are more than 100 Chinese courses and over 260 international courses available on XuetangX.