Based on participants’ feedback, we found some problems with the datasets and have updated them. For detailed information, please check the 'data' webpage.
This competition is one of the Beijing Academy of Artificial Intelligence’s 10 AI competitions in 2019.
Knowledge sharing services have become one of the most important and popular applications on the global Internet. In a knowledge sharing (or Q&A) community, the number of questions far exceeds the number of quality responses. Therefore, how to connect knowledge, experts and users, and increase the willingness of experts to respond has become the central topic of knowledge sharing services. This competition is designed to solve this problem.
Zhihu is a well-known knowledge sharing and Q&A community platform in China. Since its launch in 2011, Zhihu has grown to 220 million users, who ask hundreds of thousands of new questions or generate other content every day. Therefore, recommending the right questions to the most relevant experts accurately and efficiently has become an important task at Zhihu. The task includes mining and finding experts who are interested in the given questions and have enough expertise to answer them.
The released datasets include information on the questions that remain to be answered by experts (or invitees), expert (or invitee) profiles, and the experts’ (or invitees’) history of accepting invitations and answering questions.
1. Question information. Includes <question ID, question publishing time, question’s related topics, question’s content text and question’s description> on Zhihu’s Q&A community.
2. Expert (or invitee) answers. Includes <answer ID, question ID, author (or expert/invitee) ID, the text content of the answer, answer creation time, number of upvotes received, number of “bookmarks” by other users, number of “appreciates” by other users, number of comments>, etc.
3. User profiling data. Includes <user ID, gender, user activity frequency, followed topics, long-term interests, ‘salt value’>, etc. The ‘salt value’ is, in general, a credit indicator at Zhihu, determined by the quantity and quality of the content the user has created, profile completeness, educational degrees, work experience and other qualifications, friendliness, contribution to the community and other factors.
4. Information on <topics, tokens (words), single-character embeddings (64 dimensions)>.
5. Invitation data from the previous month. Includes <question ID, user ID, invitation time, whether or not the invitee answered the question>.
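As an illustration of how the invitation data in item 5 might be handled, here is a minimal Python sketch. It is not an official loader: the tab-separated layout, the "1"/"0" encoding of the answer flag, the sample IDs and dates, and the names `Invitation`, `parse_invitation` and `answer_rate_by_user` are all assumptions made for this example. Please check the 'data' webpage for the actual file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Invitation:
    """One invitation record, following the fields listed in item 5."""
    question_id: str
    user_id: str
    invitation_time: str  # kept as raw text; the time format is not specified here
    answered: bool


def parse_invitation(line: str, sep: str = "\t") -> Invitation:
    # Assumption: four fields per line, separated by tabs, with the answer
    # flag written as "1" (answered) or "0" (not answered).
    question_id, user_id, invitation_time, answered = line.rstrip("\n").split(sep)
    return Invitation(question_id, user_id, invitation_time, answered == "1")


def answer_rate_by_user(invitations):
    """Return the fraction of invitations each user answered."""
    invited = defaultdict(int)
    answered = defaultdict(int)
    for inv in invitations:
        invited[inv.user_id] += 1
        answered[inv.user_id] += int(inv.answered)
    return {user: answered[user] / invited[user] for user in invited}


# Made-up sample lines, for illustration only.
sample_lines = [
    "Q1\tU1\t2019-01-01\t1",
    "Q2\tU1\t2019-01-02\t0",
    "Q1\tU2\t2019-01-01\t0",
]
invitations = [parse_invitation(line) for line in sample_lines]
rates = answer_rate_by_user(invitations)
```

A per-user answer rate like this is only one simple signal that could be derived from the invitation history. It is shown here to make the record structure concrete, not as a recommended approach to the task.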
BAAI-Zhihu Expert Finding