DeepTutor - Promoting Deep Learning




DeepTutor Resources

Selected datasets published by the DeepTutor team

NOTE: By downloading a corpus you agree to the LICENSE AGREEMENT.

1. DARE corpus


The DARE corpus is an annotated dataset focusing on pronoun resolution in tutorial dialogue. Although datasets for general-purpose anaphora resolution exist, they are not suitable for dialogue-based Intelligent Tutoring Systems. The DARE corpus consists of 1,000 annotated pronoun instances collected from conversations between high-school students and the intelligent tutoring system DeepTutor.
Download the dataset.
  • Niraula, N. B., Rus, V., Banjade, R., Stefanescu, D., Baggett, W., & Morgan, B. (2014). The DARE Corpus: A Resource for Anaphora Resolution in Dialogue Based Intelligent Tutoring Systems. In LREC (pp. 3199-3203).
    Link to pdf
  • Niraula, N. B., & Rus, V. (2014, April). A Machine Learning Approach to Pronominal Anaphora Resolution in Dialogue Based Intelligent Tutoring Systems. In International Conference on Intelligent Text Processing and Computational Linguistics (pp. 307-318). Springer Berlin Heidelberg.
    Link to pdf

2. DT-Grade dataset


The DT-Grade corpus consists of short constructed answers extracted from tutorial dialogues between students and the DeepTutor system. Each answer is annotated for its correctness in the given context and for whether the contextual information was useful. The dataset contains 900 answers, about 25% of which required contextual information to be interpreted properly.
Download the dataset.
  • Banjade, R., Maharjan, N., Niraula, N. B., Gautam, D., Samei, B., & Rus, V. (2016). Evaluation Dataset (DT-Grade) and Word Weighting Approach towards Constructed Short Answers Assessment in Tutorial Dialogue Context. In Proceedings of the BEA11 workshop (co-located with NAACL).
    Link to pdf

3. DT-Neg dataset


The DT-Neg corpus (DeepTutor Negation corpus) contains texts extracted from tutorial dialogues in which students interacted with an Intelligent Tutoring System (ITS) to solve conceptual physics problems. Negations in student responses are annotated, with scope and focus marked based on the context of the dialogue. The dataset contains 1,088 instances and is available for research purposes.
Download the dataset.
  • Banjade, R., & Rus, V. (2016). DT-Neg: Tutorial Dialogues Annotated for Negation Scope and Focus in Context. In LREC.
    Link to pdf
  • Banjade, R., Niraula, N., & Rus, V. (2016). Towards Detecting Intra- and Inter-Sentential Negation Scope and Focus in Dialogue. In Proceedings of the FLAIRS conference.
    Link to pdf




DEPARTMENT OF COMPUTER SCIENCE · Dunn Hall 209, Memphis, TN 38152-3240 · Phone 901.678.5465 · Fax 901.678.1506 · info@cs.memphis.edu