|Item Name:||Discourse Graphbank|
|Author(s):||Florian Wolf, Edward Gibson, Amy Fisher, Meredith Knight|
|LDC Catalog No.:||LDC2005T08|
|Release Date:||March 15, 2005|
|Application(s):||discourse analysis, information retrieval, summarization|
LDC User Agreement for Non-Members
|Online Documentation:||LDC2005T08 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Wolf, Florian, et al. Discourse Graphbank LDC2005T08. Web Download. Philadelphia: Linguistic Data Consortium, 2005.|
Discourse Graphbank contains 135 newswire texts totalling 70,000 words annotated with coherence relations.
The project was Florian Wolf's PhD thesis. It aimed to define a descriptively adequate data structure for representing discourse coherence structures, to investigate the impact of discourse coherence structures on other linguistic processes and natural language applications (e.g. anaphora resolution, summarization and information retrieval), and to develop and test discourse parsing algorithms.
The source data consists of Associated Press and Wall Street Journal newswire data from TIPSTER Complete (LDC93T3A) annotated with coherence relations.
The data was annotated by two independent annotators with 88% agreement. The annotators marked 11 types of coherence relations:
|Temporal Sequence relation|
For an example of the data in this corpus, please view this sample (JPG).
None at this time.