翻訳と辞書 (Translation and Dictionary)
Treebank
In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefited from large-scale empirical data.〔Alexander Clark, Chris Fox and Shalom Lappin (2010). The handbook of computational linguistics and natural language processing. Wiley.〕 The exploitation of treebank data has been important ever since the first large-scale treebank, the Penn Treebank, was published. Although treebanks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. For example, annotated treebank data has been crucial in syntactic research for testing linguistic theories of sentence structure against large quantities of naturally occurring examples.
== Etymology ==
The term ''treebank'' was coined by linguist Geoffrey Leech in the 1980s, by analogy to other repositories such as a seedbank or bloodbank.〔Sampson, G. (2003) ‘Reflections of a dendrographer.’ In A. Wilson, P. Rayson and T. McEnery (eds.) Corpus Linguistics by the Lune: A Festschrift for Geoffrey Leech, Frankfurt am Main: Peter Lang, pp. 157-184.〕 The "tree" in the name reflects the fact that both syntactic and semantic structure are commonly represented compositionally as a tree structure. The term ''parsed corpus'' is often used interchangeably with ''treebank'', with the emphasis on the primacy of sentences rather than trees.
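To make the idea of a sentence annotated as a tree concrete, here is a minimal Python sketch. It assumes a bracketed notation of the kind commonly used for constituency treebanks; the example sentence, the category labels, and the function names `parse_bracketed` and `leaves` are illustrative assumptions, not taken from any particular treebank or library.

```python
# Illustrative sketch only: the sentence, labels, and helper names are
# invented for this example and do not come from a real treebank.

def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]          # category label, e.g. S, NP, VP
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                     # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the tree")
    return tree


def leaves(node):
    """Return the words of the sentence in left-to-right order."""
    if isinstance(node, str):
        return [node]
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


EXAMPLE = "(S (NP (DT the) (NN dog)) (VP (VBD barked)))"
tree = parse_bracketed(EXAMPLE)
print(tree[0])        # label of the root node
print(leaves(tree))   # the words of the original sentence
```

Each node pairs a category label with its children, so the original sentence can be recovered by reading the leaves from left to right, while the nesting records the annotated structure.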
Excerpt source: Wikipedia, the free encyclopedia, article "Treebank".