PropBank is a corpus annotated with verbal propositions and their arguments, that is, a "proposition bank". Although "PropBank" refers to a specific corpus produced by Martha Palmer et al., the term "propbank" is also increasingly used as a common noun for any corpus that has been annotated with propositions and their arguments.
The PropBank project has played a role in research in natural language processing, where it has been used for semantic role labeling.
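To make "annotated with verbal propositions and their arguments" concrete, the following is a minimal Python sketch of one annotated proposition. The class names `Argument` and `Proposition`, the example sentence, and the token-span encoding are assumptions made for illustration and do not reproduce PropBank's actual file format. The labels ARG0, ARG1 and ARG2 follow PropBank's convention of numbered, verb-specific arguments, and `give.01` is used here as an illustrative sense identifier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    label: str  # role label such as "ARG0" (simplified, illustrative)
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)


@dataclass
class Proposition:
    tokens: List[str]
    verb_index: int  # position of the annotated verb in tokens
    roleset: str     # sense identifier for the verb (illustrative)
    arguments: List[Argument] = field(default_factory=list)

    def text_of(self, label: str) -> str:
        """Return the words covered by the first argument with this label."""
        for arg in self.arguments:
            if arg.label == label:
                return " ".join(self.tokens[arg.start:arg.end])
        raise KeyError(label)


tokens = "The teacher gave the students a book".split()
prop = Proposition(
    tokens=tokens,
    verb_index=2,
    roleset="give.01",
    arguments=[
        Argument("ARG0", 0, 2),  # the giver
        Argument("ARG2", 3, 5),  # the recipient
        Argument("ARG1", 5, 7),  # the thing given
    ],
)

for arg in prop.arguments:
    print(arg.label, "=", prop.text_of(arg.label))
```

Storing arguments as token spans rather than strings keeps each annotation tied to a position in the sentence, which is what a role labeling system would be trained to predict.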
PropBank differs from FrameNet, the resource to which it is most frequently compared, in several ways.
PropBank is a verb-oriented resource, while FrameNet is centered on the more abstract notion of frames, which generalize descriptions across similar verbs (e.g. "describe" and "characterize") as well as nouns and other words (e.g. "description"). PropBank does not annotate events or states of affairs described using nouns. PropBank commits to annotating all verbs in a corpus, whereas the FrameNet project selects sets of example sentences from a large corpus and has annotated longer continuous stretches of text in only a few cases.
PropBank-style annotations often remain close to the syntactic level, while FrameNet-style annotations are sometimes more semantically motivated. From the start, PropBank was designed to serve as training data for machine learning-based semantic role labeling systems. It requires that all arguments to a verb be syntactic constituents, and it distinguishes different senses of a word only if the differences bear on the arguments. Because of these differences, semantic role labeling with respect to PropBank is often a somewhat easier task than producing FrameNet-style annotations.
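The requirement that arguments be syntactic constituents can be sketched as a check that an argument's token span coincides with some node of the sentence's parse tree. The nested-tuple tree encoding, the function name `constituent_spans`, and the example parse below are assumptions chosen for illustration; they are not part of any PropBank tooling.

```python
def constituent_spans(tree, start=0):
    """Collect the (start, end) token span of every node in a nested-tuple tree.

    A node is a tuple (label, child, ...); a leaf token is a plain string.
    Returns the set of spans and the index one past the node's last token.
    """
    _label, *children = tree
    spans = set()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a leaf consumes exactly one token
        else:
            child_spans, pos = constituent_spans(child, pos)
            spans |= child_spans
    spans.add((start, pos))
    return spans, pos


# Hypothetical parse of "The teacher gave the students a book"
tree = (
    "S",
    ("NP", ("DT", "The"), ("NN", "teacher")),
    ("VP",
     ("VBD", "gave"),
     ("NP", ("DT", "the"), ("NNS", "students")),
     ("NP", ("DT", "a"), ("NN", "book"))),
)

spans, length = constituent_spans(tree)

print((3, 5) in spans)  # "the students" corresponds to an NP node
print((4, 6) in spans)  # "students a" crosses a constituent boundary
```

Under this constraint a role labeler only has to choose among the nodes of a parse, rather than among arbitrary word sequences, which is one way to see why the task can be easier than less syntactically constrained annotation.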
- Palmer M, Kingsbury P, Gildea D (2005). "The Proposition Bank: An Annotated Corpus of Semantic Roles". Computational Linguistics. 31 (1): 71–106. CiteSeerX 10.1.1.136.8985. doi:10.1162/0891201053630264. S2CID 2486369.
- Loper E, Yi S, Palmer M (2007). "Combining Lexical Resources: Mapping Between PropBank and VerbNet". Proceedings of the 7th International Workshop on Computational Linguistics.