If you are interested in our shared task, we would like to help you get started.
Here we provide a collection of resources and tools that you might find useful. We would be very grateful for any suggestions that you might have.
STEPS Evaluation Tool
The version for the 2016 iteration of the shared task will be made available soon.
Opinion Role Extraction Systems
Saarland University’s lexicon-based system
Corpora and Lexicons
German resources
SentiWS - a Publicly Available German-language Resource for Sentiment Analysis
GermanPolarityClues - A Lexical Resource for German Sentiment Analysis
SALSA Corpus - German corpus with semantic roles labeled in the FrameNet style
GermaNet - a WordNet-style resource for German
English resources
There are also some English resources that you might use if you are interested in a translation-based approach:
MPQA corpus - a standard corpus for sentiment analysis in English
Subjectivity Lexicon - a widely used sentiment lexicon for English
Tools
Salto annotation tool, Salsa API - this tool was used to annotate the data
Berkeley Parser - constituency parser with models for German
ParZu - a dependency parser for German (demo available); note that a morphological analysis tool is not required to run the parser, and we recommend running it on pre-tagged text (use TreeTagger output)
Stanford Parser - another well-known constituency parser with models for German
sempar - semantic role labeling software for German
Shalmaneser - the first semantic role labeler for German based on a FrameNet representation (the Semafor role labeler has no model for German; Shalmaneser also works on English)
Shalmaneser (open source) - an attempt to keep the original Shalmaneser and its components up to date, as the original creators are not currently maintaining it
convert_treebank (Lingua-Align-0.04) - run the tool on the command line with: convert_treebank [-i] infile informat outformat > outfile
TIGER Corpus - this website contains information on TIGER-XML and a tool for conversion to dependency structures
TIGERSearch and TIGERRegistry - home page of tools for indexing and viewing corpora based on TIGER-XML
TreeTagger - part-of-speech tagger that also does lemmatization; suitable for German
RFTagger - another POS-tagger for German
Morphisto - morphological analysis tool for German
SemiNER - Named-entity recognizer for German
Named-entity recognizer for German by Faruqui and Padó
Giza++ - a statistical word alignment toolkit; tutorials on the web describe common applications of the tool, such as word alignment
CRF++ - an open source implementation of Conditional Random Fields (CRFs) for sequence labeling
UIMA - Apache system for Unstructured Information Management applications
DKPro (Darmstadt Knowledge Processing Repository)
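If you want to call convert_treebank from a script rather than typing the command by hand, the usage pattern given in the list above (convert_treebank [-i] infile informat outformat > outfile) could be wrapped roughly as follows. This is only a minimal sketch: the helper names build_convert_command and run_conversion are our own, the file names and the INFORMAT/OUTFORMAT values are placeholders rather than real format names, and the sketch assumes that convert_treebank is installed and on your PATH. Please check the Lingua-Align documentation for the supported formats and for the meaning of the -i option.

```python
import subprocess
from typing import List


def build_convert_command(infile: str, informat: str, outformat: str,
                          use_i_flag: bool = False) -> List[str]:
    """Assemble the argument list for: convert_treebank [-i] infile informat outformat.

    The helper name and parameters are hypothetical; only the argument order
    is taken from the usage line on this page.
    """
    cmd = ["convert_treebank"]
    if use_i_flag:
        # Optional flag from the usage line; its meaning is not described here.
        cmd.append("-i")
    cmd.extend([infile, informat, outformat])
    return cmd


def run_conversion(infile: str, informat: str, outformat: str,
                   outfile: str, use_i_flag: bool = False) -> None:
    """Run the tool and write its standard output to outfile (the '> outfile' part)."""
    cmd = build_convert_command(infile, informat, outformat, use_i_flag)
    with open(outfile, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Placeholder file and format names; replace them with real values.
    print(" ".join(build_convert_command("corpus.in", "INFORMAT", "OUTFORMAT")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before anything is executed.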
Acknowledgments
The initial lists on this page were contributed by Michael Wiegand.