Tools

If you are interested in our shared task, we would like to help you get started.

Here we provide a collection of resources and tools that you might find useful. We would be very grateful for any suggestions that you might have.

STEPS Evaluation Tool

The version for the 2016 iteration of the shared task will be made available soon.

Opinion Role Extraction Systems

Saarland University’s lexicon-based system

Corpora and Lexicons

German resources

SentiWS - a Publicly Available German-language Resource for Sentiment Analysis

GermanPolarityClues - A Lexical Resource for German Sentiment Analysis

SALSA Corpus - German corpus with semantic roles labeled in the FrameNet style

GermaNet - a WordNet-style resource for German

English resources

There are also some English resources that you might use if you are interested in a translation-based approach:

MPQA corpus - a standard corpus for sentiment analysis for English

Subjectivity Lexicon - a widely used sentiment lexicon for English

Tools

Salto annotation tool, Salsa API - this tool was used to annotate the data

Berkeley Parser - constituency parser with models for German

ParZu - a dependency parser for German (demo available); note that a morphological analyzer is not necessary to run the tool; we recommend running it on pre-tagged text (e.g. TreeTagger output)

Stanford Parser - another well-known constituency parser with models for German

sempar - semantic role labeling software for German

Shalmaneser - the first semantic role labeler for German based on a FrameNet representation (the Semafor role labeler has no model for German; Shalmaneser also works on English)

Shalmaneser (open source) - an attempt to keep the original Shalmaneser and its components up to date, as the original creators are not currently maintaining it

convert_treebank (Lingua-Align-0.04) - note that you run the tool on the command line with: convert_treebank [-i] infile informat outformat > outfile

TIGER Corpus - this website contains information on TIGERxml and a conversion tool to dependency structures

TIGERSearch and TIGERRegistry - home page of tools for indexing and viewing corpora based on TIGER-xml

TreeTagger - part-of-speech tagger that also does lemmatization; suitable for German

RFTagger - another POS-tagger for German

Morphisto - morphological analysis tool for German

SemiNER - Named-entity recognizer for German

Named-entity recognizer for German by Faruqui and Padó

Giza++ - a word alignment toolkit; you will find tutorials on the web that describe common applications such as word alignment using this tool

CRF++ - an open source implementation of Conditional Random Fields (CRFs) for sequence labeling

UIMA - Apache system for Unstructured Information Management applications

DKPro (Darmstadt Knowledge Processing Repository)
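
As a rough illustration of the ParZu note above (running the parser on pre-tagged text), the following Python sketch reduces TreeTagger's three-column output (token, tag, lemma, separated by tabs) to token/tag pairs, with an empty line after each sentence. The function name and the sample sentence are made up for illustration. The sketch also makes two assumptions: that sentences end at the STTS tag `$.`, and that the parser accepts one tab-separated token/tag pair per line. Please check the ParZu documentation for the exact input format it expects.

```python
def treetagger_to_tagged(lines, sentence_end_tag="$."):
    """Reduce TreeTagger output (token<TAB>tag<TAB>lemma) to token<TAB>tag lines.

    An empty string is appended after every token carrying `sentence_end_tag`
    so that sentences are separated by a blank line when the result is written
    out. Both the separator convention and the default tag are assumptions.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore empty input lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # no tag column, e.g. markup passed through untagged
        token, tag = parts[0], parts[1]
        out.append(f"{token}\t{tag}")
        if tag == sentence_end_tag:
            out.append("")  # sentence boundary
    return out


if __name__ == "__main__":
    # Hypothetical TreeTagger output for a short German sentence.
    sample = [
        "Das\tART\tdie",
        "ist\tVAFIN\tsein",
        "gut\tADJD\tgut",
        ".\t$.\t.",
    ]
    print("\n".join(treetagger_to_tagged(sample)))
```

The converted lines can then be written to a file and passed to the parser in whatever way its documentation describes for pre-tagged input.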

Acknowledgments

The initial lists on the page were contributed by Michael Wiegand.