Sentiment Analysis in German
katyaegodigital
Member Posts: 1 Learner I
in Help
Hi,
I am new to RapidMiner, but I would like to use it for sentiment analysis to analyze comments on social networks.
My question is: is this possible with the German language?
Any help is appreciated.
Thx,
Katya
Answers
The out-of-the-box options (VADER, SentiWordNet, etc.) typically use models trained on English content, so you won't get far with German or any other language.
However, if you have training data available, or can generate it, there is no reason why you couldn't work with German and build your own 'sentiment' set. In essence, this is just a classification task.
If you have neither labeled data nor the time to annotate a data set manually, the 'easiest' way to get one is to crawl sites that offer reviews (for instance amazon.de or otto.de). Reviews are typically rated 1 to 5, so you could treat everything scoring 1 or 2 as negative and everything scoring 4 or 5 as positive. Use these to define your labels, pre-process your source content (casing, stopwords, lemmatizing, etc.) and get the top keywords associated with both classes (scheisse, schlecht, etc.; you get the idea).
You'll need to combine a few different skills, but this is indeed possible with RapidMiner.
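To make the labeling idea concrete, here is a minimal Python sketch (it could run standalone or inside the Python Scripting Extension). Everything in it is an assumption for illustration: the sample reviews are made up, the stopword list is a tiny placeholder, the function names are hypothetical, and lemmatizing is left out entirely. It only shows how star ratings can become labels and how the most frequent words per label can be counted.

```python
import re
from collections import Counter

# Tiny placeholder stopword list (assumption); use a fuller German list in practice.
# Negations such as "nicht" are deliberately kept, since they carry sentiment.
STOPWORDS = {"der", "die", "das", "und", "ist", "ich", "ein", "eine", "war", "es"}


def label_from_rating(stars):
    """Map a 1-5 star rating to a label; 3-star reviews are skipped as ambiguous."""
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None


def tokenize(text):
    """Lowercase the text, keep runs of letters (incl. umlauts and ß), drop stopwords."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def top_keywords(reviews, n=3):
    """Return the n most frequent tokens for each label."""
    counts = {"positive": Counter(), "negative": Counter()}
    for stars, text in reviews:
        label = label_from_rating(stars)
        if label is not None:
            counts[label].update(tokenize(text))
    return {label: [w for w, _ in c.most_common(n)] for label, c in counts.items()}


# Made-up sample data standing in for crawled (rating, review text) pairs.
REVIEWS = [
    (1, "Schlecht, einfach schlecht. Das Gerät ist kaputt."),
    (2, "Die Qualität ist schlecht und der Service war langsam."),
    (3, "Es ist okay."),
    (5, "Super Produkt, ich bin begeistert!"),
    (4, "Super Qualität und schnelle Lieferung."),
]

if __name__ == "__main__":
    print(top_keywords(REVIEWS, n=1))
```

With real crawled data, the same (label, tokens) pairs could feed any text classifier instead of a simple frequency count.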
To complement the response by my sensei @kayman: It is indeed possible to accomplish a lot with RapidMiner. I used sentiment analysis as part of a fraud research project I conducted in Switzerland, and I have just two tips for you:
- If your coding skills are good, experiment with this: https://github.com/hdaSprachtechnologie/odenet. I was able to parse the XML (using Ruby, sorry) and use it as a WordNet. You may also want to get the GermaNet collection of words (see http://www.sfs.uni-tuebingen.de/GermaNet/licenses.shtml).
- You may want to play with POS tagging. Use the Python Scripting Extension and the pattern library (https://www.clips.uantwerpen.be/pattern) for that. I don't know whether my limited German worked against me on this project (though @mschmitz thinks it's good), but I got much better results with a bit of Python embedded in my processes.
Don't hesitate to contact me if you need a bit more help. (Though my NLP master is @ghislaine_gueri; I invoke her.)
All the best,
Rodrigo.
Dortmund, Germany