
sentiment extraction for non-English

wclaster Member, University Professor Posts: 43 University Professor
Hello. Are there sentiment analysis operators or tools for working with Japanese? How about Chinese? And how about other Asian languages? I saw the Sentiment Extract operator. It seems to have German and French versions for VADER. Thank you!

Best Answer

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Solution Accepted
    In principle yes, but this is definitely not something one can do quickly.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany

Answers

  • wclaster Member, University Professor Posts: 43 University Professor
    Thank you! I will leave this question open because I am really looking for Japanese.
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    If you have Chinese or Japanese dictionaries, I can add them :). That is not a big thing. The bigger task would be tokenization in those languages.

    Best,
    Martin
  • ceaperez Member Posts: 541 Unicorn
    Hi @wclaster
    I hope you can solve this issue and then share your good practice.

    regards
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    @ceaperez by the way, if you have a good Spanish dictionary, I am happy to add it as well :). I didn't find one in a quick search. Ideally I want to cover the big languages with a dictionary each.
  • wclaster Member, University Professor Posts: 43 University Professor
    Hello mschmitz, thank you. Yes, I think tokenization would be quite a challenge. MeCab is an open-source text segmentation library for Japanese, but I don't know how this would all fit together.
    From Wikipedia (MeCab - Wikipedia):
    "Besides segmenting the text, MeCab also lists the part of speech of the word, and, if applicable and in the dictionary, its pronunciation."

    Would this be simple?
  • ceaperez Member Posts: 541 Unicorn
    @mschmitz, thanks for your help. I will check if I have a good one.
    Regards.
  • kayman Member Posts: 662 Unicorn
    A bit late to the party, but we had some decent results using GiNZA together with spaCy, via the Python extension, in some of our RapidMiner workflows.
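
To make the dictionary-plus-tokenization point from this thread concrete, here is a minimal Python sketch of dictionary-based scoring on unsegmented Japanese text. It is only an illustration under stated assumptions: the lexicon entries and their scores are made up, the function names are hypothetical, and the greedy longest-match lookup is a crude stand-in for a real segmenter such as MeCab or GiNZA. It is not how the RapidMiner operator works.

```python
# Toy sentiment lexicon. Entries and scores are invented for illustration only;
# they do not come from any published Japanese sentiment dictionary.
TOY_LEXICON = {
    "素晴らしい": 2.0,   # "wonderful"
    "好き": 1.5,         # "like"
    "最悪": -2.0,        # "the worst"
    "嫌い": -1.5,        # "dislike"
}


def longest_match_tokens(text, lexicon):
    """Scan text left to right and collect lexicon entries.

    Japanese has no spaces between words, so at each position we try the
    longest lexicon entry that starts there. Characters that start no
    entry are skipped one at a time.
    """
    max_len = max(len(entry) for entry in lexicon)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
        else:
            # No lexicon entry starts at this position.
            i += 1
    return tokens


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the scores of all lexicon entries found in text."""
    return sum(lexicon[token] for token in longest_match_tokens(text, lexicon))


if __name__ == "__main__":
    for sentence in ["この映画は素晴らしい", "好きだけど最悪"]:
        print(sentence, sentiment_score(sentence))
```

The sketch ignores negation and context (for example, 嫌いじゃない, "I don't dislike it", would still score as negative), which is one reason proper tokenization with part-of-speech information is the harder part of the problem.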