"context/feature based opinion mining/sentiment analysis"
Hello everybody,
I'm pretty new to RapidMiner, and I'm stuck on the following problem.
I managed to build a simple sentiment classifier following Pang's theory and the examples on the Internet (especially those on vancouverdata). Now I'd like to extend the concept by extracting the specific features (n-grams) and showing their sentiment score.
For example, take the following phrase: "the camera has a pretty good focus, but its flash lacks of speed". Here I have two features: focus (positive) and flash (negative).
Could you help me get through the pain?
Thank you in advance,
Answers
If you're mining English examples in which the clauses are separated by commas, then it's straightforward: you just split on the comma, so each clause holds one feature and its sentiment.
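As a rough illustration of the comma split outside RapidMiner, here is a minimal Python sketch. The feature list and the function name `clauses_by_feature` are made up for this example; it only shows how each clause can be tied to the feature it mentions, not how to score it.

```python
import re

# Hypothetical feature list; in practice it would come from your own domain knowledge.
FEATURES = ["focus", "flash"]

def clauses_by_feature(sentence, features=FEATURES):
    """Split a sentence on commas and map each known feature to the clauses that mention it."""
    clauses = [c.strip() for c in sentence.split(",") if c.strip()]
    result = {}
    for clause in clauses:
        # Lower-case word tokens, so "Focus" and "focus" are treated alike.
        words = set(re.findall(r"[a-z']+", clause.lower()))
        for feature in features:
            if feature in words:
                result.setdefault(feature, []).append(clause)
    return result

example = "the camera has a pretty good focus, but its flash lacks of speed"
print(clauses_by_feature(example))
# {'focus': ['the camera has a pretty good focus'], 'flash': ['but its flash lacks of speed']}
```

Each clause could then be passed to the sentiment classifier on its own, so the score lands on the feature named in that clause.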
Let's assume that you don't have that luxury. I am, however, going to assume that all your posts are on the same subject.
So for example:
"the camera has a pretty good focus but its flash lacks of speed"
"The Canon Sureshot has a pretty good focus and flash, but tastes awful without ketchup."
"I've always liked the focus on my Canon, but really think the lightmeter is poor."
I'd suggest the following approach (others may disagree):
First I'd add an ID so you can split up the documents in many ways, but still combine them again later.
(this is where I think my approach is wrong)
Do you remove "pretty good focus and flash" and just keep "pretty good focus"?
- 4: build a sentiment mining model from the N-Grams
- 5: have a look at the most positive / least positive words in the N-Grams (that aren't features) and see if they should be added to the labelling in step 2.2
After repeating this process a few times on the sample data, it should be possible to join your N-Grams up with your list of features to show the overall sentiment balance for each individual feature, e.g. focus 30 / 45 / 25 (positive, negative, neutral).
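To make that final join concrete, here is a small Python sketch of the aggregation only. It assumes you already have (feature, label) pairs exported from the labelled N-Grams; the function name `sentiment_balance` and the sample data are invented for illustration and are not results from any real run.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "negative", "neutral")

def sentiment_balance(labelled_ngrams):
    """Turn (feature, label) pairs into percentages per feature,
    ordered as (positive, negative, neutral)."""
    counts = defaultdict(Counter)
    for feature, label in labelled_ngrams:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts[feature][label] += 1
    balance = {}
    for feature, c in counts.items():
        total = sum(c.values())
        balance[feature] = tuple(round(100 * c[label] / total) for label in LABELS)
    return balance

# Made-up sample: one pair per N-Gram that mentions a feature.
sample = [
    ("focus", "positive"), ("focus", "positive"),
    ("focus", "positive"), ("focus", "negative"),
    ("flash", "negative"), ("flash", "neutral"),
]
print(sentiment_balance(sample))
# {'focus': (75, 25, 0), 'flash': (0, 50, 50)}
```

Because each percentage is rounded separately, the three numbers may not always add up to exactly 100.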
I won't put together a sample process though as I think there are probably better ideas than mine on here.
I am using RapidMiner for my final thesis on feature-based sentiment analysis, and I am facing the same problem as you. I would like to know whether you have already found a way to solve it.
Could you explain it to me?
Also thanks JEdward for sharing.
Thank you so much.