"Getting started with sentiment analysis"
Hi,
I am looking for advice on how to get started with sentiment analysis in the least painful way. I am currently writing my bachelor thesis on social behaviour in online forums. For this, I have been crawling topics on a Danish forum for the last two months, and it finally looks like I have the data I need.
I am doing only the most basic statistical analysis in SPSS, where I will compare user rank, the number of posts the user has made, to the number of answers to his or her topics. However, I also have the topic text, which I would love to classify using sentiment analysis.
As you might have guessed, I am totally new to RapidMiner. I have been trying to copy the workflow of the sentiment analysis accelerator, but I keep getting errors about my data format, even though I have only two columns: post and category. In the category column, I have labelled some of the rows "Positive" and others "Negative". The text in the rows is in Danish, and some topics contain links, quotation marks, etc.
You can have a look at my csv-file here:
https://dl.dropboxusercontent.com/u/3592722/Holdout.csv
And here's the error I get:
The two most important classifications I need to create/predict are:
- Positive/negative
- Subject (based on a list of subjects, with each of their keywords)
So here are the questions:
1) What am I doing wrong in the sentiment analysis?
2) Is it possible to build a prediction model that classifies topics and labels them with subject names based on the keywords they use (apple, win, etc.)?
I have one month left to learn this. Does that seem realistic?
Thanks in advance,
Answers
Your CSV file uses "," as the column separator. This is not optimal, because your text column probably also contains a lot of commas. Choose another separator that does not occur in your text column. You can also use the Excel format, where this problem does not occur. Additionally, rename your category column to "Sentiment".
Cheers,
Frank
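If you would rather fix the file outside RapidMiner, the advice above could be sketched in Python roughly like this. It is only a sketch under a few assumptions that I have not checked against Holdout.csv: each record sits on a single line, the label is the last field and contains no comma, and the header is the first line. The function name convert_two_column_csv and the list of candidate separators are made up for illustration.

```python
import csv
import io

# Assumption: candidate separators, tried in this order.
CANDIDATE_SEPARATORS = ["\t", ";", "|"]


def convert_two_column_csv(src, dst):
    """Rewrite a 'post,category' file with a separator absent from the text.

    src and dst are open text files (or file-like objects).
    Returns the separator that was chosen.
    """
    # Assumption: one record per line; blank lines are ignored.
    lines = [line.rstrip("\r\n") for line in src if line.strip()]
    rows = []
    for line in lines[1:]:  # lines[0] is the old header and is replaced below
        # Assumption: the label is the last field and contains no comma,
        # so splitting on the LAST comma keeps the whole post text together.
        parts = line.rsplit(",", 1)
        if len(parts) != 2:
            raise ValueError(f"no comma found in record: {line!r}")
        rows.append([parts[0].strip(), parts[1].strip()])

    # Pick the first candidate that never appears in any post.
    for sep in CANDIDATE_SEPARATORS:
        if not any(sep in post for post, _ in rows):
            break
    else:
        raise ValueError("every candidate separator occurs in the text")

    writer = csv.writer(dst, delimiter=sep)
    writer.writerow(["post", "Sentiment"])  # category column renamed
    writer.writerows(rows)
    return sep


# Small in-memory demo (made-up rows, not taken from Holdout.csv).
demo_in = io.StringIO("post,category\nHej, det virker fint,Positive\n")
demo_out = io.StringIO()
convert_two_column_csv(demo_in, demo_out)
print(demo_out.getvalue())
```

For a real file you would pass two files opened with open(..., newline="") instead of the StringIO objects, and then tell the CSV import in RapidMiner which separator was returned. The file encoding (probably UTF-8 for Danish text) is also an assumption you would need to check.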