Sentiment Analysis
crimson_crow
Member Posts: 3 Learner II
in Help
Hello! I'm new to RapidMiner and I want to learn sentiment analysis for my coursework. The goal is to build a model that classifies reviews as positive or negative. The program includes an example process, but I want to change a couple of things:
1. Replace the example set with my own, which has more data.
2. Instead of a document containing a single review to be scored by the model, I want to use an .xlsx file of reviews that I parsed from the IMDb site.
The problems are in the "Cross Validation" operator (screenshot "First Problem") and in the "Read Document" operator (screenshot "Second Problem").
I can't understand why the "Cross Validation" operator reports a type problem, because my data has the same structure as the example. Also, which operator should I use to read the parsed data from the .xlsx file correctly?
Best Answer
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @crimson_crow,
Thanks for sharing your process and data.
You have to:
- Apply the same pre-processing steps in your training branch and in your scoring branch: put a Nominal to Text operator in your scoring branch (you don't need a Read Document operator).
- Add a Process Documents from Data operator to your scoring branch, as in your training branch.
- Simplify your Cross Validation operator: I just use an SVM model in the training part, and an Apply Model and a Performance (Binominal Classification) operator in the testing part.
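To illustrate why the first two points matter, here is a minimal plain-Python sketch (not RapidMiner, and not taken from the attached process). It assumes a simple bag-of-words representation; the function names and the toy reviews are hypothetical. The idea it shows is that the word list is built from the training reviews only and then reused unchanged for the reviews to be scored, so both branches produce vectors with identical columns.

```python
import re

def tokenize(text):
    """Shared pre-processing step: lowercase and split into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_word_list(texts):
    """Build the vocabulary from the training texts only."""
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {word: i for i, word in enumerate(vocab)}

def vectorize(text, word_list):
    """Count terms over the training word list; unseen words are dropped."""
    vec = [0] * len(word_list)
    for tok in tokenize(text):
        idx = word_list.get(tok)
        if idx is not None:
            vec[idx] += 1
    return vec

# Hypothetical toy data standing in for the training set and the .xlsx reviews.
train_reviews = ["A great, moving film", "Terrible plot and bad acting"]
score_reviews = ["Great acting, but a boring ending"]

# Fit once on the training branch, then reuse on the scoring branch.
word_list = build_word_list(train_reviews)
train_vectors = [vectorize(t, word_list) for t in train_reviews]
score_vectors = [vectorize(t, word_list) for t in score_reviews]

# Both branches now share the same attribute layout.
print(len(train_vectors[0]), len(score_vectors[0]))
```

If the scoring texts were tokenized or vectorized differently from the training texts, a model trained on the first layout would receive attributes it does not recognize, which is the kind of mismatch the steps above are meant to avoid.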
The working process is in the attached file.
Regards,
Lionel