General Results
Hi, I am new to RapidMiner and I am learning how to use it correctly. I built a simple tokenization process, but the results look quite different from the videos and instructions I have seen. In more detail: I used two operators (Read Document, Process Documents) with Tokenize in the sub-process, but I am receiving these results, with an uncomfortable layout. I am attaching a screenshot. Do you know what I am doing wrong? Why am I getting this result?
Thanks!
Best Answer
rfuentealba RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn
Hello @naxiota,
Ok, let's see how to help you. Since you are learning, I prefer you to see and repeat with me:
You have one file. Here I used a version of the Holy Bible (please see my disclaimer at the end), so I used the Open File operator to read it, extracted the content with the Read Document operator, and then passed it as a document to Process Documents.
Inside the Process Documents super-operator (the kind of operator that lets you put other operators inside it), I did this:
Well...
- Tokenizing on its own is boring: it only gives you an internal representation of the words, so you won't get anything meaningful from it.
- Also, it is a good idea to use Transform Cases, because it brings together words with mixed cases (otherwise they are not considered the same under normal comparison algorithms).
- Now on to something useful: let's extract all the nouns from the content. You can do many other things, but this is a simple exercise. To filter by part of speech you need the POS tags, which are listed here: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
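If it helps to see the three steps outside of RapidMiner, here is a minimal Python sketch of the same idea: tokenize, transform cases, then keep only the nouns. It is not what the RapidMiner operators do internally. The function names, the sample sentence, and the `TOY_TAGS` lookup are made up for illustration, and the lookup is hand-written to stand in for a real POS tagger. Only the noun tags (NN, NNS, NNP, NNPS) come from the Penn Treebank list linked above.

```python
import re

# Noun tags from the Penn Treebank tag set.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

# Hypothetical hand-written tag lookup, standing in for a real POS tagger.
TOY_TAGS = {
    "in": "IN", "the": "DT", "beginning": "NN", "god": "NNP",
    "created": "VBD", "heaven": "NN", "and": "CC", "earth": "NN",
}

def tokenize(text):
    """Split the text on anything that is not a letter and drop empty pieces."""
    return [piece for piece in re.split(r"[^A-Za-z]+", text) if piece]

def transform_cases(tokens):
    """Lowercase every token so that 'God' and 'god' count as the same word."""
    return [token.lower() for token in tokens]

def filter_nouns(tokens, tags):
    """Keep only the tokens whose tag is one of the noun tags."""
    return [token for token in tokens if tags.get(token) in NOUN_TAGS]

text = "In the beginning God created the heaven and the earth."
tokens = tokenize(text)
lowered = transform_cases(tokens)
nouns = filter_nouns(lowered, TOY_TAGS)

print(tokens)   # raw tokens, mixed case
print(lowered)  # same tokens, lowercased
print(nouns)    # only the tokens tagged as nouns
```

Notice that the raw token list by itself tells you very little; the result only becomes interesting once you filter it.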
I added the file, but please try figuring things out with the pictures I sent and the explanations first; you will learn more that way.
Hope this helps,
Rodrigo.
Answers
It all depends on what you are trying to accomplish. The best way is to check the tutorials available on your Repository tab. Since RapidMiner is huge, I would recommend that you take on a project of your own and learn how to transform data.
I can provide you with a few things, but currently I’m at the office. Ping me later!