"Using Word Vector in Model"
Hi
Issue/Problem:
- I have created a word vector using the "Process Documents from Data" operator.
- I configured the operator to create Term Occurrences.
- This generated over 3,000 attributes, each showing the number of times a word appears in an example. E.g. the word "good" appears 10 times in row 5 (so far so good).
- I now want to select some of these words and use them as attributes when building a model.
- I thought the Select Attributes operator would do this, but it only shows the original attributes and not the new word vector attributes that were created.
Can someone point me to the correct operator so that I can select the word vectors I want to use?
Thanks
Duffy
Best Answer
MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
Hi Duffy,
The problem here is metadata propagation. RM cannot predict from the metadata alone which attributes will be present, because the word attributes only exist once the process has actually run. What you can try is to use the metadata from the last execution. To do this, choose Process->Synchronize Meta Data with Real Data and run the process once.
~Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
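To illustrate the metadata point outside of RapidMiner, here is a minimal Python sketch of a term-occurrence table. It is not RapidMiner code; the function name `term_occurrences` and the sample texts are made up for illustration. It shows that the column names (the words) are derived from the data, so they cannot be listed before the texts have been processed at least once.

```python
import re
from collections import Counter

def term_occurrences(texts):
    """Return (sorted vocabulary, one row of counts per text)."""
    # Lowercase each text and split it into simple word tokens.
    counts = [Counter(re.findall(r"[a-z']+", t.lower())) for t in texts]
    # The vocabulary is whatever words happen to occur in the data.
    vocab = sorted(set().union(*counts))
    rows = [[c[w] for w in vocab] for c in counts]
    return vocab, rows

texts = ["Good food, good service", "Bad food"]  # hypothetical examples
vocab, rows = term_occurrences(texts)
print(vocab)  # column names are only known after processing the texts
print(rows)   # one row of term occurrences per example
```

Feeding in different texts yields different columns, which is why an attribute selection step placed after the word vector has nothing to show until real data has passed through.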
Answers
Thanks Martin for your reply.
I understand the problem.
Your solution solved the problem.
However, before marking this thread as "Solved": I would prefer to avoid the long process of generating a full word vector and instead generate word occurrences only for a pre-defined set of words.
For example, I have 5 words or phrases (good, great, wonderful, bad, not good) and I want to know how frequently they are mentioned in the text.
What operator would I use to extract this information?
Duffy
Hi,
Good question. I would build a dummy word vector on one text to get a word list. Afterwards you can plug this word list into your usual Process Documents operator to get just the 5 words you want.
~Martin
Dortmund, Germany
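For readers who want to see the idea of counting only a fixed word list, here is a rough Python sketch. It is an illustration of the concept, not the RapidMiner word list mechanism; `TERMS`, `tokenize`, and `count_terms` are hypothetical names, and the tokenization is deliberately simple. Phrases such as "not good" are matched as consecutive tokens.

```python
import re

# The pre-defined words and phrases from the question above.
TERMS = ["good", "great", "wonderful", "bad", "not good"]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_terms(text, terms=TERMS):
    """Count how often each term (word or phrase) occurs in the text."""
    tokens = tokenize(text)
    result = {}
    for term in terms:
        parts = term.split()
        n = len(parts)
        # Slide a window of n tokens over the text and compare it to the term.
        result[term] = sum(
            1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == parts
        )
    return result

print(count_terms("Good food, not good service, great view. Good!"))
```

Note that in this sketch the "good" inside "not good" is also counted under "good". If that is not wanted, the phrase matches would have to be subtracted or the phrase tokens consumed first.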
Thanks