"How to work and use word2vec"
Hello
I want to use Word2vec to convert sentences to vectors,
but I do not know how to do it. My data is from Twitter.
Could you please help me by sending a screenshot of the operator setup?
Thanks
Answers
Hi and welcome to the community.
First, install the Word2Vec extension from the RapidMiner Marketplace.
Once that is done, you'll find the Word2Vec operator via the Operators search bar (or the global search).
You can check this excellent post by Martin on how to apply the extension:
https://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/Synonym-Detection-with-Word2Vec/ta-p/43860
Best,
David
Hello
Thank you.
I saw the link.
Unfortunately, the pictures are not clear.
Could you send an image showing how to use this operator to analyze emotions?
Many thanks
Dear @khazan,
The mentioned post has an attached .zip file. It contains the full analysis, so you can load it into RapidMiner and use it yourself.
Best,
Martin
Dortmund, Germany
Hello
Thank you for your attention
I downloaded the sample file for Word2vec and loaded it into RapidMiner,
but there is an error,
and I do not know what the process for using this operator is.
Please give me guidance.
thank you again
Hi,
This operator expects a collection of tokenized documents as input, not an Example Set.
~Martin
Dortmund, Germany
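To illustrate what a "collection of tokenized documents" means outside of RapidMiner, here is a minimal Python sketch. It is not the extension's implementation; the sample tweets, the `tokenize` helper, and the regular expression are assumptions made purely for illustration.

```python
import re

# Hypothetical sample tweets; replace with your own data.
tweets = [
    "Loving the new phone! #happy",
    "Worst service ever @airline :(",
]

# Assumed, very simple token pattern: runs of letters, digits, or apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


# A collection of tokenized documents: one list of tokens per tweet.
# This differs from a table (Example Set), where each tweet is a single cell.
tokenized_docs = [tokenize(tweet) for tweet in tweets]

for doc in tokenized_docs:
    print(doc)
```

The point of the sketch is the shape of the data: a list of documents, each already split into tokens, rather than one table with a text column.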
I also tried this nice example and got this error
My input data
My data looks like this
Any hint...?
EDIT - Add RM Logfile
Thanks!
Thomas
Hi,
how many docs do you feed in? More than number of negative samples? Can you please try more?
Best,
Martin
Dortmund, Germany
Only this one example from the web page of point 1
The data is provided in one flat file for each hotel with the following structure:
Hi,
You need to provide more data to be able to run word2vec. A single example is not enough to train the model. I've tested it, and it starts to work with 5 examples.
Best,
Martin
Dortmund, Germany
I copied the original record and changed something. Now the database is two *.dat files.
After this, I get the same error...
EDIT
OK, I built 5 data files and will try again...
Regards,
Thomas
Right, with five files, the process works. I'll continue testing now...
Thanks!
Thomas
@mschmitz
You wrote: "how many docs do you feed in? More than number of negative samples? Can you please try more?"
Now I tried with five small (German) text samples, and the process failed like before. Is there a lower bound or a parameter for this?
Please explain..
Regards
Thomas
Hi,
Well, the algorithm itself only makes sense if you have a lot of data. Honestly, it would make some sense if I added an error for fewer than a thousand examples. Fewer than that usually does not yield good results.
I am not sure where the exact boundary of 5 comes from, though.
Best,
Martin
Dortmund, Germany
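As a rough way to check corpus size before training, something like the following Python sketch could be used. It only counts documents, tokens, and distinct words; the threshold of 1000 documents is the rule of thumb mentioned above, not a documented limit of the extension, and the function names and sample data are hypothetical.

```python
from collections import Counter


def corpus_summary(tokenized_docs):
    """Count documents, total tokens, and distinct tokens in a corpus."""
    counts = Counter(token for doc in tokenized_docs for token in doc)
    return {
        "documents": len(tokenized_docs),
        "tokens": sum(counts.values()),
        "vocabulary": len(counts),
    }


def looks_too_small(summary, min_documents=1000):
    """Flag corpora below an assumed rule-of-thumb document count."""
    return summary["documents"] < min_documents


# Hypothetical tiny corpus, far below the rule of thumb.
sample_docs = [["good", "hotel"], ["bad", "hotel"]]
summary = corpus_summary(sample_docs)
print(summary)
print("Probably too small for word2vec:", looks_too_small(summary))
```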
Hello
I've preprocessed and tokenized the data, but I get an error.
Could you take a look?
Please help.
Thank you
Hi,
Process Documents converts this into a bag of words. Please have a look at the example processes. They show you how to do this with Loop Collection.
Best,
Martin
Dortmund, Germany
I want to use word2vec to analyze emotions.
I downloaded the tutorial from the site.
Instead of a text file as input, I have a CSV file and an XLSX file.
I used the Read Excel operator inside the loop operator, but it gives an error in the output.
Please provide guidance on how to use this sample with my CSV file.
Also, what are these paths
C:\Users\Martin\Arbeit\Tripadvisor
../results/Replacement Dictionary
in the sample on the site?
Should I have a dictionary? How do I create it, and where do I get it from?
Thank you
hi
please please helppp me...
I really need help
please help me
Hi,
Please try Read Excel, Nominal to Text, and Data to Documents. That should create a collection that can be handled by Loop Collection.
Best,
Martin
Dortmund, Germany
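For readers who want to see the same idea outside of RapidMiner, here is a hedged Python analogue of the chain above: read tabular data, treat the text column as plain text, and turn each row into one document. The column name `text`, the inline sample data, and the `rows_to_documents` helper are assumptions for illustration, not part of the RapidMiner process.

```python
import csv
import io

# Hypothetical CSV content; in practice, open your own file instead.
raw = io.StringIO("id,text\n1,Great hotel\n2,\n3,Too noisy at night\n")


def rows_to_documents(handle, text_column="text"):
    """Return one document (string) per row that has non-empty text."""
    documents = []
    for row in csv.DictReader(handle):
        # Force the cell to a plain string, similar in spirit to Nominal to Text.
        text = str(row.get(text_column) or "").strip()
        if text:  # skip rows without any text
            documents.append(text)
    return documents


# The resulting list plays the role of the document collection.
documents = rows_to_documents(raw)
print(documents)
```

Each entry in `documents` would then still need to be tokenized before being passed to a word2vec implementation.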
I tried it this way, but it gives an error
help
@khazan that error means no data is being passed out of the Loop Collection. I would put a breakpoint on the Loop Files operator and check whether data is reaching the Loop Collection.
I tried that but did not get any results.
If there is an alternative to the Loop Files operator, please point me to it.
Please tell me how to use word2vec for Twitter data. Thanks
I beg you to help me.
I really need help with this.
Hello @khazan, some quick recommendations for you:
• Post your XML process here in this thread (see https://youtu.be/KkgB5QXWXJ8 and "Read Before Posting" on the right when you reply)
• Attach your dataset if possible (use a fictionalized version if there are privacy concerns)
• Make sure you have all necessary extensions installed (see https://youtu.be/pjBqG3xtXx4)
Scott