"word vector with additional attributes"
Hi,
I'd like to use a single CSV file for classification. The file has several columns (ID, title, text, class). I want to build the word vector of the text (that works) and combine it with the weighted title, something like 1/3 text and 2/3 title. The result should then be used for classification, with the classes also taken from this same CSV file.
· How can I combine these two columns and weight them? (I also don't understand why it's not enough to say these columns are of type text, and why I additionally have to convert them from nominal to text.)
· How can I use the class as the label? (Declaring it as the special attribute label seems not to be enough; I have to set the role as well.)
I'd be very glad, if you could give me a hint, maybe naming the operators I should try or giving the order of operations.
THX a lot,
dali.
Answers
Use the Process Documents from Data operator and tick the "specify weights" checkbox.
I have a video series on text mining with RM here: http://www.youtube.com/user/VancouverData
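For anyone who wants to see what weighting the two fields amounts to outside of RapidMiner, here is a minimal plain-Python sketch. It is not how the operator is implemented; it only illustrates the idea of a 2/3 title, 1/3 text combination using simple relative term frequencies. The function names, the sample rows, and the whitespace tokenization are all assumptions made up for this illustration.

```python
from collections import Counter


def term_frequencies(text):
    """Relative term frequencies of a lowercased, whitespace-tokenized string."""
    tokens = text.lower().split()
    total = len(tokens)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(tokens).items()}


def weighted_word_vector(title, text, title_weight=2 / 3, text_weight=1 / 3):
    """Combine the title and text vectors term by term with the given weights.

    Hypothetical helper: the weights mirror the 2/3 title, 1/3 text split
    from the question.
    """
    title_tf = term_frequencies(title)
    text_tf = term_frequencies(text)
    vocabulary = set(title_tf) | set(text_tf)
    return {
        term: title_weight * title_tf.get(term, 0.0)
        + text_weight * text_tf.get(term, 0.0)
        for term in vocabulary
    }


# Made-up rows standing in for the CSV columns (ID, title, text, class).
rows = [
    {"id": 1, "title": "cheap flights", "text": "book cheap flights to rome", "class": "travel"},
    {"id": 2, "title": "python tips", "text": "useful tips for python beginners", "class": "tech"},
]

# Features come from title and text only; the class column is kept apart
# as the label, which is the role that Set Role assigns in RapidMiner.
vectors = [weighted_word_vector(row["title"], row["text"]) for row in rows]
labels = [row["class"] for row in rows]
```

The resulting dictionaries can be fed to any classifier that accepts sparse feature mappings, with `labels` as the target. A term that appears only in the title ends up with twice the influence it would have had with equal weights, which is the effect the question asks for.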