10 most important words
Ev_Lazarou
Member Posts: 3 Learner I
I am facing a problem that I have not been able to solve so far:
I am trying to find the most important words from a dataset. How could I do this?
Best Answer
BalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
Hi!
You can get the word list from the Process Documents operator. It contains statistics for each term in relation to the labels.
You can also do things like selecting the documents with the highest confidence for each class and then searching for the terms with the highest values (e.g. aggregate with sum, then transpose the table).
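Outside RapidMiner, the aggregate, sum, transpose idea can be sketched in a few lines of plain Python. This is only an illustration: the rows, the term values and the helper name `top_terms_per_class` are made up, and it assumes the per-document term values (for example TF-IDF) have already been exported together with the label.

```python
from collections import defaultdict

# Hypothetical term-weight rows, one per document (e.g. TF-IDF values).
# All values are invented for illustration.
docs = [
    {"label": "fake", "shocking": 0.8, "report": 0.1, "secret": 0.6},
    {"label": "fake", "shocking": 0.7, "report": 0.0, "secret": 0.5},
    {"label": "real", "shocking": 0.1, "report": 0.9, "secret": 0.0},
    {"label": "real", "shocking": 0.0, "report": 0.7, "secret": 0.0},
]

def top_terms_per_class(rows, n=10):
    # Aggregate: sum every term column within each label.
    sums = defaultdict(lambda: defaultdict(float))
    for row in rows:
        for term, value in row.items():
            if term != "label":
                sums[row["label"]][term] += value
    # "Transpose": terms become rows, sorted by summed value (descending).
    return {
        label: sorted(terms.items(), key=lambda kv: kv[1], reverse=True)[:n]
        for label, terms in sums.items()
    }

top = top_terms_per_class(docs, n=2)
for label, terms in top.items():
    print(label, [term for term, _ in terms])
```

With real data you would pass `n=10` and, if you follow the suggestion above, keep only the documents with the highest confidence for each class before summing.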
Best regards,
Balázs
Answers
Hello
Could you please explain your question more?
Thank you
Sara
I uploaded two CSV files, preprocessed them (following an exercise for my university exams), and cross-validated them with three algorithms. The last part of the exercise asks us to prepare a graph showing the 10 most important (not most common) words in fake news (one CSV file) and the 10 most important words in real news (the other CSV file).
I am uploading screenshots of the processes I have run so far to make the setup a little clearer.
I have already found the most important words in the entire text using the Weight by Information Gain operator. Separately, I used WordList to Data to get the document occurrences and total occurrences. How can I merge the two and see the results?
Where do I have to use Aggregate and sum?
Thank you!
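One possible way to think about the merge, sketched in plain Python rather than as a RapidMiner process: join the information gain weights with the word list on the word, assign each word to the class in which it occurs more often, and keep the highest-weighted words per class. Everything here is an assumption for illustration: the numbers are invented, the column names `in_fake` and `in_real` are hypothetical stand-ins for per-class counts in the word list, and `top_weighted_words` is not a RapidMiner function.

```python
# Hypothetical export of Weight by Information Gain: word -> weight.
weights = {"shocking": 0.42, "report": 0.35, "secret": 0.20, "the": 0.01}

# Hypothetical export of WordList to Data with per-class occurrence counts.
wordlist = [
    {"word": "shocking", "in_fake": 40, "in_real": 3},
    {"word": "report", "in_fake": 5, "in_real": 50},
    {"word": "secret", "in_fake": 30, "in_real": 2},
    {"word": "the", "in_fake": 100, "in_real": 110},
]

def top_weighted_words(weights, wordlist, n=10):
    merged = []
    for row in wordlist:
        word = row["word"]
        if word not in weights:
            continue  # inner join: keep only words present in both tables
        # Assign the word to the class where it occurs more often.
        dominant = "fake" if row["in_fake"] > row["in_real"] else "real"
        merged.append({**row, "weight": weights[word], "class": dominant})
    # Rank all words by information gain, highest first.
    merged.sort(key=lambda r: r["weight"], reverse=True)
    result = {"fake": [], "real": []}
    for r in merged:
        if len(result[r["class"]]) < n:
            result[r["class"]].append(r["word"])
    return result

result = top_weighted_words(weights, wordlist, n=2)
print(result)
```

Comparing raw counts like this is only fair if both files contain a similar number of documents; otherwise the counts would need to be divided by the number of documents in each class first.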