K-means on CSV file
Hello everyone.
I have a CSV file containing blog posts, including author name, date posted, etc.
Now I want to apply K-means clustering to the blogs' content. I tried to use the RapidMiner text tool to apply TF-IDF vectorisation. However, I can't figure out how to apply TF-IDF to every blog post in the CSV file. Any suggestions?
Cheers!
Answers
You need TF-IDF only if you have the actual contents of the blog posts, i.e. the text. In that case you can find some useful video tutorials on text mining here: http://vancouverdata.blogspot.de/2010/11/text-analytics-with-rapidminer-loading.html
I would first focus on the text and add the other attributes like author and date later on. If you need help, feel free to come back to this forum.
Best regards,
Marius
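To make the idea of "TF-IDF for every blog post" concrete, here is a minimal sketch outside RapidMiner, in plain Python with only the standard library. It treats each CSV row as one document, builds one TF-IDF vector per row, and then runs a basic K-means on those vectors. The sample data, the column name `content`, and the helper names `tokenize`, `tfidf_vectors` and `kmeans` are all assumptions made for illustration; this is not how RapidMiner implements it internally, and the farthest-point initialisation is just one simple deterministic choice.

```python
import csv
import io
import math
import re
from collections import Counter

# Hypothetical sample data; a real file could be read with
# open("blogs.csv", newline="", encoding="utf-8") instead (file name assumed).
SAMPLE_CSV = """author,date,content
alice,2024-01-01,python code and more python code
bob,2024-01-02,python code tips for clean code
carol,2024-01-03,baking bread and baking cake
dave,2024-01-04,bread recipes and cake recipes
"""


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, one L2-normalised TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty posts
        vec = [counts[t] / total * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Basic K-means with deterministic farthest-point initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
docs = [row["content"] for row in rows]  # only the text column is vectorised
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)

for row, label in zip(rows, labels):
    print(f"{row['author']}: cluster {label}")
```

Only the text column goes into the vectors here; author and date stay in `rows`, so they can be joined back to the cluster labels afterwards, in line with the advice above to deal with the text first.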