"Top Down Clustering - determining Item Number of lower Level Clusters"
tiramisusann
Member Posts: 9 Contributor II
I'm using RapidMiner for my Master's thesis. I need to cluster a large amount of textual data, with the goal of identifying the document in the database that is most similar to an incoming document.
For that I need to define a top-down clustering. At the lowest level it should contain clusters with only ONE document each (otherwise it would not be possible to find the single most similar document). The incoming document should follow the path it is most similar to, which is determined by comparing the centroid vectors of the clusters with the document vector. That way the search terminates at the cluster containing the most similar document.
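To make the idea concrete, here is a minimal sketch in plain Python, outside of RapidMiner. It is not a RapidMiner process and does not use any RapidMiner operator. It assumes each document is already a numeric vector (for example TF-IDF weights), uses cosine similarity, and splits every cluster in two with a small 2-means step until each leaf holds exactly one document. All names (`split_in_two`, `build_tree`, `most_similar`) and the toy vectors are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_in_two(ids, vecs, iterations=10):
    """Split a cluster of >= 2 documents into two non-empty groups (simple 2-means)."""
    # Seeds: the first document and the document least similar to it.
    seed_a = ids[0]
    seed_b = min(ids[1:], key=lambda i: cosine(vecs[seed_a], vecs[i]))
    centers = [vecs[seed_a], vecs[seed_b]]
    # Fallback partition, used if the refinement degenerates (e.g. duplicate documents).
    groups = [[ids[0]], list(ids[1:])]
    for _ in range(iterations):
        new_groups = [[], []]
        for i in ids:
            sims = [cosine(vecs[i], c) for c in centers]
            new_groups[0 if sims[0] >= sims[1] else 1].append(i)
        if not new_groups[0] or not new_groups[1]:
            break  # keep the last valid partition
        if new_groups == groups:
            break  # converged
        groups = new_groups
        centers = [centroid([vecs[i] for i in g]) for g in groups]
    return groups

def build_tree(ids, vecs):
    """Top-down clustering: keep splitting until every leaf has one document."""
    node = {"ids": ids, "centroid": centroid([vecs[i] for i in ids]), "children": []}
    if len(ids) > 1:
        node["children"] = [build_tree(g, vecs) for g in split_in_two(ids, vecs)]
    return node

def most_similar(tree, query):
    """Follow the child whose centroid is most similar to the query, down to a leaf."""
    node = tree
    while node["children"]:
        node = max(node["children"], key=lambda c: cosine(query, c["centroid"]))
    return node["ids"][0]

# Toy document vectors (hypothetical data, 3 terms).
docs = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.0, 0.9, 0.2],
    "e": [0.0, 0.0, 1.0],
}
tree = build_tree(list(docs), docs)
print(most_similar(tree, [1.0, 0.02, 0.0]))
```

Note that this greedy descent only looks at one branch per level, so it is not guaranteed to end at the globally most similar document: a wrong turn near the root cannot be corrected further down.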
But how can I implement this idea in RapidMiner? I have no clue how to tell RapidMiner that the clusters at the last, lowest level should contain only a single document.
I would be very grateful if anyone could help.
Thanks so much,
tiramisusann
Answers
Happy Mining!
~Marius