Text mining Excel using clustering to obtain confusion matrix

brunonbrasil Member Posts: 8 Contributor II
edited January 2020 in Help
Hi,
I'm new to RapidMiner and I need a confusion matrix to validate clusters obtained from text. Do you know how to do this?

Best Answer

  • brunonbrasil Member Posts: 8 Contributor II
    Solution Accepted
    Hello
    I think the solution is this:

Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Clustering is a form of unsupervised machine learning, so it is not possible to generate a confusion matrix directly from it. You would first need to turn the clusters into a label and then have another process assign the clusters, so that you can compare the two outputs. Or, if you already have an existing label with the same number of categories as clusters, you can use the Map Clusters on Labels operator to do this automatically and then use a normal Performance operator to generate the confusion matrix.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
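
    For readers who want to see the idea outside RapidMiner, here is a minimal pure-Python sketch of the second approach: map each cluster to one existing label, then count a confusion matrix. It is not how the Map Clusters on Labels operator is implemented. The function names and the toy data are hypothetical, and the brute-force search over permutations is only practical for a handful of clusters.

```python
from collections import Counter
from itertools import permutations


def map_clusters_to_labels(clusters, labels):
    """Return a one-to-one {cluster: label} mapping that maximizes agreement.

    Assumes the number of distinct clusters equals the number of distinct
    labels. Brute force, so only suitable for a small number of clusters.
    """
    cluster_ids = sorted(set(clusters))
    label_ids = sorted(set(labels))
    if len(cluster_ids) != len(label_ids):
        raise ValueError("need the same number of clusters and labels")
    # How often each (cluster, label) pair occurs together.
    pair_counts = Counter(zip(clusters, labels))
    best_mapping, best_hits = None, -1
    for perm in permutations(label_ids):
        hits = sum(pair_counts[(c, l)] for c, l in zip(cluster_ids, perm))
        if hits > best_hits:
            best_mapping, best_hits = dict(zip(cluster_ids, perm)), hits
    return best_mapping


def confusion_matrix(true_labels, predicted_labels):
    """Return (label order, matrix); rows are true labels, columns predicted."""
    order = sorted(set(true_labels) | set(predicted_labels))
    index = {label: i for i, label in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return order, matrix


# Hypothetical toy data: manual labels and cluster ids for six sentences.
labels = ["sports", "sports", "sports", "politics", "politics", "politics"]
clusters = ["cluster_1", "cluster_1", "cluster_0",
            "cluster_0", "cluster_0", "cluster_0"]

mapping = map_clusters_to_labels(clusters, labels)
predicted = [mapping[c] for c in clusters]
order, matrix = confusion_matrix(labels, predicted)
print(mapping)
print(order)
print(matrix)
```

    The mapping is forced to be one-to-one because a confusion matrix needs each cluster to stand for exactly one class; a per-cluster majority vote would be simpler but can assign two clusters to the same label.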
  • brunonbrasil Member Posts: 8 Contributor II
    I built this process to produce the confusion matrix. I managed to get one, but I don't know whether it is set up correctly. Does it make sense to you?
  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi @brunonbrasil, based on that screenshot you are using a very old version of RapidMiner Studio. I would highly recommend updating to the most recent version (9.5.1).

    Scott
  • brunonbrasil Member Posts: 8 Contributor II
    I treat Context as the label, meaning the clusters I assigned manually, which I compare with the clusters I intend to obtain. Receiver holds the data as sentences.