
"Can a groovy script count clusters?"

awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
edited May 2019 in Help
Hello all,

The Cluster Count Performance operator returns very odd values. I decided to look at the code to see what was going on, and I noticed these lines in the file 'ClusterNumberEvaluator.java' at about line 90:

for (int i = 0; i < model.getNumberOfClusters(); i++)
          numItems = +model.getCluster(i).getNumberOfExamples();
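Because "= +" is a plain assignment of a unary-plus expression rather than an accumulation, numItems ends up holding only the number of examples in the last cluster, not the total.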

This value gets used later in this line:

PerformanceCriterion pc = new EstimatedPerformance("Number of clusters", 1.0 - (((double) model.getNumberOfClusters()) / ((double) numItems)), 1, false);
So it leads to weird values. I think using numItems += instead of numItems = + should fix it.
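To make the difference concrete, here is a minimal standalone sketch. It does not use the RapidMiner API: the class name ClusterCountDemo, its methods, and the cluster sizes 50, 30 and 20 are all made up for illustration. Only the loop body and the 1.0 - clusters / items formula mirror the lines quoted above.

```java
public class ClusterCountDemo {

    // Mirrors the quoted loop: "= +" assigns the unary-plus value,
    // so each iteration overwrites the previous one.
    static int buggyTotal(int[] clusterSizes) {
        int numItems = 0;
        for (int i = 0; i < clusterSizes.length; i++)
            numItems = +clusterSizes[i];
        return numItems;
    }

    // Proposed fix: "+=" accumulates the sizes of all clusters.
    static int fixedTotal(int[] clusterSizes) {
        int numItems = 0;
        for (int i = 0; i < clusterSizes.length; i++)
            numItems += clusterSizes[i];
        return numItems;
    }

    // Same formula as the value passed to EstimatedPerformance above.
    static double criterion(int numClusters, int numItems) {
        return 1.0 - (((double) numClusters) / ((double) numItems));
    }

    public static void main(String[] args) {
        int[] sizes = {50, 30, 20}; // hypothetical cluster sizes

        int buggy = buggyTotal(sizes); // 20, the size of the last cluster only
        int fixed = fixedTotal(sizes); // 100, the total number of examples

        System.out.println("buggy numItems = " + buggy);
        System.out.println("fixed numItems = " + fixed);
        System.out.println("buggy criterion = " + criterion(sizes.length, buggy));
        System.out.println("fixed criterion = " + criterion(sizes.length, fixed));
    }
}
```

With these made-up sizes the buggy version divides 3 clusters by 20 items (criterion about 0.85), while the fixed version divides by 100 (about 0.97). That would explain why the reported value depends on whichever cluster happens to come last.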

Anyway, my question, before I embark on it: will it be possible to use the Groovy scripting operator to calculate this myself?

regards

Andrew

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    ok, that's a typo causing a lot of headache :) I corrected this by removing the blank between = and +...

    Additionally, I have changed the behavior of the operator so that it now returns two criteria, one of which contains the actual number.

    And of course you can do this with the Groovy scripting operator.

    Greetings,
      Sebastian
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello Sebastian

    If you say it can be done with Groovy, then I'll try it.

    regards

    Andrew