NaN problems with MinMaxNormalization and precision measure
Hi,
I noticed two bugs (?) in the MinMaxNormalization and WeightedMultiClassPerformance classes.
MinMaxNormalization:
If an attribute always has the same value, its values are normalized to NaN, because the observed minimum and maximum are equal and the formula divides 0 by 0. Is this normalization behaviour really intended? It can lead to strange results from learning operators, since some of them (e.g. LibSVM) don't handle unknown values well. Here's my proposed fix:
### Eclipse Workspace Patch 1.0
#P yale
Index: src/com/rapidminer/operator/preprocessing/normalization/MinMaxNormalizationModel.java
===================================================================
RCS file: /cvsroot/yale/yale/src/com/rapidminer/operator/preprocessing/normalization/MinMaxNormalizationModel.java,v
retrieving revision 1.11
diff -u -r1.11 MinMaxNormalizationModel.java
--- src/com/rapidminer/operator/preprocessing/normalization/MinMaxNormalizationModel.java 14 Jan 2009 13:45:34 -0000 1.11
+++ src/com/rapidminer/operator/preprocessing/normalization/MinMaxNormalizationModel.java 12 Mar 2009 10:56:13 -0000
double value = example.getValue(attribute);
double minA = range.getFirst().doubleValue();
double maxA = range.getSecond().doubleValue();
- example.setValue(attribute, (value - minA) / (maxA - minA) * (max - min) + min);
+ if (maxA == minA || min == max) {
+ example.setValue(attribute, Math.min(Math.max(minA, min), max));
+ } else {
+ example.setValue(attribute, (value - minA) / (maxA - minA) * (max - min) + min);
+ }
}
}
}
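To make the effect easier to see outside of RapidMiner, here is a small stand-alone sketch of the same arithmetic. The class `MinMaxSketch` and its methods `naive` and `guarded` are hypothetical names made up for this illustration and are not part of the RapidMiner API; only the formula and the guard condition are taken from the patch above.

```java
// Stand-alone illustration only; MinMaxSketch is a hypothetical helper,
// not a RapidMiner class.
final class MinMaxSketch {

    // Unpatched formula: rescales value from the observed range [minA, maxA]
    // to the target range [min, max]. Yields NaN (0.0 / 0.0) when minA == maxA.
    static double naive(double value, double minA, double maxA, double min, double max) {
        return (value - minA) / (maxA - minA) * (max - min) + min;
    }

    // Same guard as in the proposed patch: for a constant attribute (or an
    // empty target range) return the constant clamped into [min, max].
    static double guarded(double value, double minA, double maxA, double min, double max) {
        if (maxA == minA || min == max) {
            return Math.min(Math.max(minA, min), max);
        }
        return naive(value, minA, maxA, min, max);
    }

    public static void main(String[] args) {
        // Constant attribute (always 5.0), target range [0, 1].
        System.out.println(naive(5.0, 5.0, 5.0, 0.0, 1.0));    // prints NaN
        System.out.println(guarded(5.0, 5.0, 5.0, 0.0, 1.0));  // prints 1.0
        // Non-constant attribute: both variants agree.
        System.out.println(guarded(7.5, 5.0, 10.0, 0.0, 1.0)); // prints 0.5
    }
}
```

Note that with this guard a constant attribute is mapped to its own value clamped into the target range, so a constant 5.0 ends up at 1.0 for the range [0, 1]. Mapping it to the lower bound or the midpoint would be just as defensible; that is a design choice.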
WeightedMultiClassPerformance:
The average precision is NaN if there is a class that is never predicted by a model. The reason is that the precision for this class is NaN. Here's another possible fix:
### Eclipse Workspace Patch 1.0
#P yale
Index: src/com/rapidminer/operator/performance/WeightedMultiClassPerformance.java
===================================================================
RCS file: /cvsroot/yale/yale/src/com/rapidminer/operator/performance/WeightedMultiClassPerformance.java,v
retrieving revision 1.6
diff -u -r1.6 WeightedMultiClassPerformance.java
--- src/com/rapidminer/operator/performance/WeightedMultiClassPerformance.java 9 May 2008 19:22:43 -0000 1.6
+++ src/com/rapidminer/operator/performance/WeightedMultiClassPerformance.java 12 Mar 2009 11:02:28 -0000
}
result = 0.0d;
for (int r = 0; r < rowSums.length; r++) {
- result += classWeights[r] * (counter[r][r] / rowSums[r]);
+ double p = counter[r][r] / rowSums[r];
+ result += classWeights[r] * (Double.isNaN(p) ? 0 : p);
}
result /= weightSum;
return result;
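The same idea as a stand-alone sketch, in case someone wants to reproduce the NaN without a full process. `WeightedPrecisionSketch` and its `weightedPrecision` method are hypothetical names for this illustration, not RapidMiner code. I am also assuming that each row of `counter` holds the counts for one predicted class, with the correct predictions on the diagonal, and that `weightSum` is simply the sum of the class weights.

```java
// Stand-alone illustration only; WeightedPrecisionSketch is a hypothetical
// helper, not a RapidMiner class.
final class WeightedPrecisionSketch {

    // counter[r][c] = number of examples predicted as class r whose true
    // class is c (assumed layout). A class that is never predicted has an
    // all-zero row, so its precision is 0.0 / 0.0 = NaN.
    static double weightedPrecision(double[][] counter, double[] classWeights, boolean guardNaN) {
        double result = 0.0d;
        double weightSum = 0.0d;
        for (int r = 0; r < counter.length; r++) {
            double rowSum = 0.0d;
            for (double cell : counter[r]) {
                rowSum += cell;
            }
            double p = counter[r][r] / rowSum;
            if (guardNaN && Double.isNaN(p)) {
                p = 0.0d; // count a never-predicted class as precision 0
            }
            result += classWeights[r] * p;
            weightSum += classWeights[r];
        }
        return result / weightSum;
    }

    public static void main(String[] args) {
        // Three classes; the third one is never predicted (row of zeros).
        double[][] counter = {
            {3, 1, 0},
            {1, 2, 1},
            {0, 0, 0}
        };
        double[] weights = {1, 1, 1};
        System.out.println(weightedPrecision(counter, weights, false)); // prints NaN
        System.out.println(weightedPrecision(counter, weights, true));  // about 0.4167
    }
}
```

Counting the missing class as 0 pulls the average down. Skipping the class and dropping its weight from `weightSum` would be an alternative, but that is not what the patch above does.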
Answers
Thanks for sending in those fixes. Both seemed very reasonable to me, and we have just incorporated them into the latest CVS developer branch. They will of course also be part of the upcoming release.
Thanks again and cheers,
Ingo