New Logistic Regression Operator: Strange Behavior
Typically, in the absence of knowledge about the relative cost of misclassification errors, a classifier should classify an observation as a member of the "True" class if Probability(True) > 0.5. That's the behavior of most classifiers in RapidMiner (including W-Logistic).
The new "Logistic Regression" classifier seems to be the exception. It classifies an observation as True if Prob(True) > 0.3 (or, in RapidMiner terminology, if Confidence(True) > 0.3). I'm attaching a process showing this behavior. Just run it, then plot a histogram of Confidence(True) and color it using the variable Prediction(label).
A picture of the histogram is attached to this message too.
Best Answer
Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
I tested that too; the other RapidMiner/Weka operators do operate as they should. Based on the H2O documentation, I think it's the F1 optimization, but I will confirm.
Answers
In the sample process you attached, you use a deep learning operator inside the CV. Is this correct?
No. I used the new LogisticRegression operator. I didn't even use cross-validation.
The problem seems to be the GeneralizedLinearRegression routine. I swapped the operators (GLM in place of Logistic Regression) with the right settings (family=binomial, etc.) and I get the same behavior.
I see what you're saying. Hmm, let me investigate.
That's very curious. Did you try comparing the results with the Weka version of the logistic regression operator?
@yyhuang pointed out to me that it might be related to H2O's F1 optimization for binomial data sets in the GLM algo: http://ethen8181.github.io/machine-learning/h2o/h2o_glm/h2o_glm.html
Will continue to investigate.
@Telcontar120 I tested this out using the Weka LR and the old RapidMiner SVM LR algo; both give me a label flip at confidence > 0.5 when using a Generate Data operator set to Random Classification.
I think I'm leaning toward the internal F1 measure optimization that H2O is doing behind the scenes for binomial labels, but we're looking into this.
Thanks Thomas. I should add that if you use the Create Threshold operator and set the threshold to 0.5, it works fine.
The operator W-Logistic works fine, as do the other classifiers in RapidMiner.
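For anyone reading along, the 0.5 workaround mentioned above boils down to re-deriving the predicted label from Confidence(True) with a fixed cutoff. Here is a minimal plain-Python sketch of that idea. It is not RapidMiner's implementation: the function name `apply_threshold` and the strict `>` comparison are assumptions made for illustration.

```python
from typing import List


def apply_threshold(confidences: List[float], threshold: float = 0.5) -> List[str]:
    """Map each Confidence(True) value to a predicted label.

    Hypothetical helper: an observation is labeled "true" only when its
    confidence is strictly greater than the threshold (an assumption).
    """
    return ["true" if c > threshold else "false" for c in confidences]


# Illustrative confidences only; not taken from the attached process.
sample_confidences = [0.20, 0.35, 0.45, 0.60]

# With the conventional 0.5 cutoff, only the last observation is "true".
default_labels = apply_threshold(sample_confidences)

# With a 0.3 cutoff, like the behavior reported in this thread,
# the observations at 0.35 and 0.45 flip to "true" as well.
low_cutoff_labels = apply_threshold(sample_confidences, threshold=0.3)
```

Comparing `default_labels` with `low_cutoff_labels` shows which observations change class purely because of the cutoff, which is what the colored histogram makes visible.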
Thomas:
A quick entry to confirm that you were right. H2O chooses the predicted class based on the maximum-F1 threshold. This is from the user guide (Generalized Linear Modeling with H2O and R), page 26.
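To illustrate why a maximum-F1 threshold can land well below 0.5, here is a small plain-Python sketch that scans candidate cutoffs and keeps the one with the highest F1. This is not H2O's code: the function names, the toy data, and the use of `>=` at the cutoff are all assumptions made for illustration.

```python
from typing import List


def f1_at(threshold: float, confidences: List[float], labels: List[bool]) -> float:
    """F1 score when every confidence >= threshold is predicted True."""
    tp = sum(1 for c, y in zip(confidences, labels) if c >= threshold and y)
    fp = sum(1 for c, y in zip(confidences, labels) if c >= threshold and not y)
    fn = sum(1 for c, y in zip(confidences, labels) if c < threshold and y)
    denominator = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


def max_f1_threshold(confidences: List[float], labels: List[bool]) -> float:
    """Return the observed confidence value that maximizes F1.

    Ties are resolved in favor of the lowest candidate threshold.
    """
    candidates = sorted(set(confidences))
    return max(candidates, key=lambda t: f1_at(t, confidences, labels))


# Toy data, invented for this example.
toy_confidences = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.8, 0.9]
toy_labels = [False, False, True, False, True, True, True, True]

best_threshold = max_f1_threshold(toy_confidences, toy_labels)
f1_at_best = f1_at(best_threshold, toy_confidences, toy_labels)
f1_at_half = f1_at(0.5, toy_confidences, toy_labels)
```

On this toy data the best cutoff is 0.3 (F1 = 10/11), while a 0.5 cutoff gives F1 = 0.75, because lowering the cutoff recovers two true positives at the cost of one false positive. That is the same kind of shift reported at the top of the thread.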