K-Means cluster validation performance at zero percent??? What am I doing wrong?
I was finally able to produce a confusion/classification matrix for evaluating my model. However, it shows zero percent predictive accuracy and I cannot figure out why.
My process:
Cleaned data
Chose variables based on correlation to target variable (bankruptcy)
Normalized all variables
Dropped in X-Validation Operator
- Contains a K-Means (k=2) cluster model in the training section
- Contains Apply Model and Performance (Classification) operators in the testing section, with "accuracy" as the main criterion
The variable Bankruptcy is marked as a nominal Prediction.
Accuracy is zero!!!
This assignment is due tonight and so far I am not able to evaluate my model performance.
I have attached my data and process.
Answers
A few things: you are attempting to cluster, which is an unsupervised method. Cross Validation is used for supervised training, where you need a training label.
So I'm not sure what you want to do? Cluster or do supervised training?
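To illustrate one way the accuracy can come out as exactly zero here, below is a minimal Python sketch. It assumes the clustering assigns ids such as "cluster_0" and "cluster_1" while the Bankruptcy column holds different values; the ids, labels, and data are made up for illustration and are not taken from the attached process.

```python
# Toy data (hypothetical): true class labels and the ids a k=2 clustering
# might assign. The clusters separate the two classes perfectly.
true_labels = ["yes", "no", "no", "yes", "no", "yes"]
cluster_ids = ["cluster_1", "cluster_0", "cluster_0",
               "cluster_1", "cluster_0", "cluster_1"]


def accuracy(predicted, actual):
    """Fraction of positions where the predicted value equals the actual one."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# A cluster id never equals a class label, so every comparison fails.
naive_accuracy = accuracy(cluster_ids, true_labels)
print(naive_accuracy)  # 0.0
```

Even though the grouping is perfect, the accuracy is 0.0 because a cluster id is a name for a group, not a prediction of the label.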
If you want to do supervised training with a Cross Validation operator, you shouldn't use a clustering algorithm. Since the values in Bankruptcy are numerical, you could use a Linear Regression algorithm to evaluate. You will need a Set Role operator to set the Bankruptcy column as the label.
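As a rough picture of what the supervised route looks like outside RapidMiner, here is a small Python sketch of k-fold cross validation around a one-feature least-squares fit. The data, the function names, and the 0.5 cutoff used to turn the regression output into a 0/1 prediction are all assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=3, seed=0):
    """Mean accuracy over k folds; predictions are thresholded at 0.5."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        # Train only on rows outside the current fold; the label is required here.
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        correct = sum(
            (1 if slope * xs[i] + intercept >= 0.5 else 0) == ys[i]
            for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return sum(scores) / k


# Hypothetical normalized feature and a 0/1 bankruptcy label.
feature = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30,
           0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
label = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
mean_accuracy = cross_validate(feature, label)
print(mean_accuracy)
```

The key point is that the learner sees the label during training, which is what Set Role provides in the process.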
If you just want to run a cluster analysis, you don't need a Set Role operator and you can run it without the label.
Also, the Samples directory has a process in which a cluster model is converted to a classification model. Maybe that's what you're trying to do?
Check out //Samples/07_Clustering/06_ClusterClassificationWithEvaluation.
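The general idea of turning clusters into class predictions can be sketched in Python as a majority vote: each cluster is assigned the label that is most common among its members. This is only an illustration with made-up data and a hypothetical helper name, and it is not necessarily how the sample process does it internally.

```python
from collections import Counter, defaultdict


def map_clusters_to_classes(cluster_ids, true_labels):
    """Assign each cluster the most frequent true label among its members."""
    votes = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        votes[cluster][label] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in votes.items()}


# Hypothetical cluster assignments and known labels for six rows.
cluster_ids = ["cluster_0", "cluster_0", "cluster_1",
               "cluster_1", "cluster_1", "cluster_0"]
true_labels = ["no", "no", "yes", "yes", "no", "no"]

mapping = map_clusters_to_classes(cluster_ids, true_labels)
predicted = [mapping[c] for c in cluster_ids]
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(true_labels)

print(mapping)   # {'cluster_0': 'no', 'cluster_1': 'yes'}
print(accuracy)  # 5 of 6 rows match
```

For an honest estimate, the mapping would be learned on training rows and the accuracy measured on rows that were held out.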