"How to evaluate feature weighting ?"
Hi RapidMiner team,
It's Phong from UniGE (e-LICO partner).
I am trying to compute a feature weighting (say ReliefF), use it for a dimensionality reduction (like top-k), then learn a classification model and validate it on a test set. All of this should happen inside a 10-fold cross-validation (10CV).
So I tried the standard XValidation operator, putting the feature weighting + dimensionality reduction and the classification model in the training phase, then passing everything to the testing phase to apply the model and validate it.
However, I encountered something strange: after the first fold (i.e., after the first dimensionality reduction), the second fold presents me with a training set WHICH IS ALREADY REDUCED! And it continues like this for the remaining folds... It looks like a data memory error, or am I wrong?
Then I tried the Wrapper-XValidation operator, which seems to exist precisely for feature weighting / selection, but by default, after the attribute weighting phase, when the operator builds the learner, the training set seems to be reduced automatically by removing the features that have zero weight... So how can I specify that I want to apply another rule, like top-k?
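In other words, what I want corresponds roughly to the following sketch (Python/scikit-learn rather than RapidMiner, just to make the intended evaluation concrete; mutual information stands in for ReliefF, and k=10 and the decision tree are arbitrary placeholder choices):

```python
# Sketch of the intended evaluation: weighting + top-k reduction + learner
# all sit INSIDE the cross-validation, so every fold recomputes them on its
# own training split only, and the test split stays untouched.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

pipeline = Pipeline([
    # score the features and keep only the top-k (k chosen arbitrarily here)
    ("top_k", SelectKBest(score_func=mutual_info_classif, k=10)),
    ("model", DecisionTreeClassifier(random_state=0)),
])

# 10CV: selection and learning are refit per fold on unreduced data
scores = cross_val_score(pipeline, X, y, cv=10)
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```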
I hope my questions are clear enough.
Thanks for your help.
Cheers
Phong
Answers
Thank you for this detailed description. I will look into what's happening there and write back.
Greetings,
Sebastian
I've been running into the same problem as Phong. In the old RM 4.x I helped myself by rewriting AttributeWeightSelection and cloning the example set (a dirty, memory-hogging workaround). As I haven't yet ported it to RM 5.0, I wonder whether there is an operator (or hack) that sets all but the top-k weights to zero, so that scoring functions can be used inside a Wrapper-XValidation?
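Conceptually, the rule I'm after looks like this (a NumPy sketch with made-up weight values, not RapidMiner code):

```python
# Hypothetical illustration of "set all but the top-k weights to zero".
import numpy as np

def keep_top_k(weights: np.ndarray, k: int) -> np.ndarray:
    """Return a copy of `weights` with everything outside the top-k zeroed."""
    out = np.zeros_like(weights)
    top = np.argsort(weights)[-k:]   # indices of the k largest weights
    out[top] = weights[top]
    return out

w = np.array([0.10, 0.80, 0.05, 0.55, 0.30])
print(keep_top_k(w, k=2))            # [0.   0.8  0.   0.55 0.  ]
```

Feeding weights like these into a step that removes zero-weight attributes would then behave exactly like a top-k selection.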
Greetings
Ben
Sorry for the delay in getting back to this topic; I have now taken a look into the matter. The cause of this behavior has been removed: there was simply a missing clone() call in the validation. So there should be no need for hacking, as long as you don't actually apply the weighting, which would change the underlying data. Before doing that, you must make a full copy. Unfortunately we haven't yet managed to include a view concept for the weight calculation, although this should be relatively easy. You might file a feature request for this in the bug tracker.
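To illustrate what went wrong, here is a toy sketch (plain Python with hypothetical names, not the actual RapidMiner source): without the clone, fold 1's in-place reduction mutates the shared data, so fold 2 already starts from the reduced attribute set.

```python
import copy

def reduce_in_place(example_set, keep):
    # drops all attributes outside `keep`, mutating the set it is given
    example_set["attributes"] = [a for a in example_set["attributes"] if a in keep]

# Buggy behavior: fold 1 reduces the shared example set directly
data = {"attributes": ["a", "b", "c", "d"]}
reduce_in_place(data, keep={"a", "b"})
print(data["attributes"])   # ['a', 'b']  -> fold 2 sees already-reduced data

# Fixed behavior: clone before reducing, as the added clone() call now ensures
data = {"attributes": ["a", "b", "c", "d"]}
fold_view = copy.deepcopy(data)
reduce_in_place(fold_view, keep={"a", "b"})
print(data["attributes"])   # ['a', 'b', 'c', 'd']  -> later folds unaffected
```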
Greetings,
Sebastian