Keeping the attribute selection after Optimize Selection
I am using the Optimize Selection operator with an RVM operator to select from 250 attributes. It works really well: it keeps just 85 attributes and produces a nice model. In the production process I am developing, I want to skip the Optimize Selection step and just select the attributes as the database is read, then run them through the model. I thought I could do this by attaching a Select Attributes operator after the selection operator and then moving it up the process. However, the Select Attributes operator is not populated with attribute names from which to select the 85 I want. I cannot run the process to force the data through, as it won't run without any selected attributes. I think maybe I am not seeing a simple solution!
Answers
Did you output the weight vector from the Optimize Selection operator? This supplies an example set with the names of all the attributes and a 0/1 weight indicating whether each one was selected. You should be able to store this in your repository and then retrieve it in some other process to select the attributes you want, without having to rerun the Optimize Selection operator.
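To make the idea concrete outside of RapidMiner, here is a minimal Python sketch of "store the weights once, reuse them later". It is only an analogy, not RapidMiner's API: the attribute names, the JSON format, and the helper functions `dump_weights` and `load_selected` are all assumptions made up for illustration.

```python
import json

# Hypothetical weights, in the shape described above:
# attribute name -> 1 (selected) or 0 (not selected).
weights = {"age": 1, "income": 0, "tenure": 1}


def dump_weights(weights):
    """Serialize the weights so they can be stored once and reused later."""
    return json.dumps(weights, sort_keys=True)


def load_selected(serialized):
    """Read stored weights back and return the names with weight 1."""
    loaded = json.loads(serialized)
    return [name for name, weight in loaded.items() if weight == 1]


stored = dump_weights(weights)    # done once, after the optimization run
selected = load_selected(stored)  # done in the production process
print(selected)
```

The point of the design is that the expensive selection step runs once, and every later process only reads the small stored table of names and weights.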
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Thanks for the reply. In the end I just wrote out the example set with the selected attributes. But this is pretty clunky, having to write to the repository to retrieve attribute names from an operator. This is something that needs upgrading in RM.
Hi Jeremy,
Are you aware of the Select by Weights operator?
~martin
Dortmund, Germany
What @Telcontar120 and @mschmitz said. You will need to output the weights (wei) port and use a Select by Weights operator to automatically select the attributes that were given a weight of 1 by the optimization process.
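For anyone who wants to see the logic spelled out, this is roughly what selecting by weight amounts to, sketched in plain Python. It is not RapidMiner code: the function `select_by_weights`, its `min_weight` parameter, and the sample data are hypothetical and only illustrate keeping the columns whose weight is 1.

```python
def select_by_weights(rows, weights, min_weight=1.0):
    """Keep only the attributes whose weight is at least min_weight."""
    keep = [name for name, weight in weights.items() if weight >= min_weight]
    return [{name: row[name] for name in keep} for row in rows]


# Hypothetical data as read from the database, before any selection.
rows = [
    {"age": 41, "income": 52000, "tenure": 3},
    {"age": 29, "income": 61000, "tenure": 7},
]

# Hypothetical 0/1 weights from the earlier optimization run.
weights = {"age": 1, "income": 0, "tenure": 1}

reduced = select_by_weights(rows, weights)
print(reduced)
```

Because the names come from the weights rather than from a hand-filled list, nothing has to be typed in when the selected set changes after a new optimization run.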