Relative Contribution of Variables
I want to measure the relative contribution of each input variable to the predictive power (accuracy) of a model, for any classification or regression model. Some commercial tools, such as SPSS Modeler, do this automatically with a so-called leave-one-out process. In each iteration, one input variable is left out, the model is built on the remaining variables and tested on a holdout sample (or via X-Validation), and the accuracy is recorded (e.g., variable left out = A, accuracy = 82%). This is repeated for each input variable. At the end you have a list of accuracies, one for each variable's absence from the model. The lower the accuracy, the higher the contribution/importance of the variable that was left out. These accuracies can then be converted (inverted) into relative importance measures, optionally normalized, and shown in a horizontal bar chart illustrating the relative contribution of all variables.
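To make the procedure concrete, here is a minimal sketch in plain Python, outside RapidMiner. Everything in it is an assumption made for illustration: the toy data (one informative column called `signal`, one uninformative column called `noise`), the 1-nearest-neighbour stand-in for the model, the function names, and the conversion rule (accuracy drop relative to the all-variables baseline, clipped at zero and normalized to sum to 1). It is not how SPSS Modeler or RapidMiner implement this.

```python
# Leave-one-variable-out importance: a sketch on made-up toy data.

def loo_cv_accuracy(rows, labels, use_cols):
    """Leave-one-row-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the columns in use_cols (stand-in for any model)."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never use the test row as its own neighbour
            dist = sum((row[c] - other[c]) ** 2 for c in use_cols)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def leave_one_variable_out(names, evaluate):
    """Accuracy of the model with each variable left out in turn."""
    return {
        left_out: evaluate([n for n in names if n != left_out])
        for left_out in names
    }


def relative_importance(baseline, acc_without):
    """Assumed conversion: accuracy drop vs. baseline, clipped at 0,
    then normalized so the importances sum to 1."""
    drops = {n: max(0.0, baseline - a) for n, a in acc_without.items()}
    total = sum(drops.values())
    if total == 0:
        return {n: 0.0 for n in drops}
    return {n: d / total for n, d in drops.items()}


# Hypothetical toy data: "signal" separates the classes, "noise" does not.
signal = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
noise = [0.00, 0.10, 0.20, 0.30, 0.01, 0.11, 0.21, 0.31]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
rows = [{"signal": s, "noise": n} for s, n in zip(signal, noise)]
names = ["signal", "noise"]


def evaluate(cols):
    return loo_cv_accuracy(rows, labels, cols)


baseline = evaluate(names)                      # all variables present
acc_without = leave_one_variable_out(names, evaluate)
importance = relative_importance(baseline, acc_without)

for name in sorted(importance, key=importance.get, reverse=True):
    print(f"{name}: accuracy without = {acc_without[name]:.2f}, "
          f"relative importance = {importance[name]:.2f}")
```

The `evaluate` callable is the only model-specific part, so a decision tree with cross-validation could be swapped in without touching the loop. The resulting `importance` dictionary is what a horizontal bar chart would display.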
I tried to do this in RapidMiner 7.0 with the Loop Attributes operator. It did not work! I could not set it up properly because I am not very familiar with RapidMiner constructs such as loop operators, and the short operator descriptions were not enough for me to understand how to use them for this process.
Can anyone create a simple process for the variable contribution procedure I described, using a small data set like Golf with Decision Tree and X-Validation, and post it here so that we can all learn and benefit from it?
Thank you.