dummy coded variables

Lara Member Posts: 5 Contributor II
edited November 2018 in Help
Dear Data Mining and RapidMiner Experts,

I would like to analyse my dataset, which contains categorical (polynominal) predictor variables, using Logistic Regression and SVM.
So far I have used other data mining/statistics software that transformed my categorical predictor variables automatically by dummy coding: one group serves as the reference group, and an attribute with k groups yields k-1 new binominal dummy-coded variables.
How can I do this in RapidMiner?
If I manually transform an attribute into k-1 binominal variables, how will the Logistic Regression or SVM operator know that these are my dummy-coded variables? Or do I just have to create k new binominal attributes for modelling?

Thank you very much.

Lara

Answers

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi Lara,

    You are right: RapidMiner does not transform the data automatically; the user has to define which data should be transformed in which way. The reason is that we believe the user should be aware of what is happening, instead of the software silently performing some preprocessing that might introduce a lot of bias. So we go for the "manual" way and combine it with assistants like the new quick fixes introduced in RapidMiner 5, which support the user with standard tasks.


    You describe a standard preprocessing subprocess that takes nominal (categorical) attributes and introduces binominal dummy attributes, which are then transformed to numerical attributes that can be used by learning schemes like SVM or Logistic Regression. I have uploaded this process with our new Community Extension (available from our Update and Installation Server in the Help menu of RapidMiner). After installing this extension, you can download and apply the process "Convert Nominal to Binominal to Numerical" (website: http://www.myexperiment.org/workflows/1275) with a few clicks.

    Cheers,
    Ingo
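
For readers who want to see what the k-1 dummy coding discussed above produces, here is a minimal sketch in plain Python. It is not the RapidMiner process linked in the answer; the function name `dummy_code`, its parameters, and the `colour` example data are hypothetical and chosen only for illustration. The sketch also suggests why the learner does not need to "know" which columns are dummies: once coded, they are ordinary numeric 0/1 attributes.

```python
def dummy_code(values, reference=None, prefix="attr"):
    """Return k-1 numeric (0/1) dummy columns for one nominal attribute.

    The reference level gets no column of its own; rows that belong to
    the reference level are encoded as zeros in every dummy column.
    """
    levels = sorted(set(values))
    if reference is None:
        # Assumption: default to the alphabetically first level.
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"unknown reference level: {reference!r}")

    kept = [level for level in levels if level != reference]
    return {
        f"{prefix} = {level}": [1 if v == level else 0 for v in values]
        for level in kept
    }


# Hypothetical nominal attribute with k = 3 levels.
colour = ["red", "green", "blue", "green"]
dummies = dummy_code(colour, reference="red", prefix="colour")

for name, column in dummies.items():
    print(name, column)
# colour = blue [0, 0, 1, 0]
# colour = green [0, 1, 0, 1]
```

With "red" as the reference group, the first row is all zeros in both columns. Creating all k columns instead would make them sum to 1 in every row, which is redundant alongside an intercept term in models such as logistic regression.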