Weighting and nominal attributes
Hello there, RapidMiners
A small question about the process used for weighting attributes within Auto Model.
It has a section that processes nominal attributes by dummy coding them.
The question is: why is this done specifically for the weighting process?
And how should one then interpret the weighting results?
For example, here are results from an IP traffic classification, with and without dummy coding. As one can see, the weights for binominal categories are exactly the same in both cases, but how should one interpret the specific values that appear in the first case (all false except for cat_spam = true)?
[Figures: weights with dummy coding; weights without dummy coding]
(kindly tagging @IngoRM)
Answers
Hey,
Keep in mind that these weights are Pearson's rho values. So you can't apply this method to nominals directly; you need to convert them with dummy coding first.
Cheers!
Martin
Dortmund, Germany
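To make the point above concrete, here is a minimal sketch in plain Python (not RapidMiner code) of dummy coding one nominal column and correlating each resulting 0/1 column with a binary label. The helper names `pearson` and `dummy_code` and the toy data are made up for illustration; this is not necessarily how Auto Model implements the step internally.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def dummy_code(name, values):
    """Turn one nominal column into 0/1 columns named 'name = value'."""
    return {
        f"{name} = {level}": [1 if v == level else 0 for v in values]
        for level in sorted(set(values))
    }

# Hypothetical toy data, not the IP traffic data from the thread.
cat_spam = ["true", "false", "true", "false", "false", "true"]
label = [1, 0, 1, 0, 1, 0]  # 1 stands for label = true

dummies = dummy_code("cat_spam", cat_spam)
correlations = {col: pearson(xs, label) for col, xs in dummies.items()}

for col, r in correlations.items():
    print(col, round(r, 4))
```

For a two-valued attribute the two dummy columns are exact complements of each other, so their correlations with the label have the same magnitude and opposite signs. That would be consistent with binominal weights looking the same with and without dummy coding if the weight is based on the absolute correlation.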
Aah yes, exactly.
Still, there's a question of interpretation: I struggle to explain the relation between these true/false values and the label. In my example, does 'cat_reputation = false' support or contradict 'label = true'? Or, put differently, given the rather low correlation value from the correlation matrix (0.099), is it just 'the most important predictor' among the others while still being quite weak?
Vladimir
http://whatthefraud.wtf
Hi,
I think it should support it, if I got it right. But the weights are normalized to 1, so they are all relative to the most influential factor.
Best,
Martin
Dortmund, Germany
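A rough sketch of what "normalized to 1" could mean in practice, in plain Python. It assumes the weight is the absolute correlation divided by the largest absolute correlation, which the thread does not confirm. The attribute names and all values except the 0.099 magnitude mentioned above are invented, and `normalize_weights` and `direction` are hypothetical helpers.

```python
def normalize_weights(correlations):
    """Scale absolute correlations so that the strongest one gets weight 1.0."""
    top = max(abs(r) for r in correlations.values())
    return {name: abs(r) / top for name, r in correlations.items()}

def direction(r):
    """Read the sign of a raw correlation against a label coded 1 = true."""
    if r > 0:
        return "supports label = true"
    if r < 0:
        return "contradicts label = true"
    return "no linear relation"

# Invented raw correlations; only the 0.099 magnitude comes from the thread.
correlations = {
    "attr_a = false": 0.099,
    "attr_b = true": -0.0495,
    "attr_c = true": 0.0198,
}

weights = normalize_weights(correlations)
for name, r in correlations.items():
    print(name, round(weights[name], 3), direction(r))
```

Under that assumption, a weight of 1.0 only says the attribute is the strongest among those compared; a raw correlation of about 0.1 is still a weak relation. The direction comes from the sign of the raw correlation, so it is worth checking the correlation matrix rather than the normalized weight alone.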
tagging @IngoRM if he's available...