Most relevant attribute within the classification of a specific instance?

marcopo · September 2015

Hello,
Does anyone know a way to identify the most relevant attribute within the classification of a specific instance?

Of course, it is possible to see the most important attributes for the model itself. But the values of the attributes vary, and the most important attribute for the model is perhaps not the most important one for the classification of a specific instance.

Thanks a lot

Marco

MartinLiebig · September 2015

Hi marco,

To be honest, I think this is hardly possible for a generic model. It might be possible for some models, but it is very hard to judge individual relevance (however we define that) for a truly multivariate method.
If you have a rule like: If 50 < Age < 79 && Gender=="male" && TransactionValue > 100 from a decision tree - what would you assign as relevance? In the end it was the combination of conditions that produced the result.

Tough nut. Maybe you can get it from some models, but definitely not for all.

~Martin

marcopo · October 2015

Thank you Martin, you are right. But a regression model should work in most cases, as long as the regression coefficients are not too big. The attribute with the largest contribution should be the most relevant for the prediction. But how do I extract the formula and insert the values from the instance?

MartinLiebig · October 2015

Hi,

Do you mean linear regression? In that case you might be right, even though I do not know a way to do this by heart. The formula is given in the linear regression model and the coefficients are in the weight vector. So Weights to Data, Join and Generate Attributes might work?
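A minimal sketch of that idea outside RapidMiner, in plain Python: multiply each coefficient by the instance's value and take the term with the largest absolute size. The attribute names, weights and instances are made up for illustration, and the intercept is left out because it is the same for every instance.

```python
def attribute_contributions(weights, instance):
    """Return the per-attribute term weight * value for one instance."""
    return {name: weights[name] * instance[name] for name in weights}


def most_relevant_attribute(weights, instance):
    """Name of the attribute whose term has the largest absolute value."""
    contributions = attribute_contributions(weights, instance)
    return max(contributions, key=lambda name: abs(contributions[name]))


# Hypothetical coefficients of a linear regression model (not from a real model).
weights = {"Age": 0.5, "TransactionValue": 0.02, "Tenure": -1.5}

# Two hypothetical instances with different attribute values.
instance_a = {"Age": 40, "TransactionValue": 150, "Tenure": 2}
instance_b = {"Age": 20, "TransactionValue": 100, "Tenure": 10}

for label, instance in (("a", instance_a), ("b", instance_b)):
    print(label, most_relevant_attribute(weights, instance),
          attribute_contributions(weights, instance))
```

The same model ranks a different attribute first for each instance, which is the effect asked about. Note that the ranking depends on how the attributes are scaled, so normalising them first is probably sensible.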

For a general regression model this is still something hard to do.

~Martin

JEdward · October 2015

Well, a logistic regression would give you a prediction between 0 and 1 for classification problems and is quite straightforward as a formula (if you use the Weka logistic regression and not the KLR in the RapidMiner core).
And because the formula for each record is pretty simple (weight1 * att1, weight2 * att2, ...) you can turn this into a calculation for each attribute and generate the results using loops. (Other formula-generating models are possible, but once you get over 100 support vectors per attribute you get a bit bleary-eyed and error checking is difficult.)

Short version: yes, it's possible, but you'd need to break the scoring of each model down into its individual parts, and it really only works well with Weka Logistic Regression.
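As a rough illustration of breaking the score into parts, here is a small Python sketch (not a RapidMiner process). It assumes a plain logistic regression where the log-odds are intercept + sum(weight_i * att_i); the intercept, weights and records below are invented, and `IsMale` is a hypothetical 0/1 encoding of gender.

```python
import math


def score_record(intercept, weights, record):
    """Return (probability, top attribute, per-attribute log-odds terms)."""
    terms = {name: w * record[name] for name, w in weights.items()}
    log_odds = intercept + sum(terms.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    top = max(terms, key=lambda name: abs(terms[name]))
    return probability, top, terms


# Hypothetical model parameters, for illustration only.
intercept = -1.0
weights = {"Age": 0.03, "IsMale": 0.8, "TransactionValue": 0.01}

# Hypothetical records to score.
records = [
    {"Age": 60, "IsMale": 1, "TransactionValue": 50},
    {"Age": 20, "IsMale": 0, "TransactionValue": 300},
]

# Loop over the records and keep the breakdown for each one.
results = [score_record(intercept, weights, r) for r in records]
for probability, top, terms in results:
    print(round(probability, 3), top, terms)
```

The terms are compared in log-odds space, because that is where the model is additive; the probability itself cannot be split cleanly per attribute.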

I've also used Weight of Evidence transforms before to generate record scorecards, which then (when you build a logistic regression on them) let you see for each example which attribute was the most important to the model for that specific instance.
http://rapid-i.com/rapidforum/index.php/topic,9047.msg30446.html
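For readers who have not met Weight of Evidence, a minimal Python sketch of the transform for one nominal attribute follows. It assumes the common definition WoE = ln(share of positives in the category / share of negatives in the category); some texts flip the ratio, and the data is made up. It also assumes every category contains at least one positive and one negative example, otherwise the logarithm is undefined.

```python
import math
from collections import Counter


def weight_of_evidence(values, labels):
    """WoE per category, with labels given as 1 (positive) or 0 (negative)."""
    pos = Counter(v for v, y in zip(values, labels) if y == 1)
    neg = Counter(v for v, y in zip(values, labels) if y == 0)
    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    return {
        category: math.log((pos[category] / total_pos) / (neg[category] / total_neg))
        for category in set(values)
    }


# Hypothetical attribute values and class labels.
gender = ["male", "male", "male", "female", "female", "female"]
label = [1, 1, 0, 1, 0, 0]

woe = weight_of_evidence(gender, label)
print(woe)
```

Replacing each attribute value by its WoE puts all attributes on the same log-odds-like scale, so the per-attribute terms of a logistic regression trained on them are easier to compare for a single record.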
