Reverse map a nominal to numerical transform

labbronx · June 2017

I am using K-means to cluster my data. To do so, I transformed my nominal values into numerical ones using the Nominal to Numerical operator, with the coding type parameter set to "unique integers." How do I reverse this transformation so that the output shows what these values were in the clusters before they were transformed? For example, if "sandwich" gets mapped to 0, I would like to map 0 back to "sandwich".

FBT · June 2017

It may not be the most elegant solution, but what you could do is the following:

Multiply your example set prior to the type conversion. Connect the first output of the Multiply operator to your current process, then add a Join operator and connect the resulting example set to its left port. Connect the second output of Multiply to the right port of the Join.

You will need an id attribute on which to make the join, and you may want to do some pre-processing (renaming attributes, etc.).
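For readers who think better in code, here is a rough Python sketch of the same idea outside RapidMiner: keep an untouched copy of the nominal values keyed by id, encode the working copy, and join the copy back on the id at the end. The attribute names (`food`, `price`), the sample rows, and the cluster labels are made up for illustration, and the first-appearance integer coding is an assumption, not necessarily how Nominal to Numerical assigns its integers.

```python
# Example rows with an id attribute (hypothetical data).
rows = [
    {"id": 1, "food": "sandwich", "price": 5.0},
    {"id": 2, "food": "bread", "price": 2.0},
    {"id": 3, "food": "sandwich", "price": 5.5},
]

# "Multiply": keep a copy of the original nominal values, keyed by id.
original_by_id = {row["id"]: row["food"] for row in rows}

# "Unique integers" coding, assumed here to follow order of first appearance.
codes = {}
encoded = []
for row in rows:
    code = codes.setdefault(row["food"], len(codes))
    encoded.append({"id": row["id"], "food": code, "price": row["price"]})

# Stand-in for the clustering step: these labels are invented, not computed.
hypothetical_labels = {1: "cluster_0", 2: "cluster_1", 3: "cluster_0"}

# "Join" on id: attach the cluster label and the original nominal value.
joined = [
    {
        **row,
        "cluster": hypothetical_labels[row["id"]],
        "food_original": original_by_id[row["id"]],
    }
    for row in encoded
]

for row in joined:
    print(row)
```

The join only works if the id survives every step between the split and the join, which is why the id has to exist before the copy is made.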

Thomas_Ott · June 2017

That's how I usually handled it.

labbronx · June 2017

Thanks, that works. I would never have thought of it.

Telcontar120 · June 2017

Be very careful with "unique integers" mapping if your nominal categories are not inherently ordinal. For example, if you have sandwich, bread, and butter mapped as 1, 2, and 3, then k-means thinks that the distance between 1 and 3 is larger than the distance between 1 and 2 or 2 and 3. But for non-ordered categories, this doesn't make any sense and can lead to strange and distorted results when clustering. If your nominal categories are not ordered, you are better off with numerical dummy coding or simply using mixed Euclidean distance (which assumes a distance of 1 between all nominal values that are not the same, precisely to avoid this problem).
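To make the distortion concrete, here is a small Python sketch comparing the two notions of distance. The function names are hypothetical, and `mixed_euclidean` is only a simplified reading of the description above (squared difference for numerical attributes, 1 for nominal values that differ); it is not RapidMiner's actual implementation.

```python
import math

# Integer codes from the example above.
codes = {"sandwich": 1, "bread": 2, "butter": 3}


def integer_coded_distance(a, b):
    """Distance k-means effectively sees between two integer-coded categories."""
    return abs(codes[a] - codes[b])


def mixed_euclidean(x, y):
    """Simplified mixed distance: nominal mismatches contribute 1,
    numerical attributes contribute their squared difference."""
    total = 0.0
    for a, b in zip(x, y):
        if isinstance(a, str) or isinstance(b, str):
            total += 0.0 if a == b else 1.0
        else:
            total += (a - b) ** 2
    return math.sqrt(total)


# Integer coding makes sandwich/butter look twice as far apart as sandwich/bread.
print(integer_coded_distance("sandwich", "bread"))
print(integer_coded_distance("sandwich", "butter"))

# The mixed measure treats every pair of different categories the same.
print(mixed_euclidean(("sandwich",), ("bread",)))
print(mixed_euclidean(("sandwich",), ("butter",)))
```

With the mixed measure the ordering of the codes no longer matters, so the nominal attribute can stay nominal and no reverse mapping is needed afterwards.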

labbronx · June 2017

Thanks. I originally used dummy coding, but it blows up the record, as I have lots of unordered nominal values. I will try using mixed Euclidean distance. How does one use this?

Thomas_Ott · June 2017

You could use effect coding too, assuming you don't have too many nominal values per attribute.

labbronx · June 2017

Never mind, I figured out how to use mixed Euclidean distance.

laavila · December 2018

I have this problem too. I tried the proposed solution with the Multiply operator, but the final result is just the example set with the unique integer values (and I can't interpret the data well with these values in it). I even generated an id attribute before the Multiply operator and used the Join operator after the rest of the process, but I couldn't get the nominal values back. Does anyone have an idea what I am doing wrong?

Thanks!

sgenzer · December 2018

Hi @laavila, sorry, this is an old thread. Can you please post your process XML so we can see what you're doing? Scott

jm_echeverria40 · May 2020

Hello all,

Is there a currently accepted solution in the latest version of the program?
How can we do this in 2020?
Does the methodology mentioned above still work?

If possible please provide the diagram!
