Combine datasets that each have a 'label' attribute and retain both 'label' attributes
Hi!
I was wondering whether RapidMiner has an operator that can combine two datasets, each with its own 'label' attribute, and retain both 'label' attributes in the new dataset. Most blending operators only retain the 'label' attribute of the first dataset. Both datasets share the same ID attribute.
My data looks like this:
ID   Qty A (label)   Prediction Qty A
A    2               3
B    4               3.5
I want to combine it with this:
ID   Qty B (label)   Prediction Qty B
B    7               6.5
A    6               6.5
I want the new dataset to look like this:
ID   Qty A (label)   Prediction Qty A   Qty B (label)   Prediction Qty B
A    2               3                  6               6.5
B    4               3.5                7               6.5
Thank you!
Answers
Hey,
Sure you can. Since roles need to be unique, set the roles of your two label attributes to something like label_1 and label_2 first, and then join.
~Martin
Dortmund, Germany
Hi Emily!
If you use the Join operator to combine the two data sets by ID, RapidMiner will remove the second label attribute, because label is a special role and a special role can only be assigned to one attribute per data set. If you want to keep the second label attribute, give it a different special role before joining: apply the Set Role operator to the second data set and assign the label a role such as "label2". The only thing to keep in mind is that label is not just any special role but a predefined one that most learning operators use, so an attribute with the role "label2" will not be treated as the target by those operators.
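To make the role clash easier to see, here is a rough sketch in plain Python. It is only a toy model of the behaviour described above, not RapidMiner's API: the functions set_role and join_on_id are made up for illustration, and the prediction columns are treated as regular attributes without a role, which is a simplifying assumption.

```python
# Toy model of "special roles must be unique", NOT RapidMiner's API.
# All function names here are hypothetical.

def set_role(roles, attribute, new_role):
    """Return a copy of the roles mapping with `attribute` given `new_role`."""
    if new_role in roles.values():
        raise ValueError(f"role {new_role!r} is already in use")
    updated = dict(roles)
    updated[attribute] = new_role
    return updated


def join_on_id(left_rows, left_roles, right_rows, right_roles, key="ID"):
    """Inner join on `key`. Right-hand attributes whose role is already
    taken on the left are dropped, mimicking the lost second label."""
    taken = set(left_roles.values())
    dropped = {a for a, r in right_roles.items() if r in taken and a != key}
    right_by_id = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_id.get(row[key])
        if match is None:
            continue  # inner join: skip IDs missing on the right
        merged = dict(row)
        merged.update({a: v for a, v in match.items()
                       if a != key and a not in dropped})
        joined.append(merged)
    roles = dict(left_roles)
    roles.update({a: r for a, r in right_roles.items()
                  if a != key and a not in dropped})
    return joined, roles


# Data from the question; only ID and the label carry a role (assumption).
set_a = [{"ID": "A", "Qty A": 2, "Prediction Qty A": 3},
         {"ID": "B", "Qty A": 4, "Prediction Qty A": 3.5}]
roles_a = {"ID": "id", "Qty A": "label"}
set_b = [{"ID": "B", "Qty B": 7, "Prediction Qty B": 6.5},
         {"ID": "A", "Qty B": 6, "Prediction Qty B": 6.5}]
roles_b = {"ID": "id", "Qty B": "label"}

# Joining directly: both sides claim the role "label", so "Qty B" is lost.
lost_rows, lost_roles = join_on_id(set_a, roles_a, set_b, roles_b)

# Changing the role on the second set first keeps both attributes.
roles_b2 = set_role(roles_b, "Qty B", "label2")
kept_rows, kept_roles = join_on_id(set_a, roles_a, set_b, roles_b2)
```

In the second join, both Qty A and Qty B survive, and only Qty A still carries the predefined role "label".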
Cheers
Jan