Changing the attribute to be predicted
I've only been working with RapidMiner for a short time, so this may be a simple question to answer.
When modeling with either the Default Model or the k-NN model, I cannot seem to change which attribute is being predicted. For the model I'm attempting to create, I used "Generate Sales Data" and added a "Total Price" attribute (this is all from the example in the User Manual). When I added a model, it only predicted the transaction number, which is just an identifier that runs from 1 to 100, so there's really nothing to predict. How can I change this so that the model instead predicts a value for an attribute that makes more sense, such as the total price? Attributes like that vary within a range, so it would make more sense to predict them.
Thanks in advance for any help! I really appreciate it!
Answers
The "Set Role" operator is the one you want. Use it to set the role of the attribute you want to predict to "label"; the learning algorithm then treats that attribute as the target from which it learns a model. Be warned, though, that the "Generate Sales Data" operator doesn't, as far as I can tell, generate anything apart from random data, so any model will fail to predict anything meaningful.
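To make the idea of a "label" role concrete outside of RapidMiner, here is a minimal Python sketch. It is only an analogy, not how the "Set Role" operator is implemented: the helper `set_label` and the column name `transaction_id` are hypothetical names chosen for illustration, and the rows are random data in the spirit of "Generate Sales Data".

```python
import random

def set_label(rows, label_name, id_name=None):
    """Split rows into regular attributes and the label (target) values.

    Hypothetical helper: mimics the idea of giving one column the
    'label' role and keeping an ID column out of the inputs.
    """
    features, labels = [], []
    for row in rows:
        labels.append(row[label_name])
        features.append(
            {k: v for k, v in row.items() if k not in (label_name, id_name)}
        )
    return features, labels

random.seed(0)
rows = []
for i in range(1, 6):
    amount = random.randint(1, 10)
    single_price = round(random.uniform(10, 100), 2)
    rows.append({
        "transaction_id": i,  # assumed name for the 1..100 transaction number
        "amount": amount,
        "single_price": single_price,
        "total_price": amount * single_price,
    })

# Predict total_price instead of the transaction number.
features, labels = set_label(rows, "total_price", id_name="transaction_id")
print(features[0], labels[0])
```

Whatever column is passed as `label_name` becomes the thing a learner would try to predict; everything else (minus the ID) becomes input.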
If, however, you've created a new attribute such as "total_price" based on "amount*single_price" and you use this as the label, with "amount" and "single_price" as regular attributes, then your model will try to predict a value for "total_price" from the relationship between the input attributes. An algorithm such as a neural network should be able to work out that "total_price" is indeed "amount*single_price". This is not tremendously helpful, but it illustrates the point.
For fun I've attached a process that shows what I mean. Plot "total_price" against "prediction(total_price)" and you should get a straight line that shows the prediction is very close indeed to the correct value.
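For readers without the attached process, the same experiment can be roughly sketched in plain Python. Note the swap: this uses a small hand-written k-NN regressor rather than a neural network, and the data generation, scaling, and function names (`knn_predict`, `pearson`, `make_row`) are all assumptions made for illustration. Instead of drawing the plot, it computes the correlation between the true and predicted totals, which should be close to 1 if the points fall near a straight line.

```python
import math
import random

def knn_predict(train_x, train_y, query, k=3):
    """Average the labels of the k training points nearest to query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    return sum(train_y[i] for i in nearest) / k

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def make_row():
    """One random sale: scaled (amount, single_price) inputs and the label."""
    amount = random.randint(1, 10)
    single_price = random.uniform(10.0, 100.0)
    # Scale both inputs to roughly [0, 1] so neither dominates the distance.
    return (amount / 10.0, single_price / 100.0), amount * single_price

random.seed(42)
train = [make_row() for _ in range(500)]
test = [make_row() for _ in range(100)]
train_x = [x for x, _ in train]
train_y = [y for _, y in train]

actual = [y for _, y in test]
predicted = [knn_predict(train_x, train_y, x) for x, _ in test]
r = pearson(actual, predicted)
print(f"correlation between total_price and prediction: {r:.3f}")
```

The model is never told the formula; it only sees examples where the label happens to equal amount times single price, which is the point the process is meant to illustrate.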
You'll notice that I filter out most of the attributes; the reason is that neural networks can't handle nominal values. You'll find this to be one of the headaches as you learn the product: working out which algorithms handle which attribute types.
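The filtering step amounts to keeping only the numeric columns before training. A minimal Python sketch of that idea follows; the helper `numeric_columns` and the nominal column `product_category` with its values are hypothetical, used only to show the principle.

```python
def numeric_columns(rows):
    """Return the names of columns whose values are all int or float."""
    return [
        name
        for name in rows[0].keys()
        if all(
            isinstance(row[name], (int, float))
            and not isinstance(row[name], bool)
            for row in rows
        )
    ]

rows = [
    # product_category is an illustrative nominal attribute.
    {"product_category": "Books", "amount": 3,
     "single_price": 12.5, "total_price": 37.5},
    {"product_category": "Movies", "amount": 1,
     "single_price": 20.0, "total_price": 20.0},
]

keep = numeric_columns(rows)
filtered = [{name: row[name] for name in keep} for row in rows]
print(keep)
```

Dropping nominal columns is the simplest option; whether it is the right one depends on whether those columns carry information the model needs.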
regards,
Andrew