Applying prediction model to numerical values
akselerator
Member Posts: 3 Learner I
Hi Rapid Miner Community
New here, so this is my first question. Hope to take part in this awesome community!
I'm trying to dig a bit further into predicting and understanding the causes of cost escalation in my job. My problem is a bit in line with the Titanic prediction exercise.
Now to the problem:
I have a data set containing categorized cost overruns in the transport portfolio (think transporting huge vessels) of my company and relevant variables that could explain why these overruns happen (POD/POL/Destination/type/size etc). The problem is that rather than being Cost overrun=Yes/No, it is a numerical value that represents the size/severity of the overrun, and I cannot comprehend how to create a prediction model that considers this. In addition, I would like to get an output that explains why the model predicts what it does so that I can make sure to eliminate these mistakes.
Thanks to anyone taking their time to help me!
Edit: I only have data for about 65 projects right now. The purpose is to build it and keep feeding it information as projects finish. Cannot go further back in time. This means that AutoModel does not work.
Kind regards
Aksel
Best Answer
MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
Hi,
you can do a few things:
- You can set it up as a regression problem and predict the amount of overrun.
- You can set it up as a classification problem and then define your own performance metric, such as average \sum OverRunCostsCaptured.
- You can use the costs as a weight in your analysis.
Possibly even more things.
Cheers,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
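For anyone who wants to see the second and third suggestions outside of RapidMiner, here is a minimal Python sketch of one possible reading of the "average \sum OverRunCostsCaptured" metric. It assumes a classifier has already flagged each project as overrun or not. The `Project` class, the function names and the numbers are all made up for illustration, and none of this is a RapidMiner API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Project:
    # Hypothetical record: one finished project in a validation fold.
    overrun_cost: float  # actual overrun amount; 0.0 means no overrun
    flagged: bool        # classifier output: True = predicted overrun


def overrun_cost_captured(projects):
    """Sum of actual overrun costs on the projects the classifier flagged."""
    return sum(p.overrun_cost for p in projects if p.flagged)


def share_of_cost_captured(projects):
    """Captured overrun cost as a fraction of all overrun cost in the fold."""
    total = sum(p.overrun_cost for p in projects)
    return overrun_cost_captured(projects) / total if total else 0.0


def average_cost_captured(folds):
    """Average of the per-fold captured sums, e.g. over cross-validation folds."""
    return mean(overrun_cost_captured(fold) for fold in folds)


# Made-up validation folds, purely for illustration.
fold_1 = [Project(120.0, True), Project(0.0, False), Project(30.0, False)]
fold_2 = [Project(50.0, True), Project(200.0, True), Project(0.0, True)]

print(overrun_cost_captured(fold_1))
print(share_of_cost_captured(fold_1))
print(average_cost_captured([fold_1, fold_2]))
```

The share version is effectively a recall in which every project is weighted by its overrun cost, which is one way the "costs as a weight" idea can show up in the evaluation. With only about 65 projects, averaging over folds as in `average_cost_captured` is probably more trustworthy than a single train/test split.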