How to improve our accuracy?
Hi guys,
For an assignment we are trying to predict churn for a mobile phone company.
Attached you can find the dataset and our current processes. We have tried a lot of things, but accuracy seems to be stuck around 70%. Is there anything we are missing, or can we improve accuracy in any way?
Besides that, do you have any more tips for predictive models? For example, would clustering make sense in order to better explain the model?
Cheers!
Best Answers
varunm1 Member Posts: 1,207 Unicorn
Hello @szwanen,
"can we improve accuracy in any way?"
I see you didn't use "optimize hyperparameters" to search for the best combination of hyperparameters for your data. As you are using the GBT model, try optimizing the number of trees and the learning rate.
"any more tips for predictive models?"
I am not sure whether you tried other models. You could also try simpler models such as decision trees, or more complex ones such as neural nets, using the optimize parameters option.
"would clustering make sense in order to better explain the model?"
GBT is complex, and most explanations are based on features. You can see the global feature importance provided by GBT, and you can also use "Explain Predictions" to explain local (per-prediction) and global feature importances. Clustering is also a good way to understand the data.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
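As a rough illustration of the tuning Varun suggests, the sketch below runs a grid search over the number of trees and the learning rate in plain Python. It is not RapidMiner's parameter optimization or its GBT implementation: the stump-based booster, the synthetic two-feature data, the grid values, and all names (`fit_gbt`, `grid_search`, `make_data`, and so on) are assumptions made up for this example.

```python
import itertools
import math
import random

THRESHOLDS = [k / 10 for k in range(1, 10)]  # candidate split points; assumes features lie in [0, 1]


def fit_stump(X, residuals):
    """Return the (feature, threshold, left_value, right_value) split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in THRESHOLDS:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]


def fit_gbt(X, y, n_trees, learning_rate):
    """Boost decision stumps on the log-loss gradient; returns (learning_rate, stumps)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log-loss: label minus predicted probability.
        residuals = [yi - 1 / (1 + math.exp(-s)) for yi, s in zip(y, scores)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        scores = [s + learning_rate * (lm if row[j] <= t else rm) for s, row in zip(scores, X)]
    return learning_rate, stumps


def predict(model, row):
    """Predict class 1 when the summed, shrunken stump outputs are positive."""
    learning_rate, stumps = model
    score = sum(learning_rate * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
    return 1 if score > 0 else 0


def accuracy(model, X, y):
    return sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)


def grid_search(X_tr, y_tr, X_va, y_va, grid):
    """Train one model per parameter combination and score it on the validation set."""
    results = []
    for n_trees, lr in itertools.product(grid["n_trees"], grid["learning_rate"]):
        model = fit_gbt(X_tr, y_tr, n_trees, lr)
        results.append((accuracy(model, X_va, y_va), n_trees, lr))
    return max(results), results


def make_data(n, rng):
    """Synthetic stand-in for the churn data: label is x0 + x1 > 1, flipped 10% of the time."""
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [int((a + b > 1) != (rng.random() < 0.1)) for a, b in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_valid, y_valid = make_data(100, rng)
grid = {"n_trees": [5, 20, 50], "learning_rate": [0.01, 0.1, 1.0]}
best, results = grid_search(X_train, y_train, X_valid, y_valid, grid)
print("best (accuracy, n_trees, learning_rate):", best)
```

Because the winning combination is picked on the validation set, its accuracy is optimistic. A separate test set or cross-validation gives a fairer estimate.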
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hello @szwanen,
First, I encourage you to follow Varun's advice.
Second, I ran your data through AutoModel. Here are the results with both Feature Selection and Feature Generation enabled:
Indeed, it will be difficult to do better than 70% accuracy.
However, feature selection lets you significantly reduce the complexity of your model at an equivalent accuracy. For example, here I obtain an accuracy of 68.6% with only 7 attributes (out of 11 initial attributes).
In my case, I set the maximum processing duration to 60 min.
To do better than I did, you could rerun AutoModel on your data with the maximum processing duration set to a significantly higher value (for example, 10 hours) and let RapidMiner run overnight.
Hope this helps,
Regards,
Lionel
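To make the feature selection idea concrete, here is a minimal plain-Python sketch of greedy forward selection: start with no attributes and keep adding the one that most improves validation accuracy until nothing helps. The thread does not say which selection method AutoModel uses, so this is a stand-in technique, not AutoModel's algorithm. The nearest-centroid classifier, the synthetic six-attribute data (only the first two attributes carry signal), and the names `centroid_accuracy` and `forward_selection` are all assumptions for illustration.

```python
import random


def centroid_accuracy(X_tr, y_tr, X_va, y_va, features):
    """Validation accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, yi in zip(X_tr, y_tr) if yi == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for row, yi in zip(X_va, y_va):
        # Squared distance to each class centroid, using only the chosen features.
        dist = {label: sum((row[j] - c) ** 2 for j, c in zip(features, cen))
                for label, cen in centroids.items()}
        correct += (min(dist, key=dist.get) == yi)
    return correct / len(y_va)


def forward_selection(X_tr, y_tr, X_va, y_va, n_features):
    """Greedily add the feature that raises validation accuracy the most; stop when none does."""
    selected, best_acc = [], 0.0
    while len(selected) < n_features:
        candidates = [(centroid_accuracy(X_tr, y_tr, X_va, y_va, selected + [j]), j)
                      for j in range(n_features) if j not in selected]
        acc, j = max(candidates)
        if acc <= best_acc:
            break
        selected.append(j)
        best_acc = acc
    return selected, best_acc


def make_data(n, rng, n_features=6):
    """Synthetic stand-in data: only features 0 and 1 matter; 10% of labels are flipped."""
    X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    y = [int((row[0] + row[1] > 1) != (rng.random() < 0.1)) for row in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_valid, y_valid = make_data(100, rng)
selected, best_acc = forward_selection(X_train, y_train, X_valid, y_valid, n_features=6)
print("selected features:", selected, "validation accuracy:", best_acc)
```

The selected subset is tuned to the validation set, so its accuracy should be confirmed on data that played no part in the selection.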