Hello, I have a RapidMiner homework assignment. Can anybody help me?
You will use the following process:
1. Based on the training dataset, create a training sample and a validation sample by splitting the data into two groups. Steps 2-5 below will then be performed on the training and the validation data.
2. Set up the dependent variable.
3. Make a preliminary assessment of the relative importance of the explanatory variables using visualization tools and simple descriptive statistics.
4. Estimate the classification model using the training data, and interpret the results.
5. Assess the accuracy of classification with the validation sample, possibly repeating steps 2-5 a few times, changing the classifier in different ways to increase performance.
6. Finally, score each observation of the scoring dataset and determine the list of applicants with a good credit risk (probability equal to or higher than 0.80) that the marketing department will be able to contact.
Answers
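Not a full solution, but the logic of steps 1, 5 and 6 can be sketched outside RapidMiner in plain Python to make it clear what each stage has to produce. Everything here is an assumption for illustration: the function names, the 70/30 split, the seed, and the sample applicant ids and probabilities are all made up, and the classifier itself (step 4) is left out. In RapidMiner the same stages would be built from operators rather than code.

```python
import random


def split_train_validation(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it into (training, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(actual, predicted):
    """Share of rows whose predicted label matches the actual label."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


def good_risk_applicants(scored, threshold=0.80):
    """Keep applicant ids whose probability of good credit is >= threshold."""
    return [applicant_id for applicant_id, p_good in scored if p_good >= threshold]


# Hypothetical scored output: (applicant id, predicted probability of "good").
scored = [("A1", 0.91), ("A2", 0.80), ("A3", 0.79), ("A4", 0.35)]
contact_list = good_risk_applicants(scored)
print(contact_list)  # ['A1', 'A2']
```

Note that the cutoff is inclusive, so an applicant scored at exactly 0.80 stays on the contact list, which matches "equal to or higher than 0.80" in the assignment.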