
Recombine predicted records with original dataset

Ndukuzempi Member Posts: 1 Learner II
edited November 2018 in Help

How do I recombine the predicted data with the original data? I want to do this so that I can verify the misclassifications. I extracted new features from the original features, and now I need to refer back to the original features, which were filtered out during training.

 


Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Your question is a bit unclear; if you post your process, it may be easier to diagnose what you are doing.

    If you are using cross-validation to build your model, you should already have the combined dataset you are describing available from the "test" output. It contains your predictions, the confidences, and the original labels, plus all the attributes used for scoring. You just need to connect it from the inner process on the "Testing" side so that it is delivered as an output.

    If you split your dataset earlier (for whatever reason) and you need to recombine it, you can do that with the "Join" operator, as long as you have an index attribute.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • kypexin RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    I often face the same issue, and this process should give you a general idea of how it can be achieved:

     

    Screenshot 2017-11-06 17.43.42.png

     

    Just create an ID (if there was no ID already) and multiply the initial dataset before doing any modelling work. At the end, do an inner join (using the ID as the key) of the initial dataset and the labeled dataset that comes out of the validated and tested model. This way you get all the old features back in the new labeled dataset.
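
For readers who want to see the logic outside of RapidMiner, the ID-then-inner-join idea described above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names (`text`, `length`, `label`, `prediction`), the sample rows, and the helper functions `add_id` and `inner_join` are all hypothetical, and the "labeled" rows stand in for whatever the model output would actually contain.

```python
# Hypothetical original dataset; attribute names are assumptions for illustration.
original = [
    {"text": "great product", "length": 13, "label": "pos"},
    {"text": "not what I expected", "length": 19, "label": "neg"},
    {"text": "broke after a day", "length": 17, "label": "neg"},
]


def add_id(rows):
    """Attach a sequential 'id' to every row before any modelling is done."""
    return [{"id": i, **row} for i, row in enumerate(rows, start=1)]


def inner_join(left, right, key="id"):
    """Keep only rows whose key appears on both sides; left values win on name clashes."""
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            extra = {k: v for k, v in match.items() if k not in row}
            joined.append({**row, **extra})
    return joined


with_id = add_id(original)

# Simulated model output: the original features were filtered out,
# so only the id, the true label, and the prediction remain.
labeled = [
    {"id": 1, "label": "pos", "prediction": "pos"},
    {"id": 2, "label": "neg", "prediction": "pos"},
    {"id": 3, "label": "neg", "prediction": "neg"},
]

# Recombine on the id, then pick out the rows the model got wrong.
recombined = inner_join(with_id, labeled)
misclassified = [r for r in recombined if r["label"] != r["prediction"]]

for row in misclassified:
    print(row)
```

The important design point is that the ID is generated before the data is copied and transformed, so the same key exists on both branches and the join does not depend on row order.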
