
prediction modeling for text analysis

lambamanika07 Member Posts: 24 Maven
edited November 2018 in Help

I am trying to build a prediction model for text resources. I chose 272 resources for training and 116 for testing. However, only 190 of the training resources and 80 of the test resources were modeled, and the accuracy, precision and recall values were reported only for those. I want to get these results for all of the data. Please help.

Best Answers

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @lambamanika07

     

    I don't fully understand what you want to do or what you have done so far.

    Are your training and test datasets both labeled?

    Based on the information given, I suggest you perform a cross validation on your 272 training resources to build a model. This gives you the performance (accuracy, recall, precision) of your model on those 272 training resources.

    Then apply this model to your 116 (labeled?) test resources with a Performance operator, so you can measure the performance of the built model on "unseen" data. The process looks like this:

    text_training_test_data.png

     

    or

    you can perform a cross validation on all 388 resources (272 training + 116 test) to build a better model. This gives you the performance (accuracy, recall, precision) of your model on the 388 "training" resources, and you can then apply this model to future unseen data. The process looks like this:

    text_training_test_data_2.png

    For a better response, could you share your process and your data source, please?

     

    Regards,

     

    Lionel

     

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @lambamanika07 again,

    To complete my response, the cross validation sub-process looks like this:

    text_training_test_data_3.png

     

    Regards,

     

    Lionel
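
For readers without the screenshots, the first workflow suggested above (cross-validate on the training set, then score the labeled test set) can be roughly sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: the tiny Naive Bayes classifier, the toy documents, and every function name below are made up for the example and are not taken from the original process. The point it shows is that in a k-fold cross validation every training example lands in exactly one test fold, so every example should receive a prediction.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the most probable label, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):  # sorted so ties break deterministically
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(true, pred, positive):
    """Accuracy, plus precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(true, pred))
    fp = sum(t != positive and p == positive for t, p in zip(true, pred))
    fn = sum(t == positive and p != positive for t, p in zip(true, pred))
    accuracy = sum(t == p for t, p in zip(true, pred)) / len(true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def cross_validate(docs, labels, k, positive):
    """k-fold cross validation; each example is predicted exactly once."""
    preds = [None] * len(docs)
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        train_docs = [d for i, d in enumerate(docs) if i not in test_idx]
        train_labels = [l for i, l in enumerate(labels) if i not in test_idx]
        model = train_nb(train_docs, train_labels)
        for i in test_idx:
            preds[i] = predict_nb(model, docs[i])
    return evaluate(labels, preds, positive), preds

# Toy stand-ins for the 272 training and 116 test resources (invented data).
train_docs = ["good great film", "great acting good plot", "bad boring film",
              "boring plot bad acting", "good fun", "bad awful"]
train_labels = ["pos", "pos", "neg", "neg", "pos", "neg"]
test_docs = ["great fun film", "awful boring acting"]
test_labels = ["pos", "neg"]

# Step 1: estimate performance with cross validation on the training set.
cv_metrics, cv_preds = cross_validate(train_docs, train_labels, k=3, positive="pos")

# Step 2: train on all training data and score the held-out test set.
final_model = train_nb(train_docs, train_labels)
holdout_preds = [predict_nb(final_model, d) for d in test_docs]
holdout_metrics = evaluate(test_labels, holdout_preds, positive="pos")

print("CV (accuracy, precision, recall):", cv_metrics)
print("Holdout (accuracy, precision, recall):", holdout_metrics)
```

If the number of predictions is smaller than the number of input examples, as in the original question, the examples are most likely being dropped before they reach the validation step, so comparing `len(cv_preds)` with `len(train_docs)` is a quick sanity check.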

     

     

     

     

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Are you using Cross Validation? Post your process using the < / > option.

  • lambamanika07 Member Posts: 24 Maven
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @lambamanika07 I would not build a text classification model as you've shown. I would do it the way @lionelderkrikor shows. Also, if the LinearSVM doesn't give good results, I would try Naive Bayes and/or Deep Learning. You could even use a Stacking or Voting operator.
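
To make the voting idea concrete, here is a minimal sketch of majority voting in plain Python. It is not the implementation of RapidMiner's Voting operator; the function name, the tie-breaking rule, and the three prediction lists are all hypothetical and only illustrate how the outputs of several classifiers could be combined into one label per example.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Assumption: ties are broken by taking the alphabetically first label.
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined

# Invented predictions from three models on the same four documents.
svm_preds = ["pos", "neg", "pos", "neg"]
nb_preds = ["pos", "pos", "pos", "neg"]
dl_preds = ["neg", "neg", "pos", "pos"]

ensemble_preds = majority_vote([svm_preds, nb_preds, dl_preds])
print(ensemble_preds)
```

The ensemble prediction for each document is the label that most of the individual models agree on, which can smooth out the mistakes of any single model.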
