
Categorization of text comments

AbilassV Member Posts: 1 Learner I
edited November 2021 in Help

Hello RapidMiner Forum,

I am working on a small project at university, and I am trying to build a machine learning model that predicts the category of a number of text lines.

I have 72 lines of text, and I have manually assigned 16 of them to one of two categories (Travelling or Cricket). (The Excel sheet I used is attached, and a screenshot of it is shown in the picture 'ScreenshoutofExcelData'.)

I am now trying to make the model predict the categories of the remaining 56 lines based on my own categorization. If that is not possible for a text line, the model should predict "unknown".
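To make the goal concrete, here is a minimal sketch of the same idea outside RapidMiner. It is not the process from the video: it swaps the SVM for a small naive Bayes classifier written in plain Python (standard library only), and the training sentences, the function names, and the `min_confidence` threshold of 0.75 are all assumptions made for illustration. The point is only to show how a confidence threshold can turn an uncertain prediction into "unknown".

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_rows):
    """Collect word counts per category from (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    for text, label in labeled_rows:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, doc_counts, vocabulary


def predict(model, text, min_confidence=0.75):
    """Return (label, confidence); the label is "unknown" below the threshold."""
    word_counts, doc_counts, vocabulary = model
    # Words never seen during training carry no evidence, so drop them.
    tokens = [t for t in tokenize(text) if t in vocabulary]
    if not tokens:
        return "unknown", 0.0

    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for t in tokens:
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][t] + 1) / (total_words + len(vocabulary))
            )
        log_scores[label] = score

    # Convert log scores into normalized probabilities.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    best = max(weights, key=weights.get)
    confidence = weights[best] / norm
    if confidence < min_confidence:
        return "unknown", confidence
    return best, confidence


# Hypothetical stand-ins for the manually categorized lines.
training_rows = [
    ("We booked flights and a hotel for the trip", "Travelling"),
    ("The airport queue was long before our flight", "Travelling"),
    ("The batsman scored a century in the match", "Cricket"),
    ("The bowler took three wickets in the final over", "Cricket"),
]
model = train(training_rows)

# Hypothetical stand-ins for the uncategorized lines.
for line in [
    "Our flight to the hotel was delayed",
    "The bowler scored in the final match",
    "Please send me the invoice",
]:
    label, confidence = predict(model, line)
    print(f"{label:<10} {confidence:.2f}  {line}")
```

With only 16 labeled lines, the threshold matters a lot: a higher `min_confidence` sends more lines to "unknown", while a lower one forces more of them into Travelling or Cricket.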

I run into a problem: the SVM (Support Vector Machine) operator gives me the error "Insufficient capability" when I enter more than one of the categories in the Filter Examples operator.

The model I used is based on a video from RapidMiner Academy named 'Applying a Model to categorize Documents'. Sorry, I am not able to post the link, but a screenshot of the web page is shown in the picture named 'ScreenshoutofVideoPage' below.

Screenshots of the model are shown in the pictures 'Model1_Part1', 'InsideSubProcess', 'InsideProcessDocuments', 'InsideTraining', and 'Model2_Part2'.


I also found an article from a website called 'Monkey Learner', which is attached as a PDF named 'What is Text Classification?'. On pages 3 to 9 it goes through six steps, which are basically what I want to do in RapidMiner. If you have any suggestions for creating such a model, please help.


Thanks for taking the time to read this and maybe even answer me. :)

Best Regards 

Abilass V.
