Filter Stopwords with Regular Expression
Hi guys,
I'm currently doing a sentiment analysis in RapidMiner with k-NN. I want to count the number of words left in each document after the stopwords have been removed. The "Filter Stopwords" operator inside the "Process Documents from Data" operator only works if I tokenize the data and use the "Nominal to Text" operator first. The problem is that the output then looks like the image below. Since I want to count the words that remain after stopword removal, I wonder whether there is a regular expression I could use inside a "Replace" operator (or similar) to remove only the stopwords without tokenizing the text.
Cheers!
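To make the idea concrete, here is a minimal Python sketch of what I have in mind: one regular expression built as an alternation of stopwords wrapped in word boundaries, followed by a plain whitespace word count. The stopword list, the function names and the sample sentence are made up for illustration. I am assuming the same kind of pattern, e.g. (?i)\b(?:the|a|and)\b, would also work in the "Replace" operator, since RapidMiner uses Java regular expressions, but I have not verified that.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real list would be much longer.
STOPWORDS = ["the", "a", "an", "and", "is", "of", "to", "in"]

# \b word boundaries stop "a" from matching inside words such as "analysis".
# re.escape guards against stopwords that contain regex metacharacters.
STOPWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in STOPWORDS) + r")\b",
    re.IGNORECASE,
)


def remove_stopwords(text):
    """Replace every stopword with a space, then collapse repeated whitespace."""
    stripped = STOPWORD_PATTERN.sub(" ", text)
    return " ".join(stripped.split())


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


review = "The plot of the movie is a mess and the acting is wooden"
cleaned = remove_stopwords(review)
print(cleaned)               # plot movie mess acting wooden
print(count_words(cleaned))  # 5
```

The remaining text stays a single string, so the word count can be taken directly from it without a tokenization step.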
Answers