Text Mining Problem - Keyword search and customized tokenization
MasseAlarm
Dear RapidMiner Community,
For a university project I have to evaluate about 900 business reports, and I want to do this in RapidMiner. Unfortunately, I'm still a complete beginner with the software and need your help.
I have installed the Text Processing extension for RapidMiner.
The problem:
I need to search the reports for 120 specified keywords. Whenever one of these keywords occurs, I must also extract the 20 words before and the 20 words after it in order to understand the context.
My current state:
With "Tokenize" I get sentence-level output, but how can I extract exactly 20 words before and after the keyword?
With "Filter Tokens (by Content)" I can only get one of the 120 words displayed at a time. How do I make sure that all 120 words are taken into account together?
I've been stuck on this for quite a while now and have searched through all kinds of forum entries without finding a suitable solution. I hope you can help me. Thanks a lot!
Best regards
Answers
https://academy.rapidminer.com/learning-paths/get-started-with-rapidminer-and-machine-learning
https://academy.rapidminer.com/courses/text-and-web-mining-with-rapidminer
Scott
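For reference, the task in the question (find each keyword and keep 20 words on either side) is a keyword-in-context extraction. The sketch below shows the idea in plain Python rather than with RapidMiner operators, so it is an illustration of the logic, not the Text Processing extension's method. The function name `keyword_contexts`, the word-based tokenization via `\w+`, and the sample text are all assumptions made for the example; it also handles single-word keywords only.

```python
import re


def keyword_contexts(text, keywords, window=20):
    """Return (keyword, context) pairs for every keyword hit in text.

    Hypothetical helper: tokenizes on runs of word characters, matches
    case-insensitively, and only supports single-word keywords.
    """
    tokens = re.findall(r"\w+", text)
    wanted = {k.lower() for k in keywords}  # all keywords checked in one pass
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in wanted:
            start = max(0, i - window)  # do not run past the document start
            # Slicing past the end is safe; Python truncates the slice.
            context = tokens[start:i + window + 1]
            hits.append((tok.lower(), " ".join(context)))
    return hits


# Made-up sample text; window=2 keeps the demonstration short.
sample = "Revenue grew while risk exposure fell and liquidity risk stayed low"
results = keyword_contexts(sample, ["risk", "liquidity"], window=2)
for keyword, context in results:
    print(keyword, "->", context)
```

With `window=20` and the full list of 120 keywords, the same loop would be run once per report; overlapping contexts are returned separately, one per hit.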