
Textmining Problem - Keyword search and customized tokenization

MasseAlarm Member Posts: 1 Learner I
Dear RapidMiner Community,

For a university project I have to evaluate about 900 business reports, and I want to do this with RapidMiner. Unfortunately, I am still a complete beginner with the software and need your help.
I have installed the Text Processing Extension for RapidMiner.

The problem:
I need to search the reports for 120 specified keywords. Whenever one of these keywords occurs, I must extract the 20 words before it and the 20 words after it in order to understand the context.
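
To make the requirement concrete, here is a minimal Python sketch of the keyword-in-context extraction, written outside RapidMiner. It is only an illustration under assumptions: the function name `keyword_contexts`, the simple `\w+` word tokenization, and the sample sentence are hypothetical and not taken from any RapidMiner operator. Keywords are assumed to be single words.

```python
import re

def keyword_contexts(text, keywords, window=20):
    """Return (keyword, context) pairs with up to `window` words on each side."""
    # Assumption: a plain \w+ split stands in for a real tokenizer.
    tokens = re.findall(r"\w+", text)
    wanted = {k.lower() for k in keywords}
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in wanted:
            # Clamp the window at the start and end of the document.
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            hits.append((tok.lower(), " ".join(tokens[start:end])))
    return hits

# Hypothetical sample text; window=2 keeps the example short.
sample = "The company reported a strong increase in revenue despite the risk of inflation"
contexts = keyword_contexts(sample, ["revenue", "risk"], window=2)
for keyword, context in contexts:
    print(keyword, "->", context)
```

With `window=20` the same function would return the 20 words before and after each hit, or fewer when the keyword is close to the beginning or end of a report.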

My current state:
With "Tokenize" I can split the text into sentences, but how can I get exactly 20 words before and after the keyword?
With "Filter Tokens (by Content)" I can only ever display one of the 120 words at a time. How do I make sure that all 120 words are taken into account in a single run?
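
One common way to check many keywords in a single pass is to combine them into one regular expression. The sketch below shows the idea in Python; the helper name `build_keyword_pattern` and the three sample keywords are hypothetical. Whether such an expression can be used directly in an operator's regular-expression parameter depends on the operator and is not verified here.

```python
import re

def build_keyword_pattern(keywords):
    """Build one case-insensitive pattern that matches any of the keywords."""
    # Escape each keyword and put longer ones first, so that a multi-word
    # keyword is preferred over a shorter keyword it starts with.
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

# Hypothetical keyword list; the real list would hold all 120 keywords.
pattern = build_keyword_pattern(["risk", "revenue", "cash flow"])
text = "Revenue grew, but cash flow and risk worsened. Risk remains."
found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
print(found)
```

The set comprehension collapses repeated hits, so `found` lists each keyword that occurs in the text once, regardless of how often or in which capitalization it appears.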

I have been working on this for quite a while now and have searched through all kinds of forum entries without finding a suitable solution. I hope you can help me. Thanks a lot!

Best regards
