Stop Word and Stemming List / Dictionary
Dear All...
I've been using RapidMiner for quite some time, especially its text mining functions. I am having difficulty retrieving the stop word list and the stemming (Snowball) list, both for English. Having these lists would help me update their content and improve the precision of my text mining process. I would be grateful if anybody could share these lists (stop word and stemming) with me, or at least let me know where or how I can find them. Your kind assistance is highly appreciated.
Thanks.
Answers
There is much more detailed information about the Snowball project here: https://snowballstem.org/algorithms/
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
hello @arizah78 -
Just to add a bit: there was a similar request in another thread for the Arabic stopword list, and I'm looking into it. The lists are easy to access; we just want to make sure that we're allowed to share them (the extension is not open-source, and hence the author of the list has copyright ownership by default). I will let folks here in the community know when I get an answer.
Thanks for understanding.
Scott
Hi Scott,
Thanks for your update.
I really hope to get positive feedback soon.
Thanks. I appreciate you sharing the link.
hello @arizah78 - I have the code to the extension (which contains the wordlists) and it is indeed open source. I will send you the file via PM.
Scott
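
For anyone following along who wants to extend a stop word list once they have one, here is a minimal sketch in plain Python. It assumes the list is a plain-text file with one word per line; that format is an assumption, not something confirmed in this thread. The sample words and the helper names `parse_stopwords` and `remove_stopwords` are hypothetical and are not RapidMiner's built-in English list or API.

```python
import re

# Hypothetical sample list, one word per line.
# This is NOT RapidMiner's built-in English stop word list.
SAMPLE_STOPWORDS = """\
the
a
of
and
is
"""

def parse_stopwords(text):
    """Parse a one-word-per-line list, ignoring blank lines and case."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}

def remove_stopwords(document, stopwords):
    """Lowercase the text, tokenize on runs of letters, then drop stop words."""
    tokens = re.findall(r"[a-z']+", document.lower())
    return [t for t in tokens if t not in stopwords]

stopwords = parse_stopwords(SAMPLE_STOPWORDS)
stopwords.add("rapidminer")  # extend the list with a domain-specific term

print(remove_stopwords("The precision of RapidMiner is a concern", stopwords))
# prints ['precision', 'concern']
```

Keeping the list as a set makes each lookup cheap, and adding or removing a term is a one-line change before the filter runs. Stemming is left out here because a faithful Snowball stemmer is too long to sketch; the algorithm descriptions linked above are the place to look for that.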