The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Stem (dictionary) for greek language
Hello to the community of rapidminer,
i'm trying to create a stemmer for greek language but i can't implement a more general rule for removing punctuations. For example i want words like "fishes","fished","fishing","fishery" to be reduced to "fish". Due to the wide range of punctuations in greek language is too dificult to map every possible punctuation with the origin of the word. So i tried a rule like this:
fish:fish.*
but it didn't work out. Is there any way to do that ?
thank you in advance
Tagged:
0
Answers
That should work, can you post your process?
after Process Documents from Data the WordList (Process Documents from Data) result window is empty so it can't continue to the Validation procedure because it hasn't any attribute. I've tried the same process with and without the stem (dictionary) and the problem is with the stemmer of greek words.
The way you have it should work. I wonder if there's a bug in those stemmer/stopword dictionary operators because they ask you enter the file path and name to the txt file.
Try it with an Open File operator attached to them and let's see if that works.
with your solution the process bypasses the previous error that i mention. But still the stemmer doesn't work no matter what rule i give. Is there any way to implemet a python based stemmer as an rapidminer operator?
Should be possible but might require some work on your side of course. You need to install the Python extension from the Marketplace (https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_python_scripting) and then implement the function yourself.
Cheers,
Ingo
ok. Thank you i'll give it a try!
Hello, any news with your work?