Filter Stopwords operator's result
Mohamad1367
Member Posts: 22 Learner III
Hi. I want to see the result of filtering stop words in my data set after applying this operator, but I receive a collection of documents in the result view. I put the Filter Stopwords operator inside a Loop Collection operator. What should I do to solve this problem?
Best Answer
sara20 Member Posts: 110 Unicorn
@Mohamad1367
Hello
Please look at the screenshot, then, following it, first download the .rmp file and import it into your RapidMiner.
I hope this helps
sara2
Answers
This is the normal behavior. You have to select one element of the collection to see the result of the Filter Stopwords operator for that document.
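As a rough analogy outside RapidMiner (plain Python, not the operators themselves), looping a filter over a collection gives back a collection, and you look at one element at a time. The stopword list, the documents, and the `filter_stopwords` helper below are all made up for illustration.

```python
# Hypothetical stopword list and documents, for illustration only.
STOPWORDS = {"the", "is", "a"}
documents = ["the movie is great", "a boring film"]


def filter_stopwords(text):
    """Split on whitespace and drop tokens found in STOPWORDS."""
    return [token for token in text.split() if token not in STOPWORDS]


# Looping over the collection yields one result per document,
# so the overall result is again a collection.
filtered_collection = [filter_stopwords(doc) for doc in documents]

# To inspect a result, select a single element of the collection.
first_document = filtered_collection[0]
print(first_document)
```
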
But I guess that your final goal is not just to see your document after applying the Filter Stopwords operator, right?
So it would be more useful to share your data (presumably the example set called "test") and describe explicitly what you ultimately want to do.
This way we could help you more efficiently...
Regards,
Lionel
Unfortunately, I'm not aware of any stopword filter, stemming operator, etc. for Persian in the Rosette extension or in RapidMiner.
You could take a look at this Python text processing library:
https://github.com/sobhe/hazm
Regards,
Lionel
Hello
There are some good posts about Persian text mining, and there is also a stopword list for it in RM. I recommend searching the community; you can find a lot of useful posts on this.
Best regards
sara
@sara20 is right, there are resources for Persian text processing, including a stopwords dictionary. (Sorry for my previous post, I had not checked the community site...)
In particular, look at this thread, which includes a post by @sgenzer explaining where to find a dictionary of Persian stopwords:
https://community.rapidminer.com/discussion/55674/persian-dictionary
Hope this helps,
Regards,
Lionel
Based on your dataset, I think I understand what you want to achieve: ultimately, you want to create a model for sentiment classification, right?
In that case, you will need the Process Documents from Data operator, with all your text processing steps (Tokenize (again), Filter Stopwords) placed INSIDE this operator.
Please check the process in the attached file. At the output of this process you will see a word vector with the Persian stopwords filtered out. (Don't forget to set the path to your stopword dictionary file...)
From this starting point, you can create a model to perform sentiment classification by adding a Set Role operator and a model (a classifier) of your choice after the Process Documents from Data operator.
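For readers who want to see the idea behind these steps, here is a minimal sketch in plain Python. It is not the attached process and not how RapidMiner implements its operators. It assumes a stopword dictionary with one word per line, whitespace tokenization, and raw term counts as the word vector; all function names, the sample texts, and the labels are hypothetical, and English words stand in for the Persian entries you would read from your dictionary file.

```python
from collections import Counter


def load_stopwords(lines):
    """Build a stopword set from an iterable of lines (one word per line)."""
    return {line.strip() for line in lines if line.strip()}


def process_document(text, stopwords):
    """Tokenize on whitespace, then drop every token that is a stopword."""
    return [token for token in text.split() if token not in stopwords]


def word_vectors(texts, stopwords):
    """Return (vocabulary, rows): one row of term counts per input text."""
    token_lists = [process_document(text, stopwords) for text in texts]
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    rows = []
    for tokens in token_lists:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


# Hypothetical stand-ins: in practice the lines would come from the
# dictionary file, e.g. open(path, encoding="utf-8").
stopwords = load_stopwords(["and", "the", "was", ""])
texts = ["the film was good and fun", "the film was bad"]
labels = ["positive", "negative"]  # the column that would get the label role

vocabulary, rows = word_vectors(texts, stopwords)
for label, row in zip(labels, rows):
    print(label, dict(zip(vocabulary, row)))
```

Each row, together with its label, is the kind of example a classifier would then be trained on.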
hope this helps,
Regards,
Lionel