Filter Stopwords operator's result

Mohamad1367 · May 2020

Hi. I want to see the result of stop word filtering on my data set after applying this operator, but I receive a collection of documents in the Results view. I put the Filter Stopwords operator inside a Loop Collection operator. What should I do to solve this problem?

Image: https://us.v-cdn.net/6030995/uploads/editor/qt/bswrcrr1sjj1.png

sara20 · May 2020

@Mohamad1367

Hello

Please look at the screenshot below. Following it, first download the .rmp file, then import it into RapidMiner.

Image: https://us.v-cdn.net/6030995/uploads/editor/f6/hliyxilr7i44.png

I hope this helps
sara

lionelderkrikor · May 2020

Hi @Mohamad1367,

This is the normal behavior. You have to select an element of the collection to see, for that document, the result of applying the Filter Stopwords operator.
But I guess your final goal is not just to view a document after applying Filter Stopwords, right?

So it would be more useful to share your data (presumably the example set called "test") and describe explicitly what you ultimately want to do.
That way we could help you more efficiently.

Regards,

Lionel

Mohamad1367 · May 2020

Thanks for your response, @lionelderkrikor. Let me describe what I want to achieve: I have a data set in Persian and want to do sentiment analysis on it. Each row in the data set has a sentiment label; for example, label = 5 means the sentence is very positive.

I want to apply some text preprocessing steps to it, such as tokenization, stop word filtering, stemming, etc.

For tokenization I installed the Rosette extension, which supports Persian.

I am sharing my data set here. Which operators should I use to achieve this goal, and in what sequence?

lionelderkrikor · May 2020

@Mohamad1367,

Unfortunately, I'm not aware of any stop word filter, stemming operator, etc. for Persian in the Rosette extension or in RapidMiner.
You could take a look at this Python text processing library:

https://github.com/sobhe/hazm

Regards,

Lionel

sara20 · May 2020

@Mohamad1367
Hello

There are some good posts about Persian text mining, and there is also a stop word list for Persian in RM. I recommend searching the community; you can find a lot of useful posts on this.

Best regards
sara

lionelderkrikor · May 2020

@Mohamad1367,

@sara20 is right: there are resources for Persian text processing, including a stop word dictionary. (Sorry for my previous post; I had not checked the community site.)
In particular, look at this thread, which includes a post by @sgenzer explaining where to find a dictionary of Persian stop words:
https://community.rapidminer.com/discussion/55674/persian-dictionary

Hope this helps,

Regards,

Lionel

sara20 · May 2020

There is also another stop word list here

Mohamad1367 · May 2020

@sara20 @lionelderkrikor thanks for your responses. I already have a Persian stop word dictionary; I forgot to upload it in my previous comment. My problem is that when I apply the stop word filter operator to my data set, I want to see the filtered result in the Results view, but I can't.

I use the Rosette extension only for tokenization. For other tasks such as stemming and stop word filtering I use the Text Processing extension, which is language independent and only needs a dictionary.
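Dictionary-based filtering is language independent because it is just an exact string match against a word list. As a rough illustration only (this is not the RapidMiner operator's implementation; the file name, the three sample stop words, and the helper names `load_stopwords` and `filter_stopwords` are hypothetical), the idea can be sketched in Python:

```python
import tempfile
from pathlib import Path


def load_stopwords(path):
    """Read a dictionary file with one stop word per line."""
    # "utf-8-sig" drops a leading byte order mark, which would otherwise
    # stick to the first word and stop it from matching.
    text = Path(path).read_text(encoding="utf-8-sig")
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_stopwords(tokens, stopwords):
    """Keep only the tokens that are not in the stop word set."""
    return [tok for tok in tokens if tok not in stopwords]


# Hypothetical three-word dictionary, written to a temporary file for the demo.
with tempfile.TemporaryDirectory() as tmp:
    dict_path = Path(tmp) / "persian_stopwords.txt"
    dict_path.write_text("و\nاز\nبه\n", encoding="utf-8")
    stopwords = load_stopwords(dict_path)

# Already-tokenized sample sentence (tokenization is assumed to happen earlier).
tokens = ["این", "فیلم", "از", "همه", "بهتر", "و", "جذاب", "بود"]
filtered = filter_stopwords(tokens, stopwords)
print(filtered)
```

Because the match is exact, a token is removed only if it is character-for-character identical to a dictionary entry, so the dictionary file's encoding and spelling conventions need to match the tokenizer's output.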

lionelderkrikor · May 2020

@Mohamad1367,

Looking at your dataset, I think I understand what you want to achieve: ultimately you want to create a model for sentiment classification, right?
In that case, you will need the Process Documents from Data operator, and you have to put all your text processing steps (Tokenize (again), Filter Stopwords) INSIDE this operator.
Please check the process in the attached file. At the output of this process you will see a word vector with the Persian stop words filtered out. (Don't forget to set the path to the dictionary file where your stop words are stored.)
From this starting point, you can create a model to perform sentiment classification by adding a Set Role operator and a model (a classifier) of your choice after the Process Documents from Data operator.
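To make the data flow concrete, here is a minimal plain-Python sketch of the same idea: tokenize and filter each row inside one per-document step, then build a term-occurrence word vector and keep the label alongside it for a classifier. It is only an illustration under stated assumptions, not what RapidMiner does internally: whitespace splitting stands in for a real Persian tokenizer, the English sample rows and stop words are placeholders, and the function names are hypothetical.

```python
from collections import Counter


def process_document(text, stopwords):
    """Per-document steps: tokenize, then drop stop words."""
    tokens = text.split()  # stand-in for a real Persian tokenizer
    return [tok for tok in tokens if tok not in stopwords]


def build_word_vectors(rows, stopwords):
    """Turn (text, label) rows into term-occurrence vectors plus labels."""
    docs = [process_document(text, stopwords) for text, _ in rows]
    # One column per distinct remaining token, in sorted order.
    vocabulary = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    labels = [label for _, label in rows]
    return vocabulary, vectors, labels


# Placeholder data: (sentence, sentiment label) pairs and a tiny stop word set.
rows = [("the film was great", 5), ("the film was bad", 1)]
stopwords = {"the", "was"}

vocabulary, vectors, labels = build_word_vectors(rows, stopwords)
print(vocabulary)
print(vectors)
print(labels)
```

The stop words never become columns of the word vector, which is why the filtering has to happen inside the per-document step rather than after the vector is built.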

Hope this helps,

Regards,

Lionel

Mohamad1367 · May 2020

Thanks for your answer, @lionelderkrikor. I know this is probably obvious, but please explain a bit more about the attached file: how can I run it? By dragging and dropping the attached file into the Design view and just connecting it to the result port?

Mohamad1367 · May 2020

@sara20 thank you very much

Mohamad1367 · May 2020

@lionelderkrikor I ran the process that you attached in your previous post, but I receive only the tokenized result; the stop words were not filtered. I have attached a screenshot of my result here. Can you help me, please?
