"Determine text similarity?"
Legacy User
Hi,
is it possible to use RapidMiner to determine the similarity of two texts (e.g. using cosine similarity)?
I played around with RapidMiner and the text plugin. I managed to create word vectors using TextInput and applied StringTokenizer, EnglishStopwordFilter and PorterStemmer.
But now I'm stuck. How can I compare two text files and determine their similarity?
I'm thankful for any hint!
Answers
Did you try out the operator ExampleSet2Similarity? If you search for "similarity" in the field below the operator groups in the "New Operator" tab or in the text field of the "New Operator" dialog, this operator (and other similarity-related ones) should come up...
Cheers,
Ingo
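For anyone who wants to see what a cosine-similarity comparison of two word vectors actually computes, here is a minimal sketch in plain Python, outside RapidMiner. It is not the ExampleSet2Similarity implementation. The function names, the tiny stopword list and the regex tokenizer are all assumptions made for illustration, and stemming (the PorterStemmer step) is left out to keep the example dependency-free.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, NOT the list used by
# RapidMiner's EnglishStopwordFilter).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on"}


def word_vector(text):
    """Lowercase the text, keep runs of letters as tokens, drop stopwords,
    and count how often each remaining term occurs."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(count * vec_b[term] for term, count in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty vector has no direction; treat the similarity as 0.
        return 0.0
    return dot / (norm_a * norm_b)


def text_similarity(text_a, text_b):
    """Hypothetical helper: build both word vectors and compare them."""
    return cosine_similarity(word_vector(text_a), word_vector(text_b))
```

To compare two files, read each one into a string and pass both to `text_similarity`. Values close to 1 mean the texts use nearly the same terms in similar proportions, and 0 means they share no terms after filtering.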
thanks for your answer.
I tried that, but it gives me an illegal argument exception: null or zero length argument @ ExampleSet2Similarity
However, within DataStatistics, I have output which looks like this:
My process looks like this:
Hmm, I just tried that myself with a data set delivered together with the Text plugin and everything seems to work normally. Here is the process:
Cheers,
Ingo