similarity
Hello, how are you? How can I use Data to Similarity to calculate the similarity of one document with all the rows in a database, and then choose the most similar row?
Thank you
Best Answer
BalazsBarany (Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert)
Hi!
You would use these operators:
- Read Database for getting the data
- Process Documents from Data (from the Text Processing extension) to create a document vector
- A second Read Database (or different data source) for the data to compare
- A second Process Documents from Data, with the wor (word list) output of the first one connected to its word list input. This makes sure that both tables have the same structure, i.e. the same word attributes
- Cross Distances
Then you would select the documents with the smallest distance (= largest similarity).
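To make the idea behind these steps concrete, here is a minimal sketch in Python rather than RapidMiner. It is only an illustration under assumptions: the function names (build_wordlist, vectorize, cosine_distance, most_similar) and the sample rows are hypothetical, it uses plain term counts instead of whatever vector creation you configure in Process Documents from Data, and it uses cosine distance as one possible distance measure. The point it shows is the same one as above: the word list is built once from the database rows and reused for the document to compare, so both sides end up with vectors of the same structure.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_wordlist(texts):
    """Collect the sorted vocabulary of the reference texts (the 'word list')."""
    return sorted({token for text in texts for token in tokenize(text)})


def vectorize(text, wordlist):
    """Term-count vector over a fixed word list; words not in the list are ignored."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in wordlist]


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(query, rows):
    """Return (index, distance) of the row closest to the query document."""
    wordlist = build_wordlist(rows)                        # word list from the database rows
    row_vectors = [vectorize(r, wordlist) for r in rows]   # first "Process Documents"
    query_vector = vectorize(query, wordlist)              # second one, reusing the word list
    distances = [cosine_distance(query_vector, v) for v in row_vectors]  # "Cross Distances"
    best = min(range(len(rows)), key=distances.__getitem__)  # smallest distance wins
    return best, distances[best]


# Hypothetical sample data standing in for the database rows.
rows = [
    "invoice for printer paper and toner",
    "meeting notes about the quarterly budget",
    "support ticket: printer does not print",
]
best_index, best_distance = most_similar("my printer will not print", rows)
print(rows[best_index], round(best_distance, 3))
```

In RapidMiner itself you would not write this code; the operators listed above do the same work, and sorting the Cross Distances output by distance gives you the closest rows.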
Regards,
Balázs