
Coreference resolution with RapidMiner: how to begin?

andreas Member Posts: 1 Learner III
edited November 2018 in Help
Dear All,

I have been playing with RM for some time, but now it is time to do something real, and I don't quite know how to proceed. The task is direct nominal coreference resolution, i.e. clustering the mentions in a text into sets, given a series of documents in which the mentions are already properly clustered.

To keep it as simple as possible, I guess we can leave text processing out of the process and represent the data as a table with tokens in rows and attributes in columns (the attributes being the usual properties, from gender and number up to some more complex ones).

Issue 1: does such a representation make sense? How can we represent different documents (with another attribute, e.g. a document number?) and clusters (with a cluster number?) How should validation be organized? If the samples are documents rather than individual tokens, how should the clusters be represented? Please advise.
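
To make the question concrete, here is a minimal sketch in plain Python (not a RapidMiner process) of the layout I have in mind. All column names (doc_id, mention_id, cluster_id and so on) and the sample rows are made up for illustration. It also assumes that cluster numbers are local to a document and that validation keeps whole documents together, which may or may not be the right way to do it.

```python
from collections import defaultdict

# Hypothetical table: one row per token/mention, one key per attribute.
# The values are invented purely to illustrate the layout.
ROWS = [
    {"doc_id": 1, "mention_id": 0, "token": "Anna",  "gender": "f", "number": "sg", "cluster_id": 0},
    {"doc_id": 1, "mention_id": 1, "token": "she",   "gender": "f", "number": "sg", "cluster_id": 0},
    {"doc_id": 1, "mention_id": 2, "token": "books", "gender": "n", "number": "pl", "cluster_id": 1},
    {"doc_id": 2, "mention_id": 0, "token": "Tom",   "gender": "m", "number": "sg", "cluster_id": 0},
    {"doc_id": 2, "mention_id": 1, "token": "he",    "gender": "m", "number": "sg", "cluster_id": 0},
    {"doc_id": 3, "mention_id": 0, "token": "cars",  "gender": "n", "number": "pl", "cluster_id": 0},
    {"doc_id": 3, "mention_id": 1, "token": "they",  "gender": "n", "number": "pl", "cluster_id": 0},
]


def clusters_by_document(rows):
    """Collect mention ids per (doc_id, cluster_id) pair.

    Assumption: cluster ids are only unique within one document,
    so the document number has to be part of the key.
    """
    clusters = defaultdict(list)
    for row in rows:
        clusters[(row["doc_id"], row["cluster_id"])].append(row["mention_id"])
    return dict(clusters)


def document_folds(rows, n_folds):
    """Distribute whole documents over folds (round-robin on sorted doc ids)."""
    doc_ids = sorted({row["doc_id"] for row in rows})
    folds = [[] for _ in range(n_folds)]
    for i, doc_id in enumerate(doc_ids):
        folds[i % n_folds].append(doc_id)
    return folds


def split_by_documents(rows, test_docs):
    """Split rows so that every document lands entirely in train or in test."""
    test_docs = set(test_docs)
    train = [r for r in rows if r["doc_id"] not in test_docs]
    test = [r for r in rows if r["doc_id"] in test_docs]
    return train, test


if __name__ == "__main__":
    print(clusters_by_document(ROWS))
    for test_docs in document_folds(ROWS, n_folds=2):
        train, test = split_by_documents(ROWS, test_docs)
        print("test docs:", test_docs, "train rows:", len(train), "test rows:", len(test))
```

The point of splitting by document number rather than by row is that the mentions of one cluster never end up partly in training and partly in testing; whether this is the sensible way to organize validation is exactly what I am unsure about.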

Issue 2: how should the process be organized to make it work? Can you suggest anything?

Best,
Andreas