Jaccard distance with RapidMiner
Hello,
First, I want to know whether the Jaccard distance is implemented in RapidMiner. I would like to compute the distance between objects/items using the RapidMiner widgets. Also, I don't know what data format is expected. For example, say I need to find the distances between the items in the following matrix and rank them: how can I do that using widgets?
       val1  val2  val3  val4
item1  1     0     1     1
item2  0     1     0     1
item3  1     1     1     0
item4  0     1     1     1
Answers
To calculate distances in general, e.g. the distance from each example to every other example, use the Cross Distances operator. Connect the same dataset to both the req and ref input ports. You can then choose the measure you'd like to calculate from the ones the operator offers.
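For checking whatever the operator returns, here is a minimal Python sketch of the same idea outside RapidMiner: every item is compared with every other item and the pairs are ranked by distance. It assumes each row of the matrix in the question is a 0/1 membership vector and that the Jaccard distance is 1 minus (attributes set in both items) / (attributes set in at least one item). The names `items` and `jaccard_distance` are made up for this illustration and are not part of RapidMiner.

```python
from itertools import combinations

# The 0/1 matrix from the question: one row per item, one column per val1..val4.
items = {
    "item1": [1, 0, 1, 1],
    "item2": [0, 1, 0, 1],
    "item3": [1, 1, 1, 0],
    "item4": [0, 1, 1, 1],
}


def jaccard_distance(a, b):
    """Return 1 - |A and B| / |A or B| for two equal-length 0/1 vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        # Assumption: two all-zero rows are treated as identical (distance 0).
        return 0.0
    return 1.0 - both / either


# Compare every unordered pair of items and rank from closest to farthest.
ranked = sorted(
    (jaccard_distance(items[p], items[q]), p, q)
    for p, q in combinations(items, 2)
)

for dist, p, q in ranked:
    print(f"{p} - {q}: {dist:.3f}")
```

With the example matrix, item2 and item4 come out closest: they share val2 and val4 out of three attributes set in either one, so the distance is 1 - 2/3. If the values from Cross Distances differ, it is worth checking whether the selected measure reports a similarity rather than a distance.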
Best regards,
Marius
There is a chapter in this upcoming book http://www.crcpress.com/product/isbn/9781482205497 that describes how to do this. If you can't wait that long, I can tell you that it uses an R package called profdpm. The pci() function in that package calculates the Jaccard index.
regards
Andrew