Answers
This question has been asked a few times over the last few days. Here are the answers:
You have two options.
1. Load the data sets and merge them. Calculate a similarity measure for the merged data set. Filter out the pairs that do not include your single example. Sort the rest and use the pair with the highest similarity. All the necessary operators are part of RapidMiner.
2. If the amount of data is rather large, calculating the full similarity matrix is probably not feasible. In that case, iterate over the examples: for each one, calculate its similarity to your single example of interest and store it via ProcessLog. Afterwards you can convert the process log back into a data set, sort it, etc.
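The two options above can be sketched outside RapidMiner as follows. This is only an illustration in Python, not a RapidMiner process: it assumes Euclidean distance as the similarity measure (smaller distance means more similar), borrows the rows of the historik.txt and dades.txt files posted further down in this thread, and the function names are made up for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Rows of historik.txt as posted in this thread: (id, label, five attributes).
historik = [
    (1, 73, (15, 16, 13, 14, 15)),
    (2, 123, (25, 26, 23, 24, 25)),
    (3, 173, (35, 36, 33, 34, 35)),
    (4, 224, (45, 46, 43, 44, 46)),
    (5, 274, (55, 56, 53, 54, 56)),
]
query = (25, 26, 23, 24, 25)  # the single row of dades.txt


def rank_by_full_matrix(rows, query):
    """Option 1: merge, compute every pairwise distance, filter, sort."""
    merged = [(label, attrs) for _, label, attrs in rows] + [(None, query)]
    query_index = len(merged) - 1
    # Full pairwise matrix (upper triangle only, since distance is symmetric).
    pairs = [
        (i, j, dist(merged[i][1], merged[j][1]))
        for i, j in combinations(range(len(merged)), 2)
    ]
    # Keep only the pairs that involve the query row, then sort by distance.
    return sorted((d, merged[i][0]) for i, j, d in pairs if j == query_index)


def rank_by_iteration(rows, query):
    """Option 2: visit one example at a time and log its distance to the query."""
    log = []
    for _, label, attrs in rows:
        log.append((dist(attrs, query), label))
    log.sort()
    return log


best_distance, best_label = rank_by_iteration(historik, query)[0]
print(best_label, best_distance)
```

Option 1 computes many distances that are thrown away by the filter, which is why option 2 is the safer choice once the data set grows.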
Cheers,
Ingo
Where is the similar post?
Thanks.
I want to compare two files.
historik.txt
1 73 15 16 13 14 15
2 123 25 26 23 24 25
3 173 35 36 33 34 35
4 224 45 46 43 44 46
5 274 55 56 53 54 56
dades.txt
25 26 23 24 25
The correct result would be the second row of the first file, i.e. the value 123.
With the following process the result is not correct: I get 73. What am I doing wrong?
<operator name="Root" class="Process" expanded="yes">
    <parameter key="resultfile" value="/home/rm_workspace/p2/resultat.res"/>
    <operator name="InputHistorik" class="ExampleSource">
        <parameter key="attributes" value="/home/rm_workspace/p2/historik.aml"/>
    </operator>
    <operator name="FeatureRangeRemoval" class="FeatureRangeRemoval">
        <parameter key="first_attribute" value="1"/>
        <parameter key="last_attribute" value="1"/>
    </operator>
    <operator name="NearestNeighbors" class="NearestNeighbors">
    </operator>
    <operator name="Diari" class="ExampleSource">
        <parameter key="attributes" value="/home/rm_workspace/p2/dades.aml"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
</operator>
The .aml files:
dades.aml
<?xml version="1.0" encoding="UTF-8"?>
<attributeset default_source="dades.dat">
<attribute
name = "dades.txt (1)"
sourcecol = "1"
valuetype = "integer"/>
<attribute
name = "dades.txt (2)"
sourcecol = "2"
valuetype = "integer"/>
<attribute
name = "dades.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
<attribute
name = "dades.txt (4)"
sourcecol = "4"
valuetype = "integer"/>
<attribute
name = "dades.txt (5)"
sourcecol = "5"
valuetype = "integer"/>
</attributeset>
historik.aml
<?xml version="1.0" encoding="UTF-8"?>
<attributeset default_source="historik.dat">
<attribute
name = "historik.txt (1)"
sourcecol = "1"
valuetype = "integer"/>
<label
name = "historik.txt (2)"
sourcecol = "2"
valuetype = "integer"/>
<cluster
name = "historik.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
<attribute
name = "historik.txt (4)"
sourcecol = "4"
valuetype = "integer"/>
<attribute
name = "historik.txt (5)"
sourcecol = "5"
valuetype = "integer"/>
<attribute
name = "historik.txt (6)"
sourcecol = "6"
valuetype = "integer"/>
<attribute
name = "historik.txt (7)"
sourcecol = "7"
valuetype = "integer"/>
</attributeset>
How can I do it?
Thanks.
The answer to your problem is that for some reason only known to yourself you call column three a cluster!
<cluster
name = "historik.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
I've laid out the data in one file like this...
1 73 15 16 13 14 15
2 123 25 26 23 24 25
3 173 35 36 33 34 35
4 224 45 46 43 44 46
5 274 55 56 53 54 56
6 ? 25 26 23 24 25
and made the necessary code changes to this... and rather unsurprisingly the correct answer emerges.
So the answer to "How can I do it?" is: with more care!
Your code is not the solution. I want to compare attributes 3-7 of file 1 with the attributes of file 2, and the result should be attribute 2 of file 1.
The "cluster" column was a mistake on my part.
I want to obtain one value from the second column of file 1: the value in the row of file 1 whose attributes are the same as those in file 2.
In my example, comparing the two files should give the second column of the second row of file 1.
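To make the requirement concrete, here is a minimal sketch in Python (not a RapidMiner process) of the lookup described above, assuming an exact match on columns 3-7 is wanted rather than a nearest-neighbour search. The file contents are the ones posted earlier in this thread; the helper names `parse_rows` and `lookup_exact` are invented for the illustration.

```python
# Contents of the two files as posted in this thread.
HISTORIK = """\
1 73 15 16 13 14 15
2 123 25 26 23 24 25
3 173 35 36 33 34 35
4 224 45 46 43 44 46
5 274 55 56 53 54 56
"""
DADES = "25 26 23 24 25\n"


def parse_rows(text):
    """Split whitespace-separated integer rows, skipping blank lines."""
    return [
        [int(token) for token in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]


def lookup_exact(historik_rows, query):
    """Return column 2 of the row whose columns 3-7 equal the query, else None."""
    # Key: the five attribute columns; value: the second column.
    index = {tuple(row[2:]): row[1] for row in historik_rows}
    return index.get(tuple(query))


result = lookup_exact(parse_rows(HISTORIK), parse_rows(DADES)[0])
print(result)  # 123
```

If no row matches exactly, this returns None, whereas a nearest-neighbour model would still return the label of the closest row.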
Thanks.
1, 73, 15, 16, 13, 14, 15
2, 123, 25, 26, 23, 24, 25
3, 173, 35, 36, 33, 34, 35
4, 224, 45, 46, 43, 44, 46
5, 274, 55, 56, 53, 54, 56
6, , 25, 26, 23, 24, 25
For the same reason I've taken out the second data read and replaced it with a datacopy, like this... If I run this I get "123" as the answer, just like before, so I'm puzzled as to what you mean when you say that my code is not the solution. Perhaps you could enlighten us?
Thanks, haddock.
I will try it.