[Solved] Weighting examples
Dear all,
Does anyone happen to know whether there is a good way to weight examples?
I would like newer examples to receive higher weights, for example:
ID att1 att2 weight
a 12 45 1
b 10 27 2
c 33 17 3
I tried to loop over all examples and write the iteration macro into a new generated attribute but that didn't work.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
    <process expanded="true" height="458" width="640">
      <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
      <operator activated="true" class="loop_examples" compatibility="5.2.003" expanded="true" height="76" name="Loop Examples" width="90" x="179" y="30">
        <process expanded="true" height="476" width="640">
          <operator activated="true" class="generate_attributes" compatibility="5.2.003" expanded="true" height="76" name="Generate Attributes" width="90" x="45" y="30">
            <list key="function_descriptions">
              <parameter key="weight" value="%{example}"/>
            </list>
          </operator>
          <connect from_port="example set" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="example set"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="sink_example set" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" height="76" name="Set Role" width="90" x="313" y="30">
        <parameter key="name" value="weight"/>
        <parameter key="target_role" value="weight"/>
        <list key="set_additional_roles"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Loop Examples" to_port="example set"/>
      <connect from_op="Loop Examples" from_port="example set" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Thank you for sharing your ideas...
Sachs
Answers
You can try the Generate ID operator.
Best,
Nils
Hi Nils,
Thank you for your idea. It seems that I was not precise enough in my wording. The weight is not supposed to increase by one for each example, but by a step that can be different each time the process is run.
So weights like these might also have to be applied:
ID att1 att2 weight
a 12 45 2
b 10 27 4
c 33 17 6
Bye for now & take care
Sachs
Best
H
OK, that means I have to create an ID first. In a second step, I can then generate another attribute that is a function of the ID.
Thank you very much
Kind regards
Sachs
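For readers following along, here is a minimal sketch of that two-step approach. It is not the process posted in this thread. It assumes that Generate ID creates a numeric attribute named `id` that counts up from 1 in example order, and that Generate Attributes can reference it in an expression. The factor `2` is a placeholder for whatever step the current run needs.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" name="Generate Data"/>
      <!-- Step 1: add a running number per example (assumed attribute name: id) -->
      <operator activated="true" class="generate_id" compatibility="5.2.003" expanded="true" name="Generate ID"/>
      <!-- Step 2: derive the weight from the running number; 2 is a placeholder step -->
      <operator activated="true" class="generate_attributes" compatibility="5.2.003" expanded="true" name="Generate Attributes">
        <list key="function_descriptions">
          <parameter key="weight" value="id * 2"/>
        </list>
      </operator>
      <!-- Step 3: give the new attribute the weight role -->
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" name="Set Role">
        <parameter key="name" value="weight"/>
        <parameter key="target_role" value="weight"/>
        <list key="set_additional_roles"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>
```

With a step of 2, this should yield weights 2, 4, 6, ... in example order, matching the second table above.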
When I generate a new ID, the former ID is removed. Therefore, I first have to assign another role to the former ID, then generate a new ID, set the role of the new ID to weight, and finally set the role of the former ID back to ID. Just wanted to share that...
All the best
Sachs
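The role juggling described above could be sketched roughly as follows. This is an illustration under assumptions, not the original process: the existing ID attribute is given the hypothetical name `date`, the generated ID is assumed to be named `id`, and the example set is assumed to arrive at the process input port `input 1`.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
    <process expanded="true">
      <!-- 1. Park the former ID ("date" is a hypothetical name) as a regular attribute -->
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" name="Park Former ID">
        <parameter key="name" value="date"/>
        <parameter key="target_role" value="regular"/>
        <list key="set_additional_roles"/>
      </operator>
      <!-- 2. Generate a new running ID (assumed attribute name: id) -->
      <operator activated="true" class="generate_id" compatibility="5.2.003" expanded="true" name="Generate ID"/>
      <!-- 3. Use the new ID as the weight -->
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" name="New ID to Weight">
        <parameter key="name" value="id"/>
        <parameter key="target_role" value="weight"/>
        <list key="set_additional_roles"/>
      </operator>
      <!-- 4. Give the former ID its id role back -->
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" name="Restore Former ID">
        <parameter key="name" value="date"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles"/>
      </operator>
      <connect from_port="input 1" to_op="Park Former ID" to_port="example set input"/>
      <connect from_op="Park Former ID" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="New ID to Weight" to_port="example set input"/>
      <connect from_op="New ID to Weight" from_port="example set output" to_op="Restore Former ID" to_port="example set input"/>
      <connect from_op="Restore Former ID" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>
```

The order matters: the former ID has to lose its id role before Generate ID runs, otherwise it is removed as described above.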
Hi haddock,
I tried your proposal and found that it works if the former id is a number.
However, in my data set the id is a date, and in this case it doesn't work. I have no idea why.
The attached sample process implements your proposal. The result is that the "data" id attribute is missing.
If you connect and activate the two "Set Role" operators as described in my last post, it works.
Seems to be a bug related to the date type.
http://datahost.bplaced.net/sample4.xls
Best regards
Sachs
Don't want to sound like the Thought Police, so here are some tips towards RM Nirvana.
1. Treat dates as dates!
2. Observe Marius' etiquette on questions.
3. Be careful about bug calling.
That being said, here's some code.
Best
H
Hi haddock,
You are right, bug calling was probably a little too hasty.
Referring to the issue again: in my process the date column has type "date" and role "id". My understanding is therefore that it is already treated as a date. Consequently, I don't understand why I cannot have a column that is both type "date" and role "id" at the same time in the given setup. (For the code, see my last post.)
All the best
Sachs
Fair enough, you can always declare "date" as an "id" later. That avoids your double id issue and saves an operator, because you can use the "no double ids" property to your advantage, like this.. I spend most of my time CUDA programming, and am probably a bit obsessed by speed and clarity!
Best
H
PS Ignore (nearly always) the warnings, they are only warnings, just press the green j!
PPS Green if running on RA, Blue on RM.
So it seems to be a kind of hidden feature that RapidMiner allows only a single ID in the data set and removes any others automatically.
Thanks & have a nice day
Sachs
Indeedy, data doesn't make much sense when it has more than one identity, a bit like humans. On the other hand, we each contributed to a neat solution, so grouping is cool 8)
Best
H