Weighted Sum of Different Attributes
Hi everybody, I'm new to RapidMiner and I'm doing a project with a medical database.
I have 17 attributes with values of 1 and 0. Each attribute carries a score: some 1, some 2, some 3, and others 6. I want to create a new attribute that contains the sum of the scores, depending on the values of the different attributes.
For example, I want to count the score of Attribute_1 to Attribute_17 only when the attribute is 1, and then add up those scores in one new attribute (Sum Score).
I know this must be an easy problem, but I can't seem to find the answer. I tried "Generate Attributes" with "if-then" logic, but I can't sum the scores of the different attributes; I only get the score of the last positive one.
Thank you in advance, I hope you can help me.
Best Answers
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi again @emanuelmcruz,
If the attributes can only take two values (0 or 1), you can use the Generate Aggregation operator:
<?xml version="1.0" encoding="UTF-8"?>
<process version="8.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="8.2.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
        <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Somme_Si\Somme_Si.xlsx"/>
        <parameter key="imported_cell_range" value="A1:G3"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Att1.true.integer.attribute"/>
          <parameter key="1" value="Att2.true.integer.attribute"/>
          <parameter key="2" value="Att3.true.integer.attribute"/>
          <parameter key="3" value="Att4.true.integer.attribute"/>
          <parameter key="4" value="Att5.true.integer.attribute"/>
          <parameter key="5" value="Att6.true.integer.attribute"/>
          <parameter key="6" value="Att7.true.integer.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="generate_aggregation" compatibility="8.2.000" expanded="true" height="82" name="Generate Aggregation" width="90" x="179" y="34">
        <parameter key="attribute_name" value="sum"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Generate Aggregation" to_port="example set input"/>
      <connect from_op="Generate Aggregation" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
I hope it helps,
Regards,
Lionel
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
Counting the positives is the same as computing the sum if your possible values are only zero or one. So "Generate Aggregation" across your 17 attributes with the function "sum" should do the trick for you.
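To illustrate why the count and the sum coincide for 0/1 data, here is a minimal Python sketch (outside RapidMiner). The attribute names and values are made up for the example and are not taken from the poster's dataset.

```python
# Hypothetical row of binary flags; names and values are illustrative only.
row = {"Att1": 1, "Att2": 0, "Att3": 1, "Att4": 1, "Att5": 0, "Att6": 0, "Att7": 1}

# Counting the attributes that equal 1 ...
count_positive = sum(1 for value in row.values() if value == 1)

# ... gives the same number as simply adding up all the values,
# because every value is either 0 or 1.
plain_sum = sum(row.values())

print(count_positive, plain_sum)
```

This equality only holds while every value is exactly 0 or 1; it says nothing about attributes that carry different scores.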
Answers
Hi @emanuelmcruz
Can you
- share your dataset and
- based on an extract of your dataset, post an example of what you want to obtain.
Regards,
Lionel
I want to make a new column that holds the number of the preceding attributes that are positive.
For example, in this image I want a column that counts how many of the 17 different attributes are positive.
It works like an index: if an attribute is positive it has a score, and I want to sum those scores over the attributes that are positive.
It's a Charlson Comorbidity Index.
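The scoring described above (each positive attribute contributes its own score) can be sketched in plain Python as a weighted sum. This is only an illustration under assumptions: the function name `weighted_score`, the attribute names, and the weights below are hypothetical and are not the actual Charlson weights or the poster's columns.

```python
def weighted_score(row, weights):
    """Add up the weight of every attribute whose flag is 1.

    Multiplying each weight by its 0/1 flag keeps the weight when the
    flag is 1 and drops it when the flag is 0.
    """
    return sum(weights[name] * flag for name, flag in row.items())


# Hypothetical weights per attribute (illustrative, not the real index).
weights = {"Attribute_1": 1, "Attribute_2": 2, "Attribute_3": 3, "Attribute_4": 6}

# Hypothetical patient row: attributes 1, 3 and 4 are positive.
row = {"Attribute_1": 1, "Attribute_2": 0, "Attribute_3": 1, "Attribute_4": 1}

sum_score = weighted_score(row, weights)
print(sum_score)
```

Because the flags are 0 or 1, no if-then branching is needed. The same arithmetic (weight times attribute, added across attributes) could presumably be written as a single expression in RapidMiner, but that is an assumption and not something confirmed in this thread.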
Hi again @emanuelmcruz,
To be sure I understand: what you want to obtain looks like this (example with 7 attributes):
[image: example with 7 attributes]
Regards,
Lionel
Exactly like that. The attributes can only be 1 or 0, but the last column is correct.
How can I do that?
I used Generate Aggregation, but I can't seem to get the parameters right. Do I have to use Select Attributes and then Generate Aggregation? And which parameters should I use so that only the positives are counted?
Hi again @emanuelmcruz,
I have difficulty understanding the content of your dataset:
I thought you had only 0 and 1 in your dataset, so all your values are positive?
Finally, to sum up: you have 17 attributes with only 0 and 1, and additional attribute(s) with negative values? Is that right?
Regards,
Lionel