Adding all values in one column
Hello,
I have a column of data with a bunch of numeric values in it. I am trying to add up all the values for a total sum. However, all the operators I have tried have simply added the numeric values across each row and then given me the total for that specific row.
Is there a way I can add all the values in only one column and produce a total sum?
Thanks!
Best Answer
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
Amotley,
I think the Aggregate operator is the one you need, but you may not be familiar with how to use it. Take a look at this quick sample process (see the XML below, which you should be able to paste into your own RapidMiner window to see the process).
All this does is generate some random sales data (100 examples, or rows) and then sum the "single price" attribute (column) to get a total.
Also note that the "aggregate" operator allows you to get other summary statistics (perhaps you want the average as well as the sum) and also create subtotals (the parameter "group by" specifies this) if you want.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_sales_data" compatibility="7.1.001" expanded="true" height="68" name="Generate Sales Data" width="90" x="112" y="136"/>
<operator activated="true" class="aggregate" compatibility="7.1.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="136">
<list key="aggregation_attributes">
<parameter key="single_price" value="sum"/>
</list>
</operator>
<connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
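If you also want the average and subtotals mentioned above, a variant of the Aggregate operator might look like the fragment below. This is only a sketch and not something I have run against your data: it assumes the parameter key for "group by" is `group_by_attributes` and that the generated sales data contains a `product_category` attribute, so check both names in your own process before relying on it.

```xml
<operator activated="true" class="aggregate" compatibility="7.1.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="136">
  <!-- Assumption: "group_by_attributes" is the key behind the "group by" parameter,
       and "product_category" exists in the generated data. Remove this line to get
       one grand total instead of one row per category. -->
  <parameter key="group_by_attributes" value="product_category"/>
  <list key="aggregation_attributes">
    <!-- Same attribute listed twice to request two summary statistics -->
    <parameter key="single_price" value="sum"/>
    <parameter key="single_price" value="average"/>
  </list>
</operator>
```

It would replace the Aggregate operator element in the process above; the connections stay the same. With the group-by line present you should get one row per category, and without it a single row holding the overall sum and average.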
Answers
Hi amotley,
Have you had a look at the Aggregate operator? I think it solves your problem.
Best,
Martin
Dortmund, Germany
Martin,
Yes, I have tried that operator. It still is not adding up all the values in the single column the way I would like it to.
For example, if I have a column with these values entered:
1
2
3
4
I want to be able to calculate the sum to be 10.
I'm not finding a way to do that using the aggregate operator.
Is there a way to do that?
Thanks
Check this process
https://github.com/patilbhupendra/Sample_RapidMiner_Processes/blob/master/add%20value%20in%20a%20column.rmp
Notice that no "group by" is used, so the sum covers the whole column.
Hopefully it helps you.
That helped a lot! Thank you!!