Using dynamic macro variables based on column content
Dear all,
I'd like to know if there is a possibility to use macros by refering to them via column content. The selection, which predefined macro shall be used, is depending on the content of a specific column.
A simplified example
Input table
country | base_value | result_value |
---|---|---|
GER | 10 | ? |
USA | 5 | ? |
Available macro variables
m_multiplier_GER: 2
m_multiplier_USA: 3
I want to calculate result_value by multiplying the base_value with the value of the macro matching the country in its very own name.
Expected result
country | base_value | result_value |
---|---|---|
GER | 10 | 20 |
USA | 5 | 15 |
In an "Generate Attributes" operator, I'd need something like [result_value= base_value * %{m_multiplier_country@}] where "@" represents some sort of escape character to be able to include the column content in the macro name.
My current solution is to put the macro information in a table and merge it via the country column. However this produces a lot of duplicated information in my table, which I have to carry through my process and remove afterwards. The process contains of multiple steps that each need different macro information each.<\p>
I hope to find a more elegant solution that allows me to use the macros directly.
Thanks for the help
Denis Wolfrath
Best Answer
-
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @denis_wolfrath,
To improve my knowledge on macros and as a personnal challenge, here the process
using macros to perform your task (thanks to the ressource of @mschmitz How to Use Macros). It's far-fetched but it works :
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
<parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Dynamic_Macros.xlsx"/>
<parameter key="imported_cell_range" value="A1:C3"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="Country.true.polynominal.attribute"/>
<parameter key="1" value="base_value.true.integer.attribute"/>
<parameter key="2" value="result value.true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="set_macros" compatibility="8.0.001" expanded="true" height="82" name="Set Macros (2)" width="90" x="179" y="34">
<list key="macros">
<parameter key="country_macro" value="Country"/>
<parameter key="m_multiplier_GER" value="2"/>
<parameter key="m_multiplier_USA" value="3"/>
</list>
</operator>
<operator activated="true" class="select_attributes" compatibility="8.0.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attribute" value="base_value"/>
<parameter key="attributes" value="Country|base_value"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes (6)" width="90" x="447" y="34">
<list key="function_descriptions">
<parameter key="result_value" value="eval(concat(str(base_value),"*",eval((concat("%{",concat("m_multiplier_",eval(%{country_macro})),"}")),NOMINAL)),REAL)"/>
</list>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Set Macros (2)" to_port="through 1"/>
<connect from_op="Set Macros (2)" from_port="through 1" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_op="Generate Attributes (6)" to_port="example set input"/>
<connect from_op="Generate Attributes (6)" from_port="example set output" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>Here your fictive example set :
https://drive.google.com/open?id=1KdToSAP1j7Q5lwLTlLvhBRLgitQ2BUh_
Best regards,
Lionel
3
Answers
Hi Denis,
my way of solving it would be a table with the multiplies. Like
Country multiplier
USA 3
Germany 5
...
you can join this table on your data and then simply do the calculations without any macros needed.
Best,
Martin
Dortmund, Germany
@lionelderkrikor
many thanks, your way solves my problem perfectly.
Using concat to build the macro statements during execution opens even more posibilities.
Kind regards
Denis Wolfrath
Hi @denis_wolfrath,
this indeed gives you even more flexibility if you want to. It even allows you to use different functions for different values etc. Note that this might slow down the execution, since some of our internal optimizers cannot work on it. Since Generate Attributes is usually running only seconds it should not make a huge difference though.
Best,
Martin
Dortmund, Germany