
Weights as 'Global importance of attributes' - method of calculation?

kypexin RapidMiner Certified Analyst, Member Posts: 291 Unicorn
edited June 2019 in Help

Hi rapidminers, 

 

Auto Model has a 'General' tab in its results, which in turn has a 'Weights' section.

 

[Screenshot: 2018-04-10 164629.png]

 

The explanation reads: "Weights: the global importance of each Attribute for the value of the target Attribute, independent of the modeling algorithm."

 

What is 'global importance' in this case? What exact algorithm is used to calculate those weights in the Auto Model process?

Answers

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    tagging @IngoRM

     

    Scott

     

     

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Hi Vladimir,

     

    Auto Model uses the correlation (squared Pearson) between the attributes and the label column.  By the way, the next version will actually change the calculation slightly to better cover nominal values and improve understandability:

     

    1. Auto Model will allow you to open the Weights building process just as it does for the modeling processes.  So you can see exactly how the data is preprocessed and the weights are calculated.
    2. It also will change the preprocessing and weights calculation a bit so that nominal attributes and especially multi-class labels are now better supported.
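
For anyone who wants to check such a weight by hand, the squared Pearson correlation mentioned above can be sketched in a few lines of Python. This is only an illustration of the statistic, not Auto Model's actual code; the function names and the toy columns are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)


def correlation_weights(columns, label):
    """Squared Pearson correlation of every attribute column with the label."""
    return {name: pearson(values, label) ** 2 for name, values in columns.items()}


# Hypothetical toy data, with the label encoded as 0/1.
label = [0, 0, 1, 1]
columns = {
    "perfect": [1.0, 1.0, 2.0, 2.0],    # moves exactly with the label
    "inverse": [5.0, 5.0, 3.0, 3.0],    # moves exactly against the label
    "unrelated": [1.0, 2.0, 1.0, 2.0],  # no linear relation to the label
}
weights = correlation_weights(columns, label)
```

Because the correlation is squared, the sign is dropped: an attribute that moves exactly against the label gets the same weight as one that moves exactly with it.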

    Cheers,

    Ingo

  • kypexin RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    Thanks Ingo, 

     

    So the closest operator to mirror this process would be 'Weight by Correlation', I guess?

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Yes, that's the operator which is used.  I will attach the new, complete processes below as well.  Please note that they differ for numerical and nominal labels, and that those processes have not been published yet in the current version of AM...
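
As a rough reading aid for the process below, here is a small Python sketch of two of its steps: how nominal columns appear to be handled before weighting, and how the final weights are rescaled and sorted. It is an interpretation of the operators in the XML, not an official description. The threshold of 10 distinct values is taken from the Branch condition (`min_examples` = 10), and the function names, the example column and the raw weights are hypothetical.

```python
from collections import Counter


def dummy_code(values, max_values=10):
    """Dummy-code one nominal column, or return None if it should be dropped.

    Assumed reading of the process: columns with max_values or more distinct
    values are removed; the rest become 0/1 columns, with the least frequent
    value used as the comparison group (so it gets no column of its own).
    """
    counts = Counter(values)
    if len(counts) >= max_values:
        return None
    least_common = min(counts, key=counts.get)
    return {
        value: [1.0 if v == value else 0.0 for v in values]
        for value in counts
        if value != least_common
    }


def range_normalize(weights):
    """Rescale weights to [0, 1] and sort them in decreasing order."""
    low, high = min(weights.values()), max(weights.values())
    span = (high - low) or 1.0  # avoid division by zero if all weights are equal
    scaled = {name: (w - low) / span for name, w in weights.items()}
    return sorted(scaled.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example column and raw weights (not real Auto Model output).
sex = ["Male", "Female", "Male", "Male"]
coded = dummy_code(sex)
ranked = range_normalize({"Sex = Male": 0.30, "Age": 0.05, "Passenger Fare": 0.10})
```

With range transformation the strongest attribute always ends up at 1 and the weakest at 0, so the displayed weights are relative to each other rather than raw correlation values.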

     

    Numerical Labels:

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve Data" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Titanic"/>
    <description align="center" color="transparent" colored="false" width="126">Load data.</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Preprocessing" width="90" x="179" y="34">
    <process expanded="true">
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Define Target?" width="90" x="45" y="34">
    <parameter key="select_which" value="2"/>
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Define Target" width="90" x="45" y="34">
    <parameter key="attribute_name" value="Survived"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    <description align="center" color="transparent" colored="false" width="126">Define the target column for the predictive model.</description>
    </operator>
    <connect from_port="input 1" to_op="Define Target" to_port="example set input"/>
    <connect from_op="Define Target" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should define a target column?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Should Discretize?" width="90" x="179" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="discretize_by_bins" compatibility="8.1.001" expanded="true" height="103" name="Binning" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Age"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="range_name_type" value="short"/>
    <description align="center" color="transparent" colored="false" width="126">Discretize by binning (same range per bin).</description>
    </operator>
    <connect from_port="input 1" to_op="Binning" to_port="example set input"/>
    <connect from_op="Binning" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="discretize_by_frequency" compatibility="8.1.001" expanded="true" height="103" name="Frequency" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Age"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="range_name_type" value="short"/>
    <description align="center" color="transparent" colored="false" width="126">Discretize by frequency (same count per bin).</description>
    </operator>
    <connect from_port="input 1" to_op="Frequency" to_port="example set input"/>
    <connect from_op="Frequency" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should discretize numerical target column?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Map Values?" width="90" x="313" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="map" compatibility="8.1.001" expanded="true" height="82" name="Map Values" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <list key="value_mappings"/>
    <description align="center" color="transparent" colored="false" width="126">Map some nominal target values to new values.</description>
    </operator>
    <connect from_port="input 1" to_op="Map Values" to_port="example set input"/>
    <connect from_op="Map Values" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should map nominal values?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Positive Class?" width="90" x="447" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Make sure that target is binary for positive class mapping.</description>
    </operator>
    <operator activated="true" class="remap_binominals" compatibility="8.1.001" expanded="true" height="82" name="Define Positive Class" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="negative_value" value="No"/>
    <parameter key="positive_value" value="Yes"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially define which one should be the positive class.</description>
    </operator>
    <connect from_port="input 1" to_op="Nominal to Binominal" to_port="example set input"/>
    <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Define Positive Class" to_port="example set input"/>
    <connect from_op="Define Positive Class" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should define positive class?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns?" width="90" x="581" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="regular_expression"/>
    <parameter key="regular_expression" value="Name|Ticket Number|Cabin|Life Boat"/>
    <parameter key="invert_selection" value="true"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially remove columns.</description>
    </operator>
    <connect from_port="input 1" to_op="Remove Columns" to_port="example set input"/>
    <connect from_op="Remove Columns" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should remove columns?</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Unify Value Types" width="90" x="715" y="34">
    <process expanded="true">
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Dates" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="date_time"/>
    <parameter key="invert_selection" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Remove all date columns.</description>
    </operator>
    <operator activated="true" class="nominal_to_text" compatibility="8.1.001" expanded="true" height="82" name="Nominal to Text" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Transform all nominal columns to text so that we make sure that all will have polynominal type after the next transformation.</description>
    </operator>
    <operator activated="true" class="text_to_nominal" compatibility="8.1.001" expanded="true" height="82" name="Text to Nominal" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Transform all text columns into polynominal columns.</description>
    </operator>
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="447" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="use_value_type_exception" value="true"/>
    <parameter key="except_value_type" value="integer"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Turn all numerical columns (not integers though) into real columns.</description>
    </operator>
    <connect from_port="in 1" to_op="Remove Dates" to_port="example set input"/>
    <connect from_op="Remove Dates" from_port="example set output" to_op="Nominal to Text" to_port="example set input"/>
    <connect from_op="Nominal to Text" from_port="example set output" to_op="Text to Nominal" to_port="example set input"/>
    <connect from_op="Text to Nominal" from_port="example set output" to_op="Numerical to Real" to_port="example set input"/>
    <connect from_op="Numerical to Real" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Unify all value types</description>
    </operator>
    <connect from_port="in 1" to_op="Define Target?" to_port="input 1"/>
    <connect from_op="Define Target?" from_port="output 1" to_op="Should Discretize?" to_port="input 1"/>
    <connect from_op="Should Discretize?" from_port="output 1" to_op="Map Values?" to_port="input 1"/>
    <connect from_op="Map Values?" from_port="output 1" to_op="Positive Class?" to_port="input 1"/>
    <connect from_op="Positive Class?" from_port="output 1" to_op="Remove Columns?" to_port="input 1"/>
    <connect from_op="Remove Columns?" from_port="output 1" to_op="Unify Value Types" to_port="in 1"/>
    <connect from_op="Unify Value Types" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">All general preprocessing steps happen inside this operator - double click on it to see the details.</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Handle Label?" width="90" x="313" y="34">
    <parameter key="select_which" value="2"/>
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Remove Missing Label Rows" width="90" x="45" y="34">
    <parameter key="condition_class" value="no_missing_labels"/>
    <list key="filters_list"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially remove all rows which have a missing label.</description>
    </operator>
    <connect from_port="input 1" to_op="Remove Missing Label Rows" to_port="example set input"/>
    <connect from_op="Remove Missing Label Rows" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Handle missings in label column?</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Replace Missing Values" width="90" x="447" y="34">
    <process expanded="true">
    <operator activated="true" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Nominal Missings" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="default" value="value"/>
    <list key="columns"/>
    <parameter key="replenishment_value" value="MISSING"/>
    <description align="center" color="transparent" colored="false" width="126">Replace nominal missings with the word 'missing'.</description>
    </operator>
    <operator activated="true" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Pos Infinite Values" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="default" value="missing"/>
    <list key="columns"/>
    <description align="center" color="transparent" colored="false" width="126">Replace positive infinity values by missing.</description>
    </operator>
    <operator activated="true" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Neg Infinite Values" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="default" value="missing"/>
    <list key="columns"/>
    <parameter key="replenish_what" value="negative_infinity"/>
    <description align="center" color="transparent" colored="false" width="126">Replace negative infinity values by missing.</description>
    </operator>
    <operator activated="true" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Numerical Missings" width="90" x="447" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="numeric"/>
    <list key="columns"/>
    <description align="center" color="transparent" colored="false" width="126">Replace numerical missings with the average of the column.</description>
    </operator>
    <connect from_port="in 1" to_op="Replace Nominal Missings" to_port="example set input"/>
    <connect from_op="Replace Nominal Missings" from_port="example set output" to_op="Replace Pos Infinite Values" to_port="example set input"/>
    <connect from_op="Replace Pos Infinite Values" from_port="example set output" to_op="Replace Neg Infinite Values" to_port="example set input"/>
    <connect from_op="Replace Neg Infinite Values" from_port="example set output" to_op="Replace Numerical Missings" to_port="example set input"/>
    <connect from_op="Replace Numerical Missings" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Replace missing values.</description>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="581" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="nominal"/>
    <description align="center" color="transparent" colored="false" width="126">Check if there are any nominal attributes in the data</description>
    </operator>
    <operator activated="true" class="branch" compatibility="8.1.001" expanded="true" height="103" name="Branch (2)" width="90" x="715" y="34">
    <parameter key="condition_type" value="min_attributes"/>
    <parameter key="condition_value" value="1"/>
    <process expanded="true">
    <operator activated="true" class="concurrency:loop_attributes" compatibility="8.1.001" expanded="true" height="82" name="Loop Attributes" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="reuse_results" value="true"/>
    <parameter key="enable_parallel_execution" value="false"/>
    <process expanded="true">
    <operator activated="true" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate" width="90" x="45" y="34">
    <list key="aggregation_attributes"/>
    <parameter key="group_by_attributes" value="%{loop_attribute}"/>
    <description align="center" color="transparent" colored="false" width="126">Create a new data set with one row for each nominal value of the current column (loop).</description>
    </operator>
    <operator activated="true" class="branch" compatibility="8.1.001" expanded="true" height="103" name="Branch" width="90" x="179" y="34">
    <parameter key="condition_type" value="min_examples"/>
    <parameter key="condition_value" value="10"/>
    <process expanded="true">
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="%{loop_attribute}"/>
    <parameter key="invert_selection" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">More than 10 values? Remove current column.</description>
    </operator>
    <connect from_port="input 1" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    <portSpacing port="sink_input 3" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate (2)" width="90" x="45" y="34">
    <list key="aggregation_attributes">
    <parameter key="%{loop_attribute}" value="count"/>
    </list>
    <parameter key="group_by_attributes" value="%{loop_attribute}"/>
    <description align="center" color="transparent" colored="false" width="126">Count number of occurrences for each value.</description>
    </operator>
    <operator activated="true" class="sort" compatibility="8.1.001" expanded="true" height="82" name="Sort" width="90" x="179" y="34">
    <parameter key="attribute_name" value="count(%{loop_attribute})"/>
    <description align="center" color="transparent" colored="false" width="126">Sort counts.</description>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro" width="90" x="313" y="34">
    <parameter key="macro" value="least_common"/>
    <parameter key="macro_type" value="data_value"/>
    <parameter key="attribute_name" value="%{loop_attribute}"/>
    <parameter key="example_index" value="1"/>
    <list key="additional_macros"/>
    <description align="center" color="transparent" colored="false" width="126">Remember value with smallest count.</description>
    </operator>
    <operator activated="true" class="nominal_to_numerical" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Numerical (2)" width="90" x="447" y="136">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="%{loop_attribute}"/>
    <parameter key="use_comparison_groups" value="true"/>
    <list key="comparison_groups">
    <parameter key="%{loop_attribute}" value="%{least_common}"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">Transform to binary using dummy coding and a comparison group for the least frequent value.</description>
    </operator>
    <connect from_port="input 1" to_op="Aggregate (2)" to_port="example set input"/>
    <connect from_op="Aggregate (2)" from_port="example set output" to_op="Sort" to_port="example set input"/>
    <connect from_op="Aggregate (2)" from_port="original" to_op="Nominal to Numerical (2)" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
    <connect from_op="Extract Macro" from_port="example set" to_port="input 2"/>
    <connect from_op="Nominal to Numerical (2)" from_port="example set output" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    <portSpacing port="sink_input 3" spacing="0"/>
    <description align="center" color="yellow" colored="false" height="66" resized="false" width="126" x="40" y="210">Less than 10 values? Transform into binary.</description>
    </process>
    <description align="center" color="transparent" colored="false" width="126">If more than 10, remove column. If less, transform to binary.</description>
    </operator>
    <connect from_port="input 1" to_op="Aggregate" to_port="example set input"/>
    <connect from_op="Aggregate" from_port="example set output" to_op="Branch" to_port="condition"/>
    <connect from_op="Aggregate" from_port="original" to_op="Branch" to_port="input 1"/>
    <connect from_op="Branch" from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Remove nominal columns with too many values, transform the others to binary.</description>
    </operator>
    <connect from_port="input 1" to_op="Loop Attributes" to_port="input 1"/>
    <connect from_op="Loop Attributes" from_port="output 1" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    </process>
    <process expanded="true">
    <connect from_port="input 1" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">If there are nominal attributes, handle them inside</description>
    </operator>
    <operator activated="true" class="remove_useless_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Useless Attributes" width="90" x="849" y="34">
    <description align="center" color="transparent" colored="false" width="126">Remove attributes which might have become useless after the nominal transformation</description>
    </operator>
    <operator activated="true" class="branch" compatibility="8.1.001" expanded="true" height="82" name="Sample based on No Attributes" width="90" x="983" y="34">
    <parameter key="condition_type" value="max_attributes"/>
    <parameter key="condition_value" value="100"/>
    <process expanded="true">
    <operator activated="true" class="sample_stratified" compatibility="8.1.001" expanded="true" height="82" name="Few Atts Sample" width="90" x="45" y="34">
    <parameter key="sample_size" value="50000"/>
    <description align="center" color="transparent" colored="false" width="126">Less than 100 attributes? Then sample down to 50,000 data points.</description>
    </operator>
    <connect from_port="condition" to_op="Few Atts Sample" to_port="example set input"/>
    <connect from_op="Few Atts Sample" from_port="example set output" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="sample_stratified" compatibility="8.1.001" expanded="true" height="82" name="Many Atts Sample" width="90" x="45" y="34">
    <parameter key="sample_size" value="5000"/>
    <description align="center" color="transparent" colored="false" width="126">More than 100 attributes? Then sample down to 5000 data points.</description>
    </operator>
    <connect from_port="condition" to_op="Many Atts Sample" to_port="example set input"/>
    <connect from_op="Many Atts Sample" from_port="example set output" to_port="input 1"/>
    <portSpacing port="source_condition" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_input 1" spacing="0"/>
    <portSpacing port="sink_input 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Sample data down based on the number of attributes.</description>
    </operator>
    <operator activated="true" class="weight_by_correlation" compatibility="8.1.001" expanded="true" height="82" name="Weight by Correlation (2)" width="90" x="1117" y="34">
    <parameter key="normalize_weights" value="true"/>
    <parameter key="sort_weights" value="false"/>
    <parameter key="sort_direction" value="descending"/>
    <description align="center" color="transparent" colored="false" width="126">Weight based on correlation.</description>
    </operator>
    <operator activated="true" class="weights_to_data" compatibility="8.1.001" expanded="true" height="68" name="Weights to Data" width="90" x="1251" y="34">
    <description align="center" color="transparent" colored="false" width="126">Transform the average weights into a data set which can be normalized and sorted.</description>
    </operator>
    <operator activated="true" class="normalize" compatibility="8.1.001" expanded="true" height="103" name="Normalize" width="90" x="1385" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Weight"/>
    <parameter key="method" value="range transformation"/>
    <description align="center" color="transparent" colored="false" width="126">Normalize the weights.</description>
    </operator>
    <operator activated="true" class="sort" compatibility="8.1.001" expanded="true" height="82" name="Sort (2)" width="90" x="1519" y="34">
    <parameter key="attribute_name" value="Weight"/>
    <parameter key="sorting_direction" value="decreasing"/>
    <description align="center" color="transparent" colored="false" width="126">Sort the weights.</description>
    </operator>
    <operator activated="true" class="order_attributes" compatibility="8.1.001" expanded="true" height="82" name="Reorder Attributes" width="90" x="1653" y="34">
    <parameter key="attribute_ordering" value="Attribute|Weight"/>
    <description align="center" color="transparent" colored="false" width="126">Reorder the attributes with attribute name in the first column.</description>
    </operator>
    <connect from_op="Retrieve Data" from_port="output" to_op="Preprocessing" to_port="in 1"/>
    <connect from_op="Preprocessing" from_port="out 1" to_op="Handle Label?" to_port="input 1"/>
    <connect from_op="Handle Label?" from_port="output 1" to_op="Replace Missing Values" to_port="in 1"/>
    <connect from_op="Replace Missing Values" from_port="out 1" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Branch (2)" to_port="condition"/>
    <connect from_op="Select Attributes (2)" from_port="original" to_op="Branch (2)" to_port="input 1"/>
    <connect from_op="Branch (2)" from_port="input 1" to_op="Remove Useless Attributes" to_port="example set input"/>
    <connect from_op="Remove Useless Attributes" from_port="example set output" to_op="Sample based on No Attributes" to_port="condition"/>
    <connect from_op="Sample based on No Attributes" from_port="input 1" to_op="Weight by Correlation (2)" to_port="example set"/>
    <connect from_op="Weight by Correlation (2)" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
    <connect from_op="Weights to Data" from_port="example set" to_op="Normalize" to_port="example set input"/>
    <connect from_op="Normalize" from_port="example set output" to_op="Sort (2)" to_port="example set input"/>
    <connect from_op="Sort (2)" from_port="example set output" to_op="Reorder Attributes" to_port="example set input"/>
    <connect from_op="Reorder Attributes" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
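
For readers who prefer code to operator chains, the last steps of the process above (Weight by Correlation, Normalize with range transformation, Sort decreasing) can be approximated outside RapidMiner. This is only a rough sketch under assumptions, not Auto Model's actual implementation: the function names, the `squared` flag, and the toy data are invented for illustration, and the preprocessing and sampling steps are left out.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_weights(columns, label, squared=True):
    """Weight each column by its correlation with the label.

    `squared` is a hypothetical switch: True uses r^2 (the squared
    Pearson correlation), False uses |r|.
    """
    raw = {}
    for name, values in columns.items():
        r = pearson(values, label)
        raw[name] = r * r if squared else abs(r)

    # Range transformation: rescale the raw weights to [0, 1].
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    scaled = {name: (w - lo) / span if span else 0.0 for name, w in raw.items()}

    # Sort by weight, largest first.
    return sorted(scaled.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for this example only.
columns = {
    "a": [1, 2, 3, 4],  # rises with the label
    "b": [1, 0, 1, 0],  # weakly related to the label
    "c": [4, 3, 2, 1],  # falls as the label rises
}
label = [2, 4, 6, 8]

weights = correlation_weights(columns, label)
for name, weight in weights:
    print(f"{name}\t{weight:.3f}")
```

Note that a perfectly negative correlation ("c") gets the same weight as a perfectly positive one ("a"), because only the strength of the relationship counts, not its direction.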

    Nominal:

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve Data" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Titanic"/>
    <description align="center" color="transparent" colored="false" width="126">Load data.</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Preprocessing" width="90" x="179" y="34">
    <process expanded="true">
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Define Target?" width="90" x="45" y="34">
    <parameter key="select_which" value="2"/>
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Define Target" width="90" x="45" y="34">
    <parameter key="attribute_name" value="Survived"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    <description align="center" color="transparent" colored="false" width="126">Define the target column for the predictive model.</description>
    </operator>
    <connect from_port="input 1" to_op="Define Target" to_port="example set input"/>
    <connect from_op="Define Target" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should define a target column?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Should Discretize?" width="90" x="179" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="discretize_by_bins" compatibility="8.1.001" expanded="true" height="103" name="Binning" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Age"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="range_name_type" value="short"/>
    <description align="center" color="transparent" colored="false" width="126">Discretize by binning (same range per bin).</description>
    </operator>
    <connect from_port="input 1" to_op="Binning" to_port="example set input"/>
    <connect from_op="Binning" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="discretize_by_frequency" compatibility="8.1.001" expanded="true" height="103" name="Frequency" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Age"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="range_name_type" value="short"/>
    <description align="center" color="transparent" colored="false" width="126">Discretize by frequency (same count per bin).</description>
    </operator>
    <connect from_port="input 1" to_op="Frequency" to_port="example set input"/>
    <connect from_op="Frequency" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should discretize numerical target column?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Map Values?" width="90" x="313" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="map" compatibility="8.1.001" expanded="true" height="82" name="Map Values" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <list key="value_mappings"/>
    <description align="center" color="transparent" colored="false" width="126">Map some nominal target values to new values.</description>
    </operator>
    <connect from_port="input 1" to_op="Map Values" to_port="example set input"/>
    <connect from_op="Map Values" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should map nominal values?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Positive Class?" width="90" x="447" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Make sure that target is binary for positive class mapping.</description>
    </operator>
    <operator activated="true" class="remap_binominals" compatibility="8.1.001" expanded="true" height="82" name="Define Positive Class" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Survived"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="negative_value" value="No"/>
    <parameter key="positive_value" value="Yes"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially define which one should be the positive class.</description>
    </operator>
    <connect from_port="input 1" to_op="Nominal to Binominal" to_port="example set input"/>
    <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Define Positive Class" to_port="example set input"/>
    <connect from_op="Define Positive Class" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should define positive class?</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns?" width="90" x="581" y="34">
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="regular_expression"/>
    <parameter key="regular_expression" value="Name|Ticket Number|Cabin|Life Boat"/>
    <parameter key="invert_selection" value="true"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially remove columns.</description>
    </operator>
    <connect from_port="input 1" to_op="Remove Columns" to_port="example set input"/>
    <connect from_op="Remove Columns" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Should remove columns?</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Unify Value Types" width="90" x="715" y="34">
    <process expanded="true">
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Dates" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="date_time"/>
    <parameter key="invert_selection" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Remove all date columns.</description>
    </operator>
    <operator activated="true" class="nominal_to_text" compatibility="8.1.001" expanded="true" height="82" name="Nominal to Text" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Transform all nominal columns to text so that we make sure that all will have polynominal type after the next transformation.</description>
    </operator>
    <operator activated="true" class="text_to_nominal" compatibility="8.1.001" expanded="true" height="82" name="Text to Nominal" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Transform all text columns into polynominal columns.</description>
    </operator>
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="447" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="use_value_type_exception" value="true"/>
    <parameter key="except_value_type" value="integer"/>
    <parameter key="include_special_attributes" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Turn all numerical columns (not integers though) into real columns.</description>
    </operator>
    <connect from_port="in 1" to_op="Remove Dates" to_port="example set input"/>
    <connect from_op="Remove Dates" from_port="example set output" to_op="Nominal to Text" to_port="example set input"/>
    <connect from_op="Nominal to Text" from_port="example set output" to_op="Text to Nominal" to_port="example set input"/>
    <connect from_op="Text to Nominal" from_port="example set output" to_op="Numerical to Real" to_port="example set input"/>
    <connect from_op="Numerical to Real" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Unify all value types</description>
    </operator>
    <connect from_port="in 1" to_op="Define Target?" to_port="input 1"/>
    <connect from_op="Define Target?" from_port="output 1" to_op="Should Discretize?" to_port="input 1"/>
    <connect from_op="Should Discretize?" from_port="output 1" to_op="Map Values?" to_port="input 1"/>
    <connect from_op="Map Values?" from_port="output 1" to_op="Positive Class?" to_port="input 1"/>
    <connect from_op="Positive Class?" from_port="output 1" to_op="Remove Columns?" to_port="input 1"/>
    <connect from_op="Remove Columns?" from_port="output 1" to_op="Unify Value Types" to_port="in 1"/>
    <connect from_op="Unify Value Types" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">All general preprocessing steps happen inside this operator - double click on it to see the details.</description>
    </operator>
    <operator activated="true" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Handle Label?" width="90" x="313" y="34">
    <parameter key="select_which" value="2"/>
    <process expanded="true">
    <connect from_port="input 1" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Remove Missing Label Rows" width="90" x="45" y="34">
    <parameter key="condition_class" value="no_missing_labels"/>
    <list key="filters_list"/>
    <description align="center" color="transparent" colored="false" width="126">Potentially remove all rows which have a missing label.</description>
    </operator>
    <connect from_port="input 1" to_op="Remove Missing Label Rows" to_port="example set input"/>
    <connect from_op="Remove Missing Label Rows" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Handle missings in label column?</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Replace Missing Values" width="90" x="447" y="34">
    <process expanded="true">
    <operator activated="true" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Nominal Missings" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="default" value="value"/>
    <list key="columns"/>
    <parameter key="replenishment_value" value="MISSING"/>
    <description align="center" color="transparent" colored="false" width="126">Replace nominal missings with the word 'MISSING'.</description>
    </operator>
    <operator activated="true" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Pos Infinite Values" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="default" value="missing"/>
    <list key="columns"/>
    <description align="center" color="transparent" colored="false" width="126">Replace positive infinity values by missing.</description>
    </operator>
    <operator activated="true" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Neg Infinite Values" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="default" value="missing"/>
    <list key="columns"/>
    <parameter key="replenish_what" value="negative_infinity"/>
    <description align="center" color="transparent" colored="false" width="126">Replace negative infinity values by missing.</description>
    </operator>
    <operator activated="true" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Numerical Missings" width="90" x="447" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="numeric"/>
    <list key="columns"/>
    <description align="center" color="transparent" colored="false" width="126">Replace numerical missings with the average of the column.</description>
    </operator>
    <connect from_port="in 1" to_op="Replace Nominal Missings" to_port="example set input"/>
    <connect from_op="Replace Nominal Missings" from_port="example set output" to_op="Replace Pos Infinite Values" to_port="example set input"/>
    <connect from_op="Replace Pos Infinite Values" from_port="example set output" to_op="Replace Neg Infinite Values" to_port="example set input"/>
    <connect from_op="Replace Neg Infinite Values" from_port="example set output" to_op="Replace Numerical Missings" to_port="example set input"/>
    <connect from_op="Replace Numerical Missings" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Replace missing values.</description>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="581" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="nominal"/>
    <description align="center" color="transparent" colored="false" width="126">Check if there are any nominal attributes in the data</description>
    </operator>
    <operator activated="true" class="branch" compatibility="8.1.001" expanded="true" height="103" name="Branch (2)" width="90" x="715" y="34">
    <parameter key="condition_type" value="min_attributes"/>
    <parameter key="condition_value" value="1"/>
    <process expanded="true">
    <operator activated="true" class="concurrency:loop_attributes" compatibility="8.1.001" expanded="true" height="82" name="Loop Attributes" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="value_type" value="
  • kypexinkypexin RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    Thanks Ingo, pretty helpful (and impressive, too!) :)
