
"Correlation Matrix"

sim Member Posts: 18 Learner I
edited May 2019 in Help
I am trying to generate a correlation matrix for some data. However, the results do not include a correlation matrix, but rather a table with two columns, with all of the attributes listed in only one of them. I have used the "nominal to binomial", "correlation matrix" and "select weights" operators.
Do you know what I am doing wrong?


Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    If you can post your XML it would be easier to troubleshoot :-)
    But from your description, it sounds like you might be using Weight by Correlation, which only looks at the correlation between attributes and the defined label.  If you want the full correlation matrix you need to use the Correlation Matrix operator instead.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
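
To illustrate the difference described above, here is a minimal Python sketch (not RapidMiner code). It assumes a small hypothetical numeric data set and uses the absolute Pearson correlation as the weight, which is an assumption about how such a weighting could work. The first function gives one value per attribute (the two-column table), the second gives the full pairwise matrix.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def label_weights(data, label):
    """One weight per attribute: its correlation with the label only."""
    return {
        name: abs(pearson(column, data[label]))
        for name, column in data.items()
        if name != label
    }


def correlation_matrix(data):
    """Correlation of every attribute with every other attribute."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


# Hypothetical example values, made up for illustration.
data = {
    "temperature": [85.0, 80.0, 83.0, 70.0, 68.0, 65.0],
    "humidity": [85.0, 90.0, 78.0, 96.0, 80.0, 70.0],
    "play": [0.0, 0.0, 1.0, 1.0, 1.0, 0.0],
}

weights = label_weights(data, "play")   # two columns: attribute, weight
matrix = correlation_matrix(data)       # square table: attribute x attribute

for name, weight in weights.items():
    print(name, round(weight, 3))
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

The weights table has one row per non-label attribute, while the matrix has one row and one column per attribute, with 1.0 on the diagonal.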
  • sim Member Posts: 18 Learner I
    Hi Telcontar120, 
    Thank you for such a quick response! I have removed the "select weights" operator, but am still facing the same problem. I would upload the XML file, but I don't know how to (I'm new to RapidMiner), sorry!

    Do you know if there's anything else that I can try?

  • sim Member Posts: 18 Learner I
    Thank you mschmitz!!! That definitely helped!! I now have my results in the form of a correlation table.
    All of the categories within my attributes are now listed as individual attributes. Is there any way for this to be adjusted? 
    Thank you once again! 
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,
    Pearson correlation is not defined for nominal types, so they can't be included directly.

    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
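
As a rough illustration of why each category shows up as its own attribute, here is a minimal Python sketch of dummy (one-hot) coding, which is the general idea behind turning a nominal attribute into 0/1 columns. The attribute name, the values, and the `name = category` column naming are assumptions made up for this example.

```python
def one_hot(name, values):
    """Turn one nominal column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {
        f"{name} = {category}": [1.0 if v == category else 0.0 for v in values]
        for category in categories
    }


# Hypothetical nominal attribute with three categories.
outlook = ["sunny", "overcast", "rain", "sunny", "rain"]
encoded = one_hot("Outlook", outlook)

for column_name, column in encoded.items():
    print(column_name, column)
```

Because a numeric correlation needs numbers, every category becomes a separate numeric column, and each of those columns then gets its own row and column in the correlation table.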
  • sim Member Posts: 18 Learner I
    Is there an operator that can be used to convert the data so it can be included?
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Well, I would use a measure that can handle nominal data, e.g. Weight by Gini Index.
  • sim Member Posts: 18 Learner I
    Does Weight by Gini Index convert the data?
  • sim Member Posts: 18 Learner I
    Hi, Weight by Gini Index did not work for me. Is there anything else that I can use?
  • sim Member Posts: 18 Learner I
    edited January 2019
    Hi mschmitz, hope you're well!
    Just wondering if there was an update?
  • sim Member Posts: 18 Learner I
    Hi Martin, 

    Just wondering if you've seen my above message?

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi @sim ,
    I would go for something like the attached process. But please keep in mind that the resulting values are not necessarily normalized the way a correlation is.
    <?xml version="1.0" encoding="UTF-8"?><process version="9.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.1.000" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="85">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="9.1.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
            <parameter key="attribute_filter_type" value="value_type"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="nominal"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="9.1.000" expanded="true" height="82" name="Set Role" width="90" x="514" y="85">
            <parameter key="attribute_name" value="Play"/>
            <parameter key="target_role" value="regular"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="concurrency:loop_attributes" compatibility="9.1.000" expanded="true" height="82" name="Loop Attributes" width="90" x="715" y="85">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="attribute_name_macro" value="loop_attribute"/>
            <parameter key="reuse_results" value="false"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="set_role" compatibility="9.1.000" expanded="true" height="82" name="Set Role (2)" width="90" x="112" y="34">
                <parameter key="attribute_name" value="%{loop_attribute}"/>
                <parameter key="target_role" value="label"/>
                <list key="set_additional_roles"/>
              </operator>
              <operator activated="true" class="weight_by_information_gain" compatibility="9.1.000" expanded="true" height="82" name="Weight by Information Gain" width="90" x="380" y="34">
                <parameter key="normalize_weights" value="false"/>
                <parameter key="sort_weights" value="true"/>
                <parameter key="sort_direction" value="ascending"/>
              </operator>
              <operator activated="false" class="weight_by_gini_index" compatibility="9.1.000" expanded="true" height="82" name="Weight by Gini Index" width="90" x="380" y="136">
                <parameter key="normalize_weights" value="false"/>
                <parameter key="sort_weights" value="true"/>
                <parameter key="sort_direction" value="ascending"/>
              </operator>
              <operator activated="true" class="weights_to_data" compatibility="9.1.000" expanded="true" height="68" name="Weights to Data" width="90" x="715" y="34"/>
              <operator activated="true" class="generate_attributes" compatibility="9.1.000" expanded="true" height="82" name="Generate Attributes" width="90" x="916" y="34">
                <list key="function_descriptions">
                  <parameter key="label" value="%{loop_attribute}"/>
                </list>
                <parameter key="keep_all" value="true"/>
              </operator>
              <connect from_port="input 1" to_op="Set Role (2)" to_port="example set input"/>
              <connect from_op="Set Role (2)" from_port="example set output" to_op="Weight by Information Gain" to_port="example set"/>
              <connect from_op="Weight by Information Gain" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
              <connect from_op="Weights to Data" from_port="example set" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="9.1.000" expanded="true" height="82" name="Append" width="90" x="916" y="85">
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
            <parameter key="merge_type" value="all"/>
          </operator>
          <operator activated="true" class="blending:pivot" compatibility="9.1.000" expanded="true" height="82" name="Pivot" width="90" x="1050" y="85">
            <parameter key="group_by_attributes" value="label"/>
            <parameter key="column_grouping_attribute" value="Attribute"/>
            <list key="aggregation_attributes">
              <parameter key="Weight" value="average"/>
            </list>
            <parameter key="use_default_aggregation" value="false"/>
            <parameter key="default_aggregation_function" value="first"/>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Loop Attributes" to_port="input 1"/>
          <connect from_op="Loop Attributes" from_port="output 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_op="Pivot" to_port="input"/>
          <connect from_op="Pivot" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
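
For readers who want to see the idea of the process outside RapidMiner, here is a minimal Python sketch of the same loop: each nominal attribute takes a turn as the label, the information gain of every other attribute with respect to it is computed, and the results are arranged as a label-by-attribute table. The data values are made up for illustration (they are not the Golf sample), and the textbook entropy-based information gain used here is an assumption; the operator's exact computation may differ.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of nominal values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(attribute, label):
    """Entropy of the label minus its entropy after splitting on the attribute."""
    n = len(label)
    groups = {}
    for a, y in zip(attribute, label):
        groups.setdefault(a, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def pairwise_information_gain(data):
    """Rows: attribute used as label. Columns: the remaining attributes."""
    return {
        label: {
            attr: information_gain(data[attr], data[label])
            for attr in data
            if attr != label
        }
        for label in data
    }


# Hypothetical nominal data, made up for this example.
data = {
    "Outlook": ["sunny", "sunny", "overcast", "rain", "rain", "overcast"],
    "Wind": ["true", "false", "false", "false", "true", "true"],
    "Play": ["no", "no", "yes", "yes", "no", "yes"],
}

table = pairwise_information_gain(data)
for label, row in table.items():
    print(label, {attr: round(value, 3) for attr, value in row.items()})
```

The values are in bits and are bounded by the entropy of the label rather than lying between -1 and 1, which is why the result should not be read as a normalized correlation.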

