"Remove correlated attributes delivers strange result"
Hi there,
I am (unfortunately) not an expert in correlation calculations, but the result of this sample process seems strange to me.
First I run the process as is. --> The result includes all attributes.
Then I change the parameter "filter relation" to "less". --> The result still includes att1.
As I understand it, att1 can have a correlation either greater than 0.9 or less than 0.9, but it cannot appear in both results...
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
    <process expanded="true" height="512" width="640">
      <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
      <operator activated="true" class="remove_correlated_attributes" compatibility="5.2.003" expanded="true" height="76" name="Remove Correlated Attributes" width="90" x="179" y="30">
        <parameter key="correlation" value="0.9"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Remove Correlated Attributes" to_port="example set input"/>
      <connect from_op="Remove Correlated Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Best regards
Sachs
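For reference, the second run described in the post differs from the posted process only in the parameters of the Remove Correlated Attributes operator. A minimal sketch of that operator element follows. It assumes the GUI parameter "filter relation" is stored under the key filter_relation. That key name is an assumption, since the posted XML only shows the correlation parameter.

```xml
<operator activated="true" class="remove_correlated_attributes" compatibility="5.2.003" expanded="true" height="76" name="Remove Correlated Attributes" width="90" x="179" y="30">
  <!-- threshold unchanged from the posted process -->
  <parameter key="correlation" value="0.9"/>
  <!-- assumed key for the "filter relation" parameter, set to "less" as in the second run -->
  <parameter key="filter_relation" value="less"/>
</operator>
```

Swapping this element into the posted process in place of the original operator should reproduce the second run, provided the key name matches what the installed version actually writes.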
Answers
I just double-checked on another machine. It works perfectly fine... except that I have no clue what's happening in the background (as described above)...
Odd, I ran twice before posting, but now it works as you say; you'd expect random data to get cleaned out, but it doesn't. The reason for this is noted in the help... Err, yes, well
Being able to read helps a lot... stupid me...
Though I have to admit that I don't fully understand the explanation. So in the end I am not able to use this operator, as I don't know in which cases attributes are removed correctly.
I am wondering, then, what the intended use scenario is?
Anyway, thanks a lot
Sachs
Agreed, the explanation is a bit obscure, but at least there is one. On the other hand, bear in mind:
1. It doesn't remove attributes falsely; it just may not remove all of them, so some correlated attributes may remain.
2. The use scenario may be different from mining random data for 90% correlation!
3. There are alternative dimension reducers.
4. Some learners, like SVMs, handle high dimensionality rather well.
Best
H
I have a problem with the 'Remove Correlated Attributes' operator.
I have 91 attributes. This operator removes only 5 attributes from my dataset, even when I set the threshold parameter to 0.01.
What should I do?
Hello @mosiomohsen - I guess my first question is whether you are certain that you have truly independent variables. Perhaps they really are correlated?
If not, I'd recommend posting your XML process here (see "Read Before Posting" on right when you reply) and attach your dataset. This way we can replicate what you're doing and help you better.
Scott