Normalized mutual information matrix
Good evening, I tried to calculate a normalized Mutual Information Matrix by first passing my data through a Normalize operator set to min-max (range 0-1), as follows:
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_csv" compatibility="7.5.001" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
<parameter key="csv_file" value="C:\Users\ThomasOtt\Downloads\AccXYZ.csv"/>
<parameter key="column_separators" value=","/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information"/>
</operator>
<operator activated="true" class="k_means" compatibility="7.5.001" expanded="true" height="82" name="Clustering" width="90" x="179" y="34">
<parameter key="k" value="5"/>
</operator>
<operator activated="true" class="extract_prototypes" compatibility="7.5.001" expanded="true" height="82" name="Extract Cluster Prototypes" width="90" x="313" y="34"/>
<operator activated="true" class="mututal_information_matrix" compatibility="7.5.001" expanded="true" height="82" name="Mutual Information Matrix" width="90" x="447" y="34"/>
<connect from_op="Read CSV" from_port="output" to_op="Clustering" to_port="example set"/>
<connect from_op="Clustering" from_port="cluster model" to_op="Extract Cluster Prototypes" to_port="model"/>
<connect from_op="Extract Cluster Prototypes" from_port="example set" to_op="Mutual Information Matrix" to_port="example set"/>
<connect from_op="Mutual Information Matrix" from_port="example set" to_port="result 1"/>
<connect from_op="Mutual Information Matrix" from_port="matrix" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
The output is sent to the Mutual Information Matrix operator; however, the values in the resulting matrix are not normalized to the 0-1 range. What am I missing?
Thanks in advance.
Answers
Hello Mp95,
I was unable to import your XML file. The problem is probably due to the line:
Maerkli
Thanks for your time. As I'm new to the software, I pasted the wrong XML for the project; the correct one is as follows:
I've also attached the dataset, which is in the public domain anyway.
The normalization of the input data works as planned (I checked the results). However, when I feed the normalized data into the Mutual Information Matrix operator, the output does not lie in the 0-1 range. Instead, I get the same matrix I would have gotten without any normalization. Am I missing something?
Thanks
cc'ing @Thomas_Ott as this looks like one of his processes
[Note from moderator: I HIGHLY recommend upgrading your RapidMiner Studio from 7.5 to the current version!]
Scott
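Good day Mp95,
Thanks for posting another XML file: I am now able to reproduce your RapidMiner project. As far as I understand, mutual information is not bounded to the range 0-1. It is non-negative and has no fixed upper bound (for discrete attributes it is at most the smaller of the two attributes' entropies). The RapidMiner 8 operator reference states: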
"Mutual information is one of many quantities that measures how much one attribute tells us about another. It is a dimensionless quantity, and can be thought of as the reduction in uncertainty about one attribute given the knowledge of another. High mutual information indicates a large reduction in uncertainty; low mutual information indicates a small reduction; and zero mutual information between two attribute means the variables are independent."
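To illustrate why min-max normalization leaves the matrix unchanged, here is a minimal Python sketch (not RapidMiner code). It assumes that numeric attributes are discretized into equal-width bins before mutual information is computed; this is an assumption about how the operator works and is not confirmed in this thread. The helper names `min_max`, `equal_width_bins`, and `mutual_information`, and the toy data, are made up for the example.

```python
import math
from collections import Counter


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Toy data (hypothetical): y depends on x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [v * v for v in x]

raw_bins_x, raw_bins_y = equal_width_bins(x, 4), equal_width_bins(y, 4)
scaled_bins_x = equal_width_bins(min_max(x), 4)
scaled_bins_y = equal_width_bins(min_max(y), 4)

mi_raw = mutual_information(raw_bins_x, raw_bins_y)
mi_scaled = mutual_information(scaled_bins_x, scaled_bins_y)
print(mi_raw, mi_scaled)
```

Min-max scaling is a linear rescaling, so each value falls into the same equal-width bin before and after scaling, and the mutual information is identical. The result also exceeds 1 bit here, which shows that rescaling the inputs does not bound the output to 0-1.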
If you look at the variable Orientation in both the Correlation Matrix and the Mutual Information Matrix, you can see that it is almost uncorrelated with the other variables.
Please take my response with caution, as I am not a data scientist.
Maerkli.
@sgenzer I don't remember this process; it could be a leftover from another process.
Hello
I want to use NMI (normalized mutual information) to evaluate clusters. Does the Mutual Information Matrix operator calculate the same NMI?
Does anyone have a typical process?
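For reference, NMI as a cluster-evaluation measure compares two labelings of the same examples (for instance cluster assignments against known classes), whereas the matrix discussed earlier in this thread holds unnormalized mutual information between pairs of attributes. Below is a minimal Python sketch (not a RapidMiner process) using one common normalization, 2·I(U;V) / (H(U) + H(V)). Other normalizations exist, so treat this choice as an assumption; the function names and the toy labels are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of a discrete label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(us, vs):
    """Mutual information in bits between two label sequences."""
    n = len(us)
    pu, pv, puv = Counter(us), Counter(vs), Counter(zip(us, vs))
    return sum(
        (c / n) * math.log2(c * n / (pu[u] * pv[v]))
        for (u, v), c in puv.items()
    )


def nmi(us, vs):
    """NMI with arithmetic-mean normalization: 2*I(U;V) / (H(U) + H(V))."""
    hu, hv = entropy(us), entropy(vs)
    if hu + hv == 0:
        return 1.0  # both labelings put every example in a single group
    return 2 * mutual_information(us, vs) / (hu + hv)


# Toy labels (hypothetical): reference classes and two candidate clusterings.
truth = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 0, 1, 1, 1]  # same grouping, different label names
shuffled = [0, 1, 0, 1, 0, 1]  # grouping nearly unrelated to the classes

print(nmi(truth, clusters))
print(nmi(truth, shuffled))
```

With this normalization the score lies between 0 and 1: a clustering that reproduces the reference grouping scores 1 regardless of how the clusters are named, and a nearly unrelated clustering scores close to 0.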
Thank you
With respect
Hello,
Could you please rephrase your question?
Thanks,
Maerkli