Multiple Labels for Binary Classification Problems in one model
rodienne_zammit
Member Posts: 3 Contributor I
Hello,
@sgenzer, I read them to the end!
Current approach: Separate models for different binary labels
I have built a decision tree model that correctly predicts a binary label for product A when product A is used as the label.
For product B, I re-run the process to train a similar model with B as the label.
Then I need to train another model to predict product C, using C as the label. This continues, since in reality I have more products.
Desired approach: One model to predict different binary labels
Is there a way I can combine this into one model so that the model can tell me the binary predictions (true/false) for Product A, B and C in one go? This would be ideal when applying the model on new data so that I don't need to run all separate product models.
I tried to use "loop label", but it loops over the labels to create separate models, and I could not find a way to apply the resulting models to new data. I also could not find a way to loop over the labels on new data with something like a "loop model" operator (this doesn't exist).
Maybe I could achieve this by combining the different binary classification values into one value?
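One way to read this idea, sketched outside RapidMiner in plain Python: encode the true/false values of all products as a single nominal class value, train one multi-class model on it, and decode its prediction back into one value per product. The product names and the bit-string encoding below are assumptions made only for illustration.

```python
PRODUCTS = ["A", "B", "C"]  # hypothetical product label columns


def combine_labels(row):
    """Encode one row's binary product labels as a single class value, e.g. '101'."""
    return "".join("1" if row[p] else "0" for p in PRODUCTS)


def split_label(combined):
    """Decode a combined class value back into one true/false value per product."""
    return {p: bit == "1" for p, bit in zip(PRODUCTS, combined)}


rows = [
    {"A": True, "B": False, "C": True},
    {"A": False, "B": False, "C": True},
]

combined = [combine_labels(r) for r in rows]  # ['101', '001']
decoded = [split_label(c) for c in combined]

# The number of possible combined classes doubles with every product added.
n_classes = 2 ** len(PRODUCTS)  # 8 for three products
```

The trade-off is that the combined label can take up to 2^n distinct values for n products, so many combinations may have few or no training examples once the product list grows.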
I would appreciate feedback on how best to approach this problem.
Thank you!
Best Answer
rodienne_zammit Member Posts: 3 Contributor I
Thanks a lot @mschmitz for putting me on the right track. I looked into Polynominal by Binominal classification but I didn't manage to get what I want with it.
There might be other ways of doing this, but I got the desired approach by looping over the product attributes with "Loop Attributes", which exposes the name of the current attribute as a macro. Inside the loop I set the attribute %{loop_attribute} as the label and saved the model with the product name in the output file name, for example "C:\Documents\%{loop_attribute}.mod". I also applied the "Annotate" operator to the performance and model outputs, so that the annotation in the results tells me which product each performance relates to.
To read and apply the models on new data, I again used "Loop Attributes", set the role of the product inside the loop, and read the model from file using the macro value %{loop_attribute}. Again, applying Annotate to the performance output helps me recognise which performance I am looking at.
XML sample for reading below:
<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Retrieve Products" width="90" x="112" y="34">
        <parameter key="repository_entry" value="//Samples/data/Products"/>
      </operator>
      <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.000" expanded="true" height="103" name="Loop Attributes" width="90" x="313" y="34">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="Product ID"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="attribute_name_macro" value="loop_attribute"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="set_role" compatibility="9.2.000" expanded="true" height="82" name="Set Role" width="90" x="45" y="34">
            <parameter key="attribute_name" value="%{loop_attribute}"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="legacy:read_model" compatibility="9.2.000" expanded="true" height="68" name="Read Model" width="90" x="45" y="136">
            <parameter key="model_file" value="%{loop_attribute}_NewFeatures.mod"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="9.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="179" y="85">
            <list key="application_parameters"/>
            <parameter key="create_view" value="false"/>
          </operator>
          <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate" width="90" x="313" y="85">
            <list key="annotations">
              <parameter key="Product" value="%{loop_attribute}"/>
            </list>
            <parameter key="duplicate_annotations" value="overwrite"/>
          </operator>
          <operator activated="true" class="performance_binominal_classification" compatibility="9.2.000" expanded="true" height="82" name="Performance (Test Set)" width="90" x="447" y="85">
            <parameter key="main_criterion" value="first"/>
            <parameter key="accuracy" value="true"/>
            <parameter key="classification_error" value="false"/>
            <parameter key="kappa" value="false"/>
            <parameter key="AUC (optimistic)" value="false"/>
            <parameter key="AUC" value="false"/>
            <parameter key="AUC (pessimistic)" value="false"/>
            <parameter key="precision" value="false"/>
            <parameter key="recall" value="false"/>
            <parameter key="lift" value="false"/>
            <parameter key="fallout" value="false"/>
            <parameter key="f_measure" value="false"/>
            <parameter key="false_positive" value="false"/>
            <parameter key="false_negative" value="false"/>
            <parameter key="true_positive" value="false"/>
            <parameter key="true_negative" value="false"/>
            <parameter key="sensitivity" value="false"/>
            <parameter key="specificity" value="false"/>
            <parameter key="youden" value="false"/>
            <parameter key="positive_predictive_value" value="false"/>
            <parameter key="negative_predictive_value" value="false"/>
            <parameter key="psep" value="false"/>
            <parameter key="skip_undefined_labels" value="true"/>
            <parameter key="use_example_weights" value="true"/>
          </operator>
          <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate (2)" width="90" x="581" y="34">
            <list key="annotations">
              <parameter key="Product" value="%{loop_attribute}"/>
            </list>
            <parameter key="duplicate_annotations" value="overwrite"/>
          </operator>
          <connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Read Model" from_port="output" to_op="Apply Model (2)" to_port="model"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Annotate" to_port="input"/>
          <connect from_op="Annotate" from_port="output" to_op="Performance (Test Set)" to_port="labelled data"/>
          <connect from_op="Performance (Test Set)" from_port="performance" to_op="Annotate (2)" to_port="input"/>
          <connect from_op="Performance (Test Set)" from_port="example set" to_port="output 2"/>
          <connect from_op="Annotate (2)" from_port="output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
          <portSpacing port="sink_output 3" spacing="0"/>
        </process>
        <description align="center" color="transparent" colored="false" width="126">I looped on attribute because all my products were in a different column</description>
      </operator>
      <connect from_op="Retrieve Products" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
      <connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
      <connect from_op="Loop Attributes" from_port="output 2" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
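For readers who want to see the same pattern outside RapidMiner, here is a minimal Python sketch of the approach described in this answer: loop over the product columns, train and save one model per product under a file name derived from the product, then loop again to load each model and apply it to new data. Everything in it is an assumption for illustration only: the column names, the `spend` feature, the toy threshold "model" standing in for a decision tree, and the use of `pickle` (which is not RapidMiner's .mod format).

```python
import pickle
import tempfile
from pathlib import Path

LABELS = ["A", "B", "C"]  # hypothetical product columns, one binary label each


def train_stump(rows, label):
    """Toy stand-in for a real learner: a threshold on one numeric feature."""
    pos = [r["spend"] for r in rows if r[label]]
    neg = [r["spend"] for r in rows if not r[label]]
    pos_mean = sum(pos) / len(pos)  # assumes both classes occur in the data
    neg_mean = sum(neg) / len(neg)
    return {
        "label": label,
        "threshold": (pos_mean + neg_mean) / 2,
        "positive_above": pos_mean >= neg_mean,
    }


def apply_stump(model, row):
    """Predict True/False for one row with a saved threshold model."""
    above = row["spend"] >= model["threshold"]
    return above if model["positive_above"] else not above


def train_and_save(rows, model_dir):
    """Training loop: one model per product, file named after the product."""
    for label in LABELS:
        with open(model_dir / f"{label}.mod", "wb") as fh:
            pickle.dump(train_stump(rows, label), fh)


def load_and_apply(rows, model_dir):
    """Scoring loop: load each product's model and add one prediction column."""
    models = {}
    for label in LABELS:
        with open(model_dir / f"{label}.mod", "rb") as fh:
            models[label] = pickle.load(fh)
    scored = []
    for row in rows:
        out = dict(row)
        for label, model in models.items():
            out[f"prediction({label})"] = apply_stump(model, row)
        scored.append(out)
    return scored


train_rows = [
    {"spend": 10, "A": False, "B": True, "C": False},
    {"spend": 20, "A": False, "B": True, "C": False},
    {"spend": 80, "A": True, "B": False, "C": True},
    {"spend": 90, "A": True, "B": False, "C": False},
]

with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    train_and_save(train_rows, model_dir)
    scored = load_and_apply([{"spend": 70}, {"spend": 30}], model_dir)
```

Each scored row ends up with one prediction column per product, so new data is scored for all products in a single pass even though the models themselves remain separate.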