"Bug in ProcessLog with FeatureSelectionParallel?"

keith Member Posts: 157 Maven
edited May 2019 in Help
When I try to log the performance (RMS error) of a model inside a feature selection process, it only works if I am using the single-threaded FeatureSelection operator, not the parallel-processing-enabled FeatureSelectionParallel operator available in RM Enterprise.

In the two examples below, the ProcessLog in the first (FeatureSelection) produces valid performance data, but in the second (FeatureSelectionParallel) every entry in the performance column is '?'.

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="non linear"/>
        <parameter key="number_examples" value="1000"/>
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
          <parameter key="att_random1" value="rand()"/>
          <parameter key="att_random2" value="rand()"/>
          <parameter key="att_constant" value="0"/>
        </list>
    </operator>
    <operator name="FeatureSelection" class="FeatureSelection" expanded="yes">
        <parameter key="generations_without_improval" value="2"/>
        <operator name="W-LWL" class="W-LWL">
            <parameter key="keep_example_set" value="true"/>
            <parameter key="K" value="5.0"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="RegressionPerformance" class="RegressionPerformance">
                <parameter key="root_mean_squared_error" value="true"/>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                  <parameter key="generation" value="operator.FeatureSelection.value.generation"/>
                  <parameter key="feature_name" value="operator.FeatureSelection.value.feature_names"/>
                  <parameter key="performance" value="operator.RegressionPerformance.value.root_mean_squared_error"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="non linear"/>
        <parameter key="number_examples" value="1000"/>
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
          <parameter key="att_random1" value="rand()"/>
          <parameter key="att_random2" value="rand()"/>
          <parameter key="att_constant" value="0"/>
        </list>
    </operator>
    <operator name="FeatureSelectionParallel" class="FeatureSelectionParallel" expanded="yes">
        <operator name="W-LWL" class="W-LWL">
            <parameter key="keep_example_set" value="true"/>
            <parameter key="K" value="5.0"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="RegressionPerformance" class="RegressionPerformance">
                <parameter key="root_mean_squared_error" value="true"/>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                  <parameter key="generation" value="operator.FeatureSelectionParallel.value.generation"/>
                  <parameter key="feature_name" value="operator.FeatureSelectionParallel.value.feature_names"/>
                  <parameter key="performance" value="operator.RegressionPerformance.value.root_mean_squared_error"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>

Answers

  • haddock Member Posts: 849 Maven
    Hi Keith,

    It's not so much a bug as a buggeration. I ran across it in January while doing parameter iteration in parallel, and moaned. I did get a very clear explanation of why the limitation is there (although I do think it should be removed for those who cough up loot), and here is the reference:

    http://rapid-i.com/rapidforum/index.php/topic,563.msg2146.html#msg2146

    Not sure that is what you want to hear, but there it is.

  • keith Member Posts: 157 Maven
    Thanks, haddock. That is informative, if disappointing. I guess I'll have to choose between monitoring progress and maximum performance until RM 5.0 is ready.