Problem with feature evaluation using YAGGA2
andyknownasabu
Member Posts: 3 Contributor I
Dear all,
I've managed to set up the first working RM chain for feature evaluation.
At least I think so, because I see the following error message on the console:
G Feb 4, 2009 9:15:11 AM: [Warning] Cannot generate test attribute: No such attribute: corr. We just keep both attributes for sure...
Last message repeated 2 times.
Can anybody explain to me why this error occurs, what it means, and how to fix it (if possible)?
The chain looks as follows:
<?xml version="1.0" encoding="US-ASCII"?>
<process version="4.3">
<operator name="Root" class="Process" expanded="yes">
<operator name="Data Source" class="ArffExampleSource">
<parameter key="data_file" value="all_subjects.arff"/>
<parameter key="id_attribute" value="id"/>
<parameter key="label_attribute" value="label"/>
</operator>
<operator name="YAGGA2" class="YAGGA2" expanded="yes">
<parameter key="use_diff" value="true"/>
<parameter key="use_max" value="true"/>
<parameter key="use_min" value="true"/>
<parameter key="use_sin" value="false"/>
<parameter key="use_square_roots" value="true"/>
<operator name="SimpleValidation" class="SimpleValidation" expanded="yes">
<parameter key="create_complete_model" value="true"/>
<operator name="DecisionTree" class="DecisionTree">
<parameter key="criterion" value="gini_index"/>
<parameter key="maximal_depth" value="5"/>
</operator>
<operator name="Applier Chain" class="OperatorChain" expanded="yes">
<operator name="Test" class="ModelApplier">
<list key="application_parameters">
</list>
<parameter key="keep_model" value="true"/>
</operator>
<operator name="ClassificationPerformance" class="ClassificationPerformance">
<parameter key="keep_example_set" value="true"/>
<parameter key="root_mean_squared_error" value="true"/>
<parameter key="root_relative_squared_error" value="true"/>
<parameter key="weighted_mean_precision" value="true"/>
<parameter key="weighted_mean_recall" value="true"/>
</operator>
</operator>
</operator>
<operator name="ProcessLog" class="ProcessLog">
<parameter key="filename" value="process_log.txt"/>
<list key="log">
<parameter key="Generation" value="operator.YAGGA2.value.generation"/>
<parameter key="Recall" value="operator.ClassificationPerformance.value.weighted_mean_recall"/>
<parameter key="Precision" value="operator.ClassificationPerformance.value.weighted_mean_precision"/>
</list>
</operator>
</operator>
<operator name="AttributeWeightsWriter" class="AttributeWeightsWriter">
<parameter key="attribute_weights_file" value="attribute.wgt"/>
</operator>
<operator name="PerformanceWriter" class="PerformanceWriter">
<parameter key="performance_file" value="performance.per"/>
</operator>
<operator name="AttributeConstructionsWriter" class="AttributeConstructionsWriter">
<parameter key="attribute_constructions_file" value="attribute.cst"/>
</operator>
</operator>
</process>
In general, does the above scheme make sense at all? I'd highly appreciate hearing about your experience and about how to improve the above process.
Thank you very much and best regards!
Answers
G Aug 17, 2009 9:00:00 PM: [Warning] exp: Infinite value generated, replaced by NaN.
G Aug 17, 2009 9:00:00 PM: [Warning] exp: NaN generated.
G Aug 17, 2009 9:00:00 PM: [Warning] 1/: NaN generated.
Last message repeated 5 times.
G Aug 17, 2009 9:00:00 PM: [Warning] /: Infinite value generated.
Last message repeated 105 times.
....
G Aug 17, 2009 9:02:13 PM: [Warning] Cannot generate test attribute: No such attribute: BB202CBas / LRAll1CUpr2. We just keep both attributes for sure...
Last message repeated 20880 times.
I think it is a bug with YAGGA2 since I don't have the problem when I replace YAGGA2 with YAGGA. It seems that, when there is a problem with a generated attribute (NaN or infinity error) then the code doesn't deal with the situation gracefully.
This is my XML:
<operator name="Root" class="Process" expanded="yes">
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="filename" value="Minutes.csv"/>
<parameter key="label_name" value="RRRatio"/>
<parameter key="id_name" value="id"/>
<parameter key="sample_ratio" value="0.05"/>
</operator>
<operator name="AttributeFilter" class="AttributeFilter">
<parameter key="condition_class" value="attribute_name_filter"/>
<parameter key="parameter_string" value="Symbol|Bar|BarDate|BarTime|HighestHighAfter|HighestLowAfter|LowestHighAfter|LowestLowAfter|VWAvgHighAfter|VWAvgLowAfter|MaxLongLoss|MaxShortLoss|LongAvgProfit|ShortAvgProfit"/>
<parameter key="invert_filter" value="true"/>
</operator>
<operator name="YAGGA2" class="YAGGA2" expanded="yes">
<parameter key="population_size" value="100"/>
<parameter key="maximum_number_of_generations" value="1000"/>
<parameter key="generations_without_improval" value="10"/>
<parameter key="p_initialize" value="0.1"/>
<parameter key="use_plus" value="false"/>
<parameter key="use_diff" value="true"/>
<parameter key="use_div" value="true"/>
<parameter key="use_square_roots" value="true"/>
<parameter key="use_sin" value="false"/>
<parameter key="use_log" value="true"/>
<parameter key="use_absolute_values" value="false"/>
<parameter key="constant_generation_prob" value="0.0"/>
<operator name="SimpleValidation" class="SimpleValidation" expanded="yes">
<parameter key="local_random_seed" value="10"/>
<operator name="W-REPTree" class="W-REPTree">
<parameter key="M" value="1000.0"/>
</operator>
<operator name="ApplierChain" class="OperatorChain" expanded="yes">
<operator name="Applier" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
<operator name="RegressionPerformance" class="RegressionPerformance">
<parameter key="keep_example_set" value="true"/>
<parameter key="spearman_rho" value="true"/>
<parameter key="use_example_weights" value="false"/>
</operator>
</operator>
</operator>
<operator name="ProcessLog" class="ProcessLog">
<list key="log">
<parameter key="generation" value="operator.YAGGA2.value.generation"/>
<parameter key="performance" value="operator.YAGGA2.value.performance"/>
<parameter key="best" value="operator.YAGGA2.value.best"/>
</list>
</operator>
</operator>
<operator name="AttributeConstructionsWriter" class="AttributeConstructionsWriter" breakpoints="after">
<parameter key="attribute_constructions_file" value="MinuteYagga100.att"/>
</operator>
<operator name="AttributeWeightsWriter" class="AttributeWeightsWriter" breakpoints="after">
<parameter key="attribute_weights_file" value="MinuteYagga100.wgt"/>
</operator>
</operator>
There is no way to "deal" gracefully with dividing by zero or exceeding the maximal range of a double value. At least the code is graceful enough to say what the problem is: in your case exp, 1/ and / cause these errors, because you have very large numbers and zeros in your dataset. So if you turn off these generating functions in the parameters, the problem will vanish.
The problem does not occur in YAGGA, because it simply does not allow such attributes to be constructed...
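As a rough sketch of what turning off these generating functions could look like in the process XML: the lines below would go inside the existing YAGGA2 operator. Only use_div is taken from the process posted above; use_exp and reciprocal_value are assumed names for the exp and 1/ generators, so please check them against the parameter list of the operator in your version.

```xml
<!-- Sketch only: add or change these parameters inside the existing YAGGA2 operator. -->
<!-- use_div is set in the posted process; it controls the "/" generator. -->
<parameter key="use_div" value="false"/>
<!-- Assumed parameter names for the "exp" and "1/" generators; verify before use. -->
<parameter key="use_exp" value="false"/>
<parameter key="reciprocal_value" value="false"/>
```

With these generators disabled, attributes such as BB202CBas / LRAll1CUpr2 should no longer be constructed, at the cost of a smaller search space.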
Greetings,
Sebastian
However, the operator seems to repeatedly create the same "dangerous" features, and therefore the entire process gets bogged down in outputting tens of thousands of warning messages. Perhaps it could remember which feature combinations were dangerous?
Also, when I use YAGGA, it has never generated a "new" attribute, even though I have all the boxes checked (addition, division, reciprocal, etc.). I've had it create roughly 10,000 new feature combinations, but I've never seen a "gensym" attribute in the output (not even when I stop the process and examine the current attributes being evaluated). I haven't seen any other postings on the forums about this problem, and I just can't figure out why I'd be having it. All I do is switch back and forth between YAGGA and YAGGA2 using the "replace operator" GUI command, so the XML doesn't really change. Any ideas? I'd post the XML, but it is just the same as what I posted before, with "YAGGA2" replaced by "YAGGA".
LG,
John
It will probably speed up your process if you select a higher logverbosity in the root operator, so that these hundreds and thousands of warnings aren't displayed any more.
Did you set breakpoints before the xvalidation inside YAGGA to check whether any new attributes got generated?
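For the logverbosity suggestion, a minimal sketch of the change in the XML might look like this. The parameter key logverbosity and the value error are assumptions based on the root operator's settings in the GUI; pick whichever level above "warning" your version offers.

```xml
<operator name="Root" class="Process" expanded="yes">
    <!-- Assumed key and value: only log messages of level "error" or higher, -->
    <!-- so the repeated [Warning] lines are no longer written to the console. -->
    <parameter key="logverbosity" value="error"/>
    <!-- ... the existing inner operators stay unchanged ... -->
</operator>
```

Note that this only hides the warnings; the problematic attributes are still generated and evaluated.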
Greetings,
Sebastian