OutlierDistanceBasedDetection
Hi,
I have an ExampleSet with several attributes, and I would like to apply outlier detection to each numerical attribute separately. Is this possible, and what is the best way to do it? I can apply outlier detection to the whole table, but I would like to run it on every attribute individually. For example:
height weight ..... // more attributes
188 80
185 150
186 83
189 89
190 87
192 86
145 88
I would like to get 145 (height) and 150 (weight) separately ... [A separate process for each attribute applying DBOutlierOperator would probably be a solution, but not an efficient one...]
DBOutlierOperator(OperatorDescription description) cannot be applied to a single attribute of an ExampleSet. AttributeSelectionExampleSet, which filters the attributes I want in the ExampleSet, would probably be useful, but how do I apply the outlier function to each attribute?
Thanks
Answers
You could use a combination of the FeatureIterator and the AttributeSubsetPreprocessing operator, which delivers only a subset of the ExampleSet's attributes to its child operators. Since this is a complex process, I will post a sample below:
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
    <parameter key="target_function" value="sum classification"/>
  </operator>
  <operator name="IOStorer" class="IOStorer">
    <parameter key="name" value="Store"/>
    <parameter key="io_object" value="ExampleSet"/>
    <parameter key="remove_from_process" value="false"/>
  </operator>
  <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
    <parameter key="work_on_input" value="false"/>
    <operator name="IORetriever" class="IORetriever">
      <parameter key="name" value="Store"/>
      <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
      <parameter key="condition_class" value="attribute_name_filter"/>
      <parameter key="parameter_string" value="att5"/>
      <parameter key="attribute_name_regex" value="%{loop_feature}"/>
      <operator name="DetectionOnSingleAttribute" class="DensityBasedOutlierDetection">
        <parameter key="distance" value="1.0"/>
        <parameter key="proportion" value="0.5"/>
      </operator>
      <operator name="DoingSomething" class="OperatorChain" expanded="yes">
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
          <parameter key="name" value="Outlier"/>
        </operator>
        <operator name="ChangeAttributeName" class="ChangeAttributeName">
          <parameter key="old_name" value="Outlier"/>
          <parameter key="new_name" value="Outlier_%{loop_feature}"/>
        </operator>
      </operator>
    </operator>
    <operator name="IOStorer (2)" class="IOStorer">
      <parameter key="name" value="Store"/>
      <parameter key="io_object" value="ExampleSet"/>
      <parameter key="remove_from_process" value="false"/>
    </operator>
  </operator>
  <operator name="IOConsumer" class="IOConsumer">
    <parameter key="io_object" value="ExampleSet"/>
  </operator>
  <operator name="IORetriever (2)" class="IORetriever">
    <parameter key="name" value="Store"/>
    <parameter key="io_object" value="ExampleSet"/>
  </operator>
</operator>
The problem is the behavior of the FeatureIterator, which does not deliver the changed ExampleSet after it finishes. That is why we have to use the IOStorer and IORetriever operators to save the generated ExampleSet ourselves. We actually only need the macro defined by the FeatureIterator, which gives us the name of every regular attribute, so that we can use it in the AttributeSubsetPreprocessing condition.
This sample only renames the attributes, but you might well do something more intelligent, such as unifying the results for each attribute with an AttributeConstruction, or something else.
Greetings,
Sebastian
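Outside RapidMiner, the per-attribute idea above can be sketched in plain Python. This is only an illustration, not the operator's implementation: it assumes a distance-based rule in which a value is flagged when at least a given proportion of the other values lie farther away than a given distance. The function name db_outlier_flags, the thresholds 10.0 and 0.9, and the Outlier_<attribute> column names are choices made for this example. The data is the height/weight table from the question.

```python
def db_outlier_flags(values, distance, proportion):
    """Flag each value for which at least `proportion` of the OTHER values
    lie farther away than `distance` (assumed rule, for illustration only)."""
    n = len(values)
    flags = []
    for i, v in enumerate(values):
        # Count the other values that are farther away than `distance`.
        far = sum(1 for j, w in enumerate(values)
                  if j != i and abs(v - w) > distance)
        flags.append(n > 1 and far / (n - 1) >= proportion)
    return flags


# Table from the question, one list per attribute.
table = {
    "height": [188, 185, 186, 189, 190, 192, 145],
    "weight": [80, 150, 83, 89, 87, 86, 88],
}

# Run the detection on every attribute separately and keep one flag
# column per attribute, mirroring the Outlier_%{loop_feature} renaming.
outlier_columns = {
    f"Outlier_{name}": db_outlier_flags(column, distance=10.0, proportion=0.9)
    for name, column in table.items()
}

for name, flags in outlier_columns.items():
    print(name, flags)
```

With these assumed thresholds, only the row holding 145 is flagged for height and only the row holding 150 for weight. Different distance or proportion settings would change which rows are flagged.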
I have now detected the outliers, but I need to get the individual results.
I have a table, but I have to select the attribute values where the outlier flag is true.
An SQL statement would be "select from table where outlier_vel=true", but what I have are the results of the process (they are in an ExampleSet), and I cannot run an SQL-like query on them...
What is the best way to query the results in the ExampleSet (applying filters)?
Thanks a lot!
Did you try the ExampleFilter? It allows several conditions for filtering examples from the set.
Greetings,
Sebastian
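As a rough analogue of the SQL-style query above, filtering on a boolean outlier column can be sketched in Python. The rows, the column names Outlier_height and Outlier_weight, and the helper where are assumptions for illustration (the flags are set by hand the way the question expects), not output of the process.

```python
# Hand-built rows: the question's table plus one boolean flag per attribute.
rows = [
    {"height": 188, "weight": 80,  "Outlier_height": False, "Outlier_weight": False},
    {"height": 185, "weight": 150, "Outlier_height": False, "Outlier_weight": True},
    {"height": 186, "weight": 83,  "Outlier_height": False, "Outlier_weight": False},
    {"height": 189, "weight": 89,  "Outlier_height": False, "Outlier_weight": False},
    {"height": 190, "weight": 87,  "Outlier_height": False, "Outlier_weight": False},
    {"height": 192, "weight": 86,  "Outlier_height": False, "Outlier_weight": False},
    {"height": 145, "weight": 88,  "Outlier_height": True,  "Outlier_weight": False},
]


def where(rows, column, wanted=True):
    """Keep the rows whose boolean `column` equals `wanted`
    (hypothetical helper, similar in spirit to a WHERE clause)."""
    return [row for row in rows if row[column] == wanted]


# "select height from table where Outlier_height = true"
height_outliers = [row["height"] for row in where(rows, "Outlier_height")]
# "select weight from table where Outlier_weight = true"
weight_outliers = [row["weight"] for row in where(rows, "Outlier_weight")]

print(height_outliers, weight_outliers)
```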
I am tackling a similar problem. I tried the "Filter Examples" operator of version 5.0.
No matter how I set the parameter string (outlier=false / outlier=true), the result set is empty.
<operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="313" y="435">
<parameter key="condition_class" value="attribute_value_filter"/>
<parameter key="parameter_string" value="outlier=false"/>
</operator>
Maybe some syntax problem?
Greetings,
Chris
LOF produces a number rather than a boolean. Right-click on the operator, then F1 -> Description produces ... .
Thanks for the hint. It works with LOF (filtering for "outlier < 1").
So the question is: what is the correct syntax with boolean filter parameters?
Cheers,
Chris
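The difference between a numeric score and a boolean flag can be sketched in Python. The scores below are made-up placeholder values, and the threshold 1 simply mirrors the "outlier < 1" filter mentioned above. The second filter shows, in Python terms only, how a boolean test applied to a numeric column matches nothing; it is not a claim about how Filter Examples evaluates its parameter string.

```python
# Placeholder rows with a numeric outlier score (values invented for this sketch).
scored = [
    {"id": 1, "outlier": 0.4},
    {"id": 2, "outlier": 2.7},
    {"id": 3, "outlier": 0.9},
]

# Numeric column: compare against a threshold.
low_score_ids = [row["id"] for row in scored if row["outlier"] < 1]

# Boolean-style test on the same numeric column: no float is the object False,
# so this selects nothing.
boolean_test_ids = [row["id"] for row in scored if row["outlier"] is False]

print(low_score_ids, boolean_test_ids)
```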
There is also a deployed filter that filters for "outlier=false", and it works.
So I rebuilt my workflow one more time, and now it works. I cannot identify any differences...
Probably another case where the problem sits in front of the screen.
Thanks for your help anyway!