Outlier Detection

annica · December 2009

Hello,
I am trying to use an outlier detection operator, in this case the distance-based outlier detection.
I thought that if I apply this function, the outliers would be ignored in further calculations.
But whether I apply this function or not, I cannot see any difference.
This is the relevant part of my XML:
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="Red Wine Example Data" class="ExampleSource">
        <parameter key="attributes" value="/home/annica/rm_workspace/projekt/wine.aml"/>
    </operator>
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Training" class="OperatorChain" expanded="yes">
            <operator name="W-SMO" class="W-SMO">
            </operator>
            <operator name="ModelWriter" class="ModelWriter">
                <parameter key="model_file" value="/home/annica/rm_workspace/wineModel.mod"/>
            </operator>
        </operator>
        <operator name="Testing" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>

Am I doing something wrong?
Thanks for your help,
Annica

haddock · December 2009

Hi there,

"I thought that if I apply this function, the outliers would be ignored in further calculations."

If you want to save time, it is really important in RM to read what little documentation is provided, and not to make assumptions. This operator does not filter out the outliers; it just adds an attribute indicating whether each example is an outlier, much as the info for the operator says:

"The Operator takes an example set and passes it on with a boolean top-n D^k outlier status in a new boolean-valued special outlier attribute indicating true (outlier) and false (no outlier)."

To see this for yourself, just put a breakpoint after your outlier detection operator and you'll see the new column in the example set. I realise that the jargon may all seem a bit confusing, but it does get easier.
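
As a rough sketch of that suggestion, the detection operator from the question could carry a breakpoint like this. It assumes the `breakpoints="after"` attribute, which appears on another operator in this thread, is accepted here in the same way; the parameter values are simply the ones from the original process.

```xml
<!-- Sketch only: same operator as in the question, with a breakpoint added -->
<!-- so the process pauses and shows the example set including the new outlier column. -->
<operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection" breakpoints="after">
    <parameter key="number_of_neighbors" value="2"/>
    <parameter key="number_of_outliers" value="14"/>
</operator>
```

When the process stops there, the example set should still contain every example, with the extra attribute marking 14 of them as outliers.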

earmijo · December 2009

Just add an "ExampleFilter" operator after the outlier detection operator:

<operator name="ExampleFilter" class="ExampleFilter" breakpoints="after">
    <parameter key="condition_class" value="attribute_value_filter"/>
    <parameter key="parameter_string" value="Outlier=false"/>
</operator>
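
Put together with the process from the question, the result might look roughly like the sketch below. It is assembled only from the snippets in this thread: the placement of the filter between the detection operator and XValidation is an assumption about where it belongs, the attribute name `Outlier` is taken from the filter above and should be checked against the column shown at the breakpoint, and the breakpoint on the filter is left out so the process runs straight through.

```xml
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="Red Wine Example Data" class="ExampleSource">
        <parameter key="attributes" value="/home/annica/rm_workspace/projekt/wine.aml"/>
    </operator>
    <!-- Step 1: flag outliers; this only adds an attribute, nothing is removed yet -->
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <!-- Step 2: keep only the examples that were not flagged (assumed attribute name: Outlier) -->
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Outlier=false"/>
    </operator>
    <!-- Step 3: cross-validation now runs on the filtered example set -->
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Training" class="OperatorChain" expanded="yes">
            <operator name="W-SMO" class="W-SMO">
            </operator>
            <operator name="ModelWriter" class="ModelWriter">
                <parameter key="model_file" value="/home/annica/rm_workspace/wineModel.mod"/>
            </operator>
        </operator>
        <operator name="Testing" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>
```

With the filter in place, the performance reported by XValidation should differ from a run without it, since the flagged examples no longer reach the learner.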
