Change the domain of an attribute
Let's say I have a column 'Fruit' that can take the values 'a', 'b' or 'c':
ID, Fruit
1, a
2, b
3, c
4, c
and then I use a filter to remove all rows where 'Fruit' equals 'c', so that only the 'a' and 'b' rows remain:
ID, Fruit
1, a
2, b
If I now go to Results -> Statistics, and look at the Values for the Fruit attribute, it will tell me
a(1), b(1), c(0).
This means the domain (i.e. the set of possible values the Fruit attribute can take; I don't know what this is called in RapidMiner nomenclature) is still [a, b, c]. How do I change the domain to [a, b]? (I don't need c anymore!)
My current workaround is to write the filtered data to a CSV file and then read it back from that new file. But I suppose there is a more elegant way to do this with an operator...
Best Answer
IngoRM (RapidMiner employee, RM Founder)
Hi,
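Yes, there is indeed an operator for that :-) It is called "Remove Unused Values". The process below shows a small example: it filters the Titanic sample data down to the First and Second passenger classes and then applies the operator to the Passenger Class attribute.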
Hope this helps,
Ingo
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.0.002">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.0.002" expanded="true" height="68" name="Retrieve Titanic" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Titanic"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
        <list key="filters_list">
          <parameter key="filters_entry_key" value="Passenger Class.is_in.First;Second"/>
        </list>
      </operator>
      <operator activated="true" class="remove_unused_values" compatibility="9.0.002" expanded="true" height="103" name="Remove Unused Values" width="90" x="313" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Passenger Class"/>
      </operator>
      <connect from_op="Retrieve Titanic" from_port="output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Remove Unused Values" to_port="example set input"/>
      <connect from_op="Remove Unused Values" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
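
For the Fruit example from the question, the same operator could be pointed at that column instead. The fragment below is only a sketch: it reuses the operator element from the process above unchanged (including its layout attributes) and swaps in 'Fruit' as the attribute name, assuming the column is named exactly that in your example set. It is not a complete process; it would sit inside the process right after the filter step, connected the same way as in the example above.

```xml
<!-- Sketch: same operator as in the example above, with the attribute name
     changed to 'Fruit' (assumed to match the column name in the question). -->
<operator activated="true" class="remove_unused_values" compatibility="9.0.002" expanded="true" height="103" name="Remove Unused Values" width="90" x="313" y="34">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="Fruit"/>
</operator>
```

After running it, Results -> Statistics should list only the values that still occur for Fruit, i.e. a and b, without the c(0) entry.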