How to select two attributes from ExampleSet?
gaoxiaolei
Member Posts: 12 Contributor II
Hi, everyone.
I am a newbie here. I have a question about how to select two attributes from an ExampleSet.
If I have an ExampleSet that contains both regular attributes and a label attribute, I want to select the first and the third regular attributes from it, then convert the new ExampleSet to a double[][]. How should I do it? Can the class AttributeSelectionExampleSet do it?
Thanks.
gaoxiaolei
Answers
Select Attributes
Synopsis
This operator allows you to select which attributes should be part of the resulting ExampleSet.
Description
This operator selects which attributes of an ExampleSet should be kept and which are removed. Therefore, different filter types may be selected in the parameter attribute filter type and only attributes fulfilling this condition type are selected. The rest will be removed from the ExampleSet. There's a global switch to invert the outcome, so that all attributes which would have been originally discarded will be kept and vice versa. To invert the decision, use the invert selection parameter.
These types are available
all: Will simply select each attribute
single: This will allow you to select a single attribute name. It can be selected from the drop-down box of the attribute parameter if the meta data is known.
subset: Lets you choose a number of attributes from a list. This will not work if no meta data is present. Each known attribute is shown in the list and can be selected.
regular_expression: This lets you specify a regular expression. Each attribute whose name matches this expression will be selected. Regular expressions are a very powerful tool but need a detailed explanation for beginners. Please refer to one of the several tutorials available on the internet for a more detailed description.
value_type: Select only attributes of a certain type. Please note that the types are hierarchical: for example, binominal attributes are nominal, as are polynominal attributes.
block_type: Similar to value_type, this lets you select the attributes depending on their block type.
no_missing_values: Will select all attributes which don't contain a missing value in any example.
numeric_value_filter: This will select the attributes by testing whether the values of all examples match this condition, or whether the attribute is not numerical. The numeric condition is specified by typing a numerical condition. For example, the parameter string "> 6" will keep all nominal attributes and all numeric attributes having a value greater than 6 in every example. A combination of conditions is possible: "> 6 && < 11" or "<= 5 || < 0". But && and || must not be mixed.
Input
example set input: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0
Output
example set output:
original:
Parameters
attribute filter type: The condition specifies which attributes are selected or affected by this operator. Range: all, single, subset, regular_expression, value_type, block_type, no_missing_values, numeric_value_filter; default: all
attribute: The attribute which should be chosen. Range: string
attributes: The attributes which should be chosen. Range: string
regular expression: A regular expression for the names of the attributes which should be kept. Range: string
use except expression: If enabled, an exception to the specified regular expression can be specified. Attributes matching this exception will be filtered out, even if they match the first expression. Range: boolean; default: false
except regular expression: A regular expression for the names of the attributes which should be filtered out although matching the above regular expression. Range: string
value type: The value type of the attributes. Range: attribute_value, nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time; default: attribute_value
use value type exception: If enabled, an exception to the specified value type might be specified. Attributes of this type will be filtered out, although matching the first specified type. Range: boolean; default: false
except value type: Except this value type. Range: attribute_value, nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time; default: time
block type: The block type of the attributes. Range: attribute_block, single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start; default: attribute_block
use block type exception: If enabled, an exception to the specified block type might be specified. Range: boolean; default: false
except block type: Except this block type. Range: attribute_block, single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start; default: value_matrix_row_start
numeric condition: Parameter string for the condition, e.g. '>= 5' Range: string
invert selection: Indicates whether only those attributes should be accepted which would normally be filtered out. Range: boolean; default: false
include special attributes: Indicate if this operator should also be applied on the special attributes. Otherwise they are always kept. Range: boolean; default: false
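To make the interplay of the regular expression, except regular expression and invert selection parameters concrete, here is a minimal plain-Java sketch. It is not RapidMiner code: the class RegexAttributeFilter and its select method are hypothetical names, and it assumes that an attribute name has to match the expression as a whole.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Plain-Java illustration of the regular_expression filter; not RapidMiner code. */
final class RegexAttributeFilter {

    /**
     * Returns the names that fully match regex and, when exceptRegex is
     * non-null, do not fully match the exception. If invert is true, the
     * decision is flipped for every name (assumed full-match semantics).
     */
    static List<String> select(List<String> names, String regex,
                               String exceptRegex, boolean invert) {
        Pattern keep = Pattern.compile(regex);
        Pattern except = exceptRegex == null ? null : Pattern.compile(exceptRegex);
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            boolean matches = keep.matcher(name).matches()
                    && (except == null || !except.matcher(name).matches());
            if (matches != invert) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = List.of("att1", "att2", "att3", "label_copy");
        // Keep att<number>, but filter out att2 again.
        System.out.println(select(names, "att[0-9]+", "att2", false)); // [att1, att3]
        // Same filter with the selection inverted.
        System.out.println(select(names, "att[0-9]+", "att2", true));  // [att2, label_copy]
    }
}
```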
But I want to select two attributes in my own operator. If there are many attributes in an ExampleSet, in the first loop the first and the second attributes are selected, in the second loop the first and the third attributes are selected, and so on.
So could you show me some code? I do not know whether the "Select Attributes" operator can do this job.
Thanks again!
You mean the script operator?
Sure you can code it yourself, but why?
What exactly do you want to achieve?
You want to find out good attribute subsets?
Like sets containing two attributes?
The forward selection operator can do that.
"This operator starts with an empty selection of attributes and, in each round, it adds each unused attribute of the given set of examples. For each added attribute, the performance is estimated using inner operators, e.g. a cross-validation. Only the attribute giving the highest increase of performance is added to the selection. Then a new round is started with the modified selection. "
Or the Optimize Selection (Brute Force) operator.
Selects the best features for an example set by trying all possible combinations of attribute selections.
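The round structure from the forward selection description quoted above can be sketched in plain Java. This is only an illustration under assumptions: the class name ForwardSelectionSketch is made up, the performance function stands in for the inner operators (e.g. a cross-validation), and the loop simply stops as soon as no unused attribute increases the performance, which may differ from the stopping behaviour of the real operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Greedy forward selection over attribute indices; not RapidMiner code. */
final class ForwardSelectionSketch {

    /**
     * Starts with an empty selection. In each round every unused attribute is
     * tried, and only the one giving the highest performance is added.
     * "performance" is a hypothetical stand-in for the inner evaluation.
     */
    static List<Integer> forwardSelect(int attributeCount,
                                       ToDoubleFunction<List<Integer>> performance) {
        List<Integer> selected = new ArrayList<>();
        double best = Double.NEGATIVE_INFINITY;
        while (selected.size() < attributeCount) {
            int bestCandidate = -1;
            double bestCandidateScore = best;
            for (int a = 0; a < attributeCount; a++) {
                if (selected.contains(a)) {
                    continue; // already part of the selection
                }
                List<Integer> trial = new ArrayList<>(selected);
                trial.add(a);
                double score = performance.applyAsDouble(trial);
                if (score > bestCandidateScore) {
                    bestCandidateScore = score;
                    bestCandidate = a;
                }
            }
            if (bestCandidate < 0) {
                break; // no unused attribute increased the performance
            }
            selected.add(bestCandidate);
            best = bestCandidateScore;
        }
        return selected;
    }

    public static void main(String[] args) {
        // Toy performance: each attribute contributes a fixed amount.
        double[] gain = {1, 5, -2, 3};
        List<Integer> result = forwardSelect(gain.length, sel -> {
            double s = 0;
            for (int i : sel) {
                s += gain[i];
            }
            return s;
        });
        System.out.println(result); // [1, 3, 0]
    }
}
```

Optimize Selection (Brute Force) would instead evaluate every possible subset, which is exhaustive but grows exponentially with the number of attributes.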
Here is part of my code: Thank you, wessel!
And now that I do understand your question, I don't know the answer.
Maybe you can steal some code from here:
http://www.opensourcejavaphp.net/java/rapidminer/com/rapidminer/operator/learner/PredictionModel.java.html
http://www.opensourcejavaphp.net/java/rapidminer/com/rapidminer/operator/learner/functions/neuralnet/SimpleNeuralNetLearner.java.html
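import com.rapidminer.operator.Operator;
import com.rapidminer.Process;
import com.rapidminer.MacroHandler;
import com.rapidminer.tools.Ontology;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.example.table.AttributeFactory;

// Script operator code (Groovy); "operator" is provided by the script environment.
ExampleSet exampleSet = operator.getInput(ExampleSet.class);

// Create the new "sum" attribute and register it as a regular attribute.
Attribute sum = AttributeFactory.createAttribute("sum", Ontology.REAL);
exampleSet.getExampleTable().addAttribute(sum);
exampleSet.getAttributes().addRegular(sum);

// Same for the new "avg" attribute.
Attribute avg = AttributeFactory.createAttribute("avg", Ontology.REAL);
exampleSet.getExampleTable().addAttribute(avg);
exampleSet.getAttributes().addRegular(avg);

last = 0; // running sum so far
n = 0;    // number of examples seen so far
for (Example e : exampleSet) {
    // Values are accessed by attribute name; "abs" must already exist in the ExampleSet.
    e["sum"] = e["abs"] + last;
    last = e["sum"];
    n++;
    e["avg"] = last / n; // running average of "abs"
}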
Best regards,
Wessel
I hope I understood your problem correctly:
You can get all regular attributes from an ExampleSet with a single call. After selecting your desired attributes, you can use them to access the data of each example. This will return the double value for the given example and attribute (note that for nominal values this will return the internal mapping).
Now you have double values for all examples and your desired attributes and you can use them to fill your own double[][] array.
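As a rough illustration of that last step, the following plain-Java sketch copies chosen columns into a new double[][] and loops over all attribute pairs in the order asked for above (first and second, first and third, and so on). It does not use the RapidMiner API: the class AttributePairs and the method selectColumns are hypothetical, and the rows array stands in for the values you would read per example and attribute (assumed here to come from something like example.getValue(attribute)).

```java
import java.util.Arrays;

/** Column selection on a row-major double table; not RapidMiner code. */
final class AttributePairs {

    /** Copies the given columns of a row-major table into a new double[][]. */
    static double[][] selectColumns(double[][] rows, int... columns) {
        double[][] result = new double[rows.length][columns.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < columns.length; c++) {
                result[r][c] = rows[r][columns[c]];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two examples with three regular attributes each (made-up values).
        double[][] rows = { {1.0, 2.0, 3.0}, {4.0, 5.0, 6.0} };
        int attributeCount = rows[0].length;
        // Enumerate the pairs (0,1), (0,2), (1,2).
        for (int i = 0; i < attributeCount; i++) {
            for (int j = i + 1; j < attributeCount; j++) {
                double[][] pair = selectColumns(rows, i, j);
                System.out.println("(" + i + ", " + j + ") -> " + Arrays.deepToString(pair));
            }
        }
    }
}
```

Selecting the first and the third regular attribute then corresponds to selectColumns(rows, 0, 2).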
Regards,
Marco
You indeed give me some ideas!
Thank you!
gaoxiaolei