"Write filters to disk"
The ModelGrouper operator is a convenient solution if preprocessing and prediction models must be
written to disk together. A data mining process often also contains filters such as the
"FeatureNameFilter" operator, which are, however, not written to disk when the ModelWriter is used.
In the following process, is there a way to also dump the "FeatureNameFilter" into a file so that the complete
process can later be read in and applied to unseen data?
<?xml version="1.0" encoding="US-ASCII"?>
<process version="4.4">
<operator name="Root" class="Process" expanded="yes">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="encoding" value="SYSTEM"/>
<operator name="ExampleSetGenerator" class="ExampleSetGenerator">
<parameter key="target_function" value="polynomial classification"/>
<parameter key="number_examples" value="100"/>
<parameter key="number_of_attributes" value="5"/>
<parameter key="attributes_lower_bound" value="-10.0"/>
<parameter key="attributes_upper_bound" value="10.0"/>
<parameter key="local_random_seed" value="-1"/>
<parameter key="datamanagement" value="double_array"/>
</operator>
<operator name="NoiseGenerator" class="NoiseGenerator">
<parameter key="random_attributes" value="3"/>
<parameter key="label_noise" value="0.05"/>
<parameter key="default_attribute_noise" value="0.0"/>
<list key="noise">
</list>
<parameter key="offset" value="0.0"/>
<parameter key="linear_factor" value="1.0"/>
<parameter key="local_random_seed" value="-1"/>
</operator>
<operator name="Normalization" class="Normalization">
<parameter key="return_preprocessing_model" value="true"/>
<parameter key="create_view" value="false"/>
<parameter key="method" value="Z-Transformation"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="1.0"/>
</operator>
<operator name="FeatureNameFilter" class="FeatureNameFilter">
<parameter key="filter_special_features" value="false"/>
<parameter key="skip_features_with_name" value="result"/>
</operator>
<operator name="NearestNeighbors" class="NearestNeighbors">
<parameter key="keep_example_set" value="false"/>
<parameter key="k" value="3"/>
<parameter key="weighted_vote" value="false"/>
<parameter key="measure_types" value="MixedMeasures"/>
<parameter key="mixed_measure" value="MixedEuclideanDistance"/>
<parameter key="nominal_measure" value="NominalDistance"/>
<parameter key="numerical_measure" value="EuclideanDistance"/>
<parameter key="divergence" value="GeneralizedIDivergence"/>
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="1.0"/>
<parameter key="kernel_sigma1" value="1.0"/>
<parameter key="kernel_sigma2" value="0.0"/>
<parameter key="kernel_sigma3" value="2.0"/>
<parameter key="kernel_degree" value="3.0"/>
<parameter key="kernel_shift" value="1.0"/>
<parameter key="kernel_a" value="1.0"/>
<parameter key="kernel_b" value="0.0"/>
</operator>
<operator name="ModelGrouper" class="ModelGrouper">
</operator>
<operator name="ModelWriter" class="ModelWriter">
<parameter key="model_file" value="combined_model_bin.mod"/>
<parameter key="overwrite_existing_file" value="true"/>
<parameter key="output_type" value="XML"/>
</operator>
</operator>
</process>
Answers
Unfortunately, this is not possible. You still have to design a process for application. But you could use a trick to simplify this:
If you store all the preprocessing steps in a single process, you can load and apply it in both the training process and the application process using the ProcessEmbedder operator. This process then behaves like a model itself.
Greetings,
Sebastian
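As a rough sketch of the trick described above, the filter step could be moved into its own process file and embedded wherever it is needed. The file names preprocessing.xml and unseen_data.aml are hypothetical, and the operator classes and parameter keys used here (ProcessEmbedder with process_file and use_input, ExampleSource with attributes, ModelLoader, ModelApplier) are assumed from the RapidMiner 4.x operator reference rather than confirmed in this thread, so they should be checked against the installed version.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<!-- preprocessing.xml (hypothetical file name): filter steps shared by training and application -->
<process version="4.4">
<operator name="Root" class="Process" expanded="yes">
<operator name="FeatureNameFilter" class="FeatureNameFilter">
<parameter key="filter_special_features" value="false"/>
<parameter key="skip_features_with_name" value="result"/>
</operator>
</operator>
</process>
```

An application process might then embed that file before loading and applying the grouped model written by the ModelWriter:

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<process version="4.4">
<operator name="Root" class="Process" expanded="yes">
<!-- Assumption: unseen data described by a hypothetical attribute description file -->
<operator name="ExampleSource" class="ExampleSource">
<parameter key="attributes" value="unseen_data.aml"/>
</operator>
<!-- Assumption: use_input passes the loaded example set into the embedded process -->
<operator name="Preprocessing" class="ProcessEmbedder">
<parameter key="process_file" value="preprocessing.xml"/>
<parameter key="use_input" value="true"/>
</operator>
<!-- Grouped model (normalization + nearest neighbors) from the training process -->
<operator name="ModelLoader" class="ModelLoader">
<parameter key="model_file" value="combined_model_bin.mod"/>
</operator>
<operator name="ModelApplier" class="ModelApplier"/>
</operator>
</process>
```

In the training process, the FeatureNameFilter operator would be replaced by the same ProcessEmbedder block, so that both processes read the filter settings from one file. Note that in this sketch the filter runs before the grouped model's normalization is applied, whereas the original training process normalizes first; for a filter that only drops attributes by name this is probably harmless, but it is worth verifying on your data.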