The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
I have a data set and I am trying KNN for that but I am not getting it, can u please help me?
Best Answers
-
varunm1 Member Posts: 1,207 UnicornHello @HIMANI
It is not easy to create a process without any information like what kind of problem you are trying to solve (Classification or Regression)? What is your target label (Output)? What kind of validation do you need (Split or Cross Validation)? There are many things to do while developing a model. I created a K-NN model that does classification with an output label "Sex". I see your data has missing values, I used impute missing values operator to replace missing values with data. Finally, I selected a 5 fold cross-validation method to train and test data. Below is the XML code for the process (Click on SHOW). To use this XML code, you need to open a new process and then open XML window (View --> Show Panel --> XML). Copy the code from here and paste in XML window and click on the Green tick mark on XML window you can see the process. To run the process, you just need to import your dataset into rapidminer and then replace the dataset in the process with the imported data. Also, you can change your output attribute in set role operator (Current label = Sex). Sample process image is provided below.<?xml version="1.0" encoding="UTF-8"?><process version="9.3.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.3.001" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="9.3.001" expanded="true" height="68" name="Retrieve Nebula" width="90" x="45" y="85">
<parameter key="repository_entry" value="//Local Repository/data/Nebula"/>
</operator>
<operator activated="true" class="impute_missing_values" compatibility="9.3.001" expanded="true" height="68" name="Impute Missing Values" width="90" x="246" y="85">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="iterate" value="true"/>
<parameter key="learn_on_complete_cases" value="true"/>
<parameter key="order" value="chronological"/>
<parameter key="sort" value="ascending"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="9.3.001" expanded="true" height="82" name="k-NN" width="90" x="179" y="85">
<parameter key="k" value="5"/>
<parameter key="weighted_vote" value="true"/>
<parameter key="measure_types" value="MixedMeasures"/>
<parameter key="mixed_measure" value="MixedEuclideanDistance"/>
<parameter key="nominal_measure" value="NominalDistance"/>
<parameter key="numerical_measure" value="EuclideanDistance"/>
<parameter key="divergence" value="GeneralizedIDivergence"/>
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="1.0"/>
<parameter key="kernel_sigma1" value="1.0"/>
<parameter key="kernel_sigma2" value="0.0"/>
<parameter key="kernel_sigma3" value="2.0"/>
<parameter key="kernel_degree" value="3.0"/>
<parameter key="kernel_shift" value="1.0"/>
<parameter key="kernel_a" value="1.0"/>
<parameter key="kernel_b" value="0.0"/>
</operator>
<connect from_port="example set source" to_op="k-NN" to_port="training set"/>
<connect from_op="k-NN" from_port="model" to_port="model sink"/>
<portSpacing port="source_example set source" spacing="0"/>
<portSpacing port="sink_model sink" spacing="0"/>
</process>
</operator>
<operator activated="true" class="set_role" compatibility="9.3.001" expanded="true" height="82" name="Set Role" width="90" x="380" y="85">
<parameter key="attribute_name" value="sex"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="concurrency:cross_validation" compatibility="9.3.001" expanded="true" height="145" name="Cross Validation" width="90" x="514" y="85">
<parameter key="split_on_batch_attribute" value="false"/>
<parameter key="leave_one_out" value="false"/>
<parameter key="number_of_folds" value="5"/>
<parameter key="sampling_type" value="automatic"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<parameter key="enable_parallel_execution" value="true"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="9.3.001" expanded="true" height="82" name="k-NN (2)" width="90" x="112" y="85">
<parameter key="k" value="5"/>
<parameter key="weighted_vote" value="true"/>
<parameter key="measure_types" value="MixedMeasures"/>
<parameter key="mixed_measure" value="MixedEuclideanDistance"/>
<parameter key="nominal_measure" value="NominalDistance"/>
<parameter key="numerical_measure" value="EuclideanDistance"/>
<parameter key="divergence" value="GeneralizedIDivergence"/>
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="1.0"/>
<parameter key="kernel_sigma1" value="1.0"/>
<parameter key="kernel_sigma2" value="0.0"/>
<parameter key="kernel_sigma3" value="2.0"/>
<parameter key="kernel_degree" value="3.0"/>
<parameter key="kernel_shift" value="1.0"/>
<parameter key="kernel_a" value="1.0"/>
<parameter key="kernel_b" value="0.0"/>
</operator>
<connect from_port="training set" to_op="k-NN (2)" to_port="training set"/>
<connect from_op="k-NN (2)" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="9.3.001" expanded="true" height="82" name="Apply Model" width="90" x="112" y="34">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="9.3.001" expanded="true" height="82" name="Performance" width="90" x="246" y="34">
<parameter key="main_criterion" value="first"/>
<parameter key="accuracy" value="true"/>
<parameter key="classification_error" value="false"/>
<parameter key="kappa" value="true"/>
<parameter key="weighted_mean_recall" value="false"/>
<parameter key="weighted_mean_precision" value="false"/>
<parameter key="spearman_rho" value="false"/>
<parameter key="kendall_tau" value="false"/>
<parameter key="absolute_error" value="false"/>
<parameter key="relative_error" value="false"/>
<parameter key="relative_error_lenient" value="false"/>
<parameter key="relative_error_strict" value="false"/>
<parameter key="normalized_absolute_error" value="false"/>
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="root_relative_squared_error" value="false"/>
<parameter key="squared_error" value="false"/>
<parameter key="correlation" value="false"/>
<parameter key="squared_correlation" value="false"/>
<parameter key="cross-entropy" value="false"/>
<parameter key="margin" value="false"/>
<parameter key="soft_margin_loss" value="false"/>
<parameter key="logistic_loss" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="true"/>
<list key="class_weights"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve Nebula" from_port="output" to_op="Impute Missing Values" to_port="example set in"/>
<connect from_op="Impute Missing Values" from_port="example set out" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="performance 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>Hope this helps.Regards,
Varun
https://www.varunmandalapu.com/Be Safe. Follow precautions and Maintain Social Distancing
6 -
varunm1 Member Posts: 1,207 UnicornHello @HIMANI
I know that you are in a rush to complete your work, we are not here on community all the time and also we stay in different time zones. We are also working or researching full time, please be patient, there are many good people here who will help you, but it takes time.
Thanks.Regards,
Varun
https://www.varunmandalapu.com/Be Safe. Follow precautions and Maintain Social Distancing
5
Answers
We will help you solve issues but we do not do any assignments. This is our policy. There are tutorials in rapidminer that will help you understand the process. Go to rapidminer academy and check for relevant videos. If you have any issues in your process then we can help solve an issue.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing