K-NN
fedayncarica
Member Posts: 30 Contributor I
Good morning everybody. I'm an Italian student and I have a problem with my process. How can I solve it? I'm desperate. I have attached my dataset.
Have a good day,
best regards,
Damian
<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
<operator activated="true" class="concurrency:loop_files" compatibility="7.4.000" expanded="true" height="82" name="Loop Files" width="90" x="313" y="34">
<parameter key="directory" value="C:\Users\Damiano\Desktop\csv"/>
<parameter key="filter_type" value="regex"/>
<parameter key="filter_by_regex" value=".*csv*."/>
<parameter key="recursive" value="false"/>
<parameter key="enable_macros" value="false"/>
<parameter key="macro_for_file_name" value="file_name"/>
<parameter key="macro_for_file_type" value="file_type"/>
<parameter key="macro_for_folder_name" value="folder_name"/>
<parameter key="reuse_results" value="true"/>
<parameter key="enable_parallel_execution" value="true"/>
<process expanded="true">
<operator activated="true" class="read_csv" compatibility="7.4.000" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
<parameter key="csv_file" value="C:\Users\Damiano\Desktop\csv\5000IstanzeOM.csv"/>
<parameter key="column_separators" value=";"/>
<parameter key="trim_lines" value="false"/>
<parameter key="use_quotes" value="true"/>
<parameter key="quotes_character" value="&quot;"/>
<parameter key="escape_character" value="\"/>
<parameter key="skip_comments" value="false"/>
<parameter key="comment_characters" value="#"/>
<parameter key="parse_numbers" value="true"/>
<parameter key="decimal_character" value="."/>
<parameter key="grouped_digits" value="false"/>
<parameter key="grouping_character" value=","/>
<parameter key="date_format" value=""/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<parameter key="time_zone" value="SYSTEM"/>
<parameter key="locale" value="English (United States)"/>
<parameter key="encoding" value="windows-1252"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label.true.polynominal.attribute"/>
</list>
<parameter key="read_not_matching_values_as_missings" value="true"/>
<parameter key="datamanagement" value="double_array"/>
<parameter key="data_management" value="auto"/>
</operator>
<operator activated="true" class="remap_binominals" compatibility="7.4.000" expanded="true" height="82" name="Remap Binominals" width="90" x="179" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="binominal"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="binominal"/>
<parameter key="block_type" value="value_matrix_start"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="0"/>
<parameter key="positive_value" value="1"/>
</operator>
<operator activated="true" class="numerical_to_binominal" compatibility="7.4.000" expanded="true" height="82" name="Numerical to Binominal" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="0.0"/>
</operator>
<operator activated="true" class="loop_parameters" compatibility="7.4.000" expanded="true" height="103" name="Loop Parameters" width="90" x="514" y="136">
<list key="parameters">
<parameter key="k-NN.k" value="[1.0;100.0;10;linear]"/>
</list>
<parameter key="error_handling" value="fail on error"/>
<parameter key="synchronize" value="false"/>
<process expanded="true">
<operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" height="82" name="Set Role" width="90" x="112" y="34">
<parameter key="attribute_name" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="x_validation" compatibility="7.4.000" expanded="true" height="145" name="Validation" width="90" x="380" y="34">
<parameter key="create_complete_model" value="false"/>
<parameter key="average_performances_only" value="true"/>
<parameter key="leave_one_out" value="false"/>
<parameter key="number_of_validations" value="5"/>
<parameter key="sampling_type" value="stratified sampling"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="7.4.000" expanded="true" height="82" name="k-NN" width="90" x="45" y="34">
<parameter key="k" value="1"/>
<parameter key="weighted_vote" value="false"/>
<parameter key="measure_types" value="MixedMeasures"/>
<parameter key="mixed_measure" value="MixedEuclideanDistance"/>
<parameter key="nominal_measure" value="NominalDistance"/>
<parameter key="numerical_measure" value="EuclideanDistance"/>
<parameter key="divergence" value="GeneralizedIDivergence"/>
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="1.0"/>
<parameter key="kernel_sigma1" value="1.0"/>
<parameter key="kernel_sigma2" value="0.0"/>
<parameter key="kernel_sigma3" value="2.0"/>
<parameter key="kernel_degree" value="3.0"/>
<parameter key="kernel_shift" value="1.0"/>
<parameter key="kernel_a" value="1.0"/>
<parameter key="kernel_b" value="0.0"/>
</operator>
<connect from_port="training" to_op="k-NN" to_port="training set"/>
<connect from_op="k-NN" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
<parameter key="use_example_weights" value="true"/>
</operator>
<operator activated="true" class="performance_to_data" compatibility="7.4.000" expanded="true" height="82" name="Performance to Data" width="90" x="45" y="165"/>
<operator activated="true" class="write_csv" compatibility="7.4.000" expanded="true" height="82" name="Write CSV" width="90" x="179" y="238">
<parameter key="csv_file" value="C:\Users\Damiano\Desktop\performance_knn.csv"/>
<parameter key="column_separator" value=";"/>
<parameter key="write_attribute_names" value="true"/>
<parameter key="quote_nominal_values" value="true"/>
<parameter key="format_date_attributes" value="true"/>
<parameter key="append_to_file" value="true"/>
<parameter key="encoding" value="SYSTEM"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_op="Performance to Data" to_port="performance vector"/>
<connect from_op="Performance to Data" from_port="example set" to_op="Write CSV" to_port="input"/>
<connect from_op="Performance to Data" from_port="performance vector" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
<portSpacing port="sink_averagable 3" spacing="0"/>
</process>
</operator>
<connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 2" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
<connect from_port="file object" to_op="Read CSV" to_port="file"/>
<connect from_op="Read CSV" from_port="output" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
<connect from_op="Numerical to Binominal" from_port="example set output" to_op="Loop Parameters" to_port="input 1"/>
<connect from_op="Loop Parameters" from_port="result 1" to_port="output 1"/>
<portSpacing port="source_file object" spacing="0"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
</process>
Best Answer
Marco_Boeck Administrator, Moderator, Employee-RapidMiner, Member, University Professor Posts: 1,996 RM Engineering
Hi,
Two problems:
- Your CSV file uses "," as the separator, but the "Read CSV" operator is configured with ";" as the separator. Because the lines are never split on the commas, your data ends up with only one column. Change the "column separators" parameter of the "Read CSV" operator to "," and the file will be read correctly.
- You still need to tell RapidMiner Studio that the attribute called "label" is actually a label column. Use the "Set Role" operator for that; you can add it directly before the "Loop Parameters" operator. Type in "label" for the "attribute name" parameter and set the "target role" parameter to "label" as well.
Regards,
Marco
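The two fixes described above could look roughly like this in the process XML. This is only a sketch of the relevant fragments, not a complete importable process: it reuses the parameter keys that already appear in the posted process (`column_separators`, `attribute_name`, `target_role`) and leaves out every other parameter. The operator name "Set Role (label)" is a hypothetical name chosen here so that it does not clash with the "Set Role" operator that already exists inside "Loop Parameters".

```xml
<!-- Fragment 1 (sketch): Read CSV now splits columns on "," instead of ";". -->
<operator activated="true" class="read_csv" compatibility="7.4.000" expanded="true" name="Read CSV">
  <parameter key="column_separators" value=","/>
  <!-- all other Read CSV parameters stay as they were -->
</operator>

<!-- Fragment 2 (sketch): Set Role placed directly before Loop Parameters. -->
<!-- "Set Role (label)" is a hypothetical operator name used for illustration. -->
<operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" name="Set Role (label)">
  <parameter key="attribute_name" value="label"/>
  <parameter key="target_role" value="label"/>
  <list key="set_additional_roles"/>
</operator>
```

Note that the "Set Role" operator already inside "Loop Parameters" refers to the single long column name produced by the wrong separator. Once the file is split on ",", that attribute presumably no longer exists, so that operator would likely need the same change (or could be removed). Re-running the Read CSV import wizard after changing the separator should also refresh the stored column meta data.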
Answers
The XML code you pasted is not importing for me. Can you export the RMP? Just go to File > Export Process, and attach that.
Also, please don't create new threads if you already started one on this particular topic.
Hi Thomas, OK, I'm sending you my RMP!
What is your label? Did you use the Read CSV import wizard to load your data in?
Good morning, yes, I used the Read CSV import wizard.
Hi Marco, and thank you very much for your support. I have changed the process with your additions. I have another problem. I will attach another picture for you! Thank you!
Hi,
As the title of that message implies, it's a possible problem. Because this is inside a Loop Files operator and after a Read CSV operator, we just don't know what we will get. The assumption is an empty data set, therefore this warning is displayed. You can safely ignore it unless you have CSV files that are empty.
Regards,
Marco
Thank you, Marco! Are you Italian?