"SVM model results : display bug in charts ?"

lionelderkrikor · December 2017

Hi,

I'm experimenting with RapidMiner and it seems I have found a bug:

I created a simple model using the "SVM" operator.

I ran the process and went to the results window -> "Kernel Model (SVM)" -> Charts.

Then I chose chart style = "Scatter" (other chart styles may also be affected by this bug). It's impossible to display x1 (my first attribute) on the x-axis and x2 (my second attribute) on the y-axis, or vice versa.

Here is a screenshot of the Charts window:

The other quantities (counter, label, function value, etc.) are displayed correctly.

My training dataset (04_Class_4.6_SVM_simple_example.csv) and my scoring dataset (score_test_SVM.csv) are attached.
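For anyone who wants to cross-check the x1/x2 scatter outside the Charts view, here is a minimal sketch in plain Python. It assumes the files are whitespace-separated (as the `\s+` column separator in the process suggests) and start with a header row naming the columns `x1`, `x2` and `class`; the header layout is an assumption. The helper names `load_points` and `xy_pairs` are made up for illustration, and the sample rows are placeholders, not the contents of the attached files.

```python
import re

def load_points(text):
    """Parse whitespace-separated rows (first line = header) into a list of dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s+", lines[0])
    rows = []
    for ln in lines[1:]:
        values = re.split(r"\s+", ln)
        rows.append(dict(zip(header, values)))
    return rows

def xy_pairs(rows, x_name="x1", y_name="x2"):
    """Return (x, y) float pairs for the two chosen attributes."""
    return [(float(r[x_name]), float(r[y_name])) for r in rows]

# Placeholder rows for illustration only, NOT the attached dataset.
SAMPLE = """x1 x2 class
1.0 2.0 A
3.5 0.5 B
"""

rows = load_points(SAMPLE)
pairs = xy_pairs(rows)                               # x1 on the x-axis, x2 on the y-axis
swapped = xy_pairs(rows, x_name="x2", y_name="x1")   # the axes exchanged
```

The resulting pairs can be handed to any plotting library to get the scatter that the Charts view does not show.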

You can find my process here:

<?xml version="1.0" encoding="UTF-8"?><process version="7.6.002">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.6.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="read_csv" compatibility="7.6.002" expanded="true" height="68" name="Read_TrainSet" width="90" x="45" y="85">
        <parameter key="csv_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Predictive_Analytics_and_Data_Mining\Dec 15 2014\04_Class_4.6_SVM_simple_example.csv"/>
        <parameter key="column_separators" value="\s+"/>
        <list key="annotations"/>
        <list key="data_set_meta_data_information"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.6.002" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
        <parameter key="attribute_name" value="class"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="support_vector_machine" compatibility="7.6.002" expanded="true" height="124" name="SVM" width="90" x="313" y="34">
        <parameter key="kernel_type" value="polynomial"/>
        <parameter key="kernel_degree" value="1.0"/>
        <parameter key="C" value="1.0"/>
        <parameter key="convergence_epsilon" value="1.0E-5"/>
        <parameter key="max_iterations" value="10000000"/>
        <parameter key="scale" value="false"/>
      </operator>
      <operator activated="false" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="Read_TrainSet (2)" width="90" x="45" y="340">
        <parameter key="script" value="import pandas as pd&#10;&#10;# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main():&#10;&#10;    # raw strings keep the backslashes literal (a plain '\U' is a syntax error in Python 3)&#10;    path = r'C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Predictive_Analytics_and_Data_Mining\Dec 15 2014'&#10;    data = pd.read_csv(path + '/04_Class_4.6_SVM_simple_example.csv',sep =r'\s+')&#10;&#10;    # connect 2 output ports to see the results&#10;    return data"/>
      </operator>
      <operator activated="false" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="103" name="Build SVM Python" width="90" x="179" y="340">
        <parameter key="script" value="import pandas as pd&#10;import numpy as np&#10;from sklearn.svm import SVC&#10;from sklearn.calibration import CalibratedClassifierCV&#10;&#10;# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main(train):&#10;&#10;    X = train.iloc[:,0:2]&#10;    y = train.iloc[:,2]&#10;    x1 = train.iloc[:,0]&#10;    x2 = train.iloc[:,1]&#10;&#10;    model = SVC(kernel = 'linear', probability = True,degree = 1,tol = 1e-5,random_state = 1992 )&#10;    #model_calibre = CalibratedClassifierCV(model)&#10;    model_calibre = CalibratedClassifierCV(model,method = 'isotonic')&#10;    model.fit(X,y)&#10;    model_calibre.fit(X,y)&#10;    &#10;    [[w1,w2]] = model.coef_&#10;    [w0] = model.intercept_&#10;&#10;    support = model.support_&#10;    [dual_coef] = model.dual_coef_&#10;    decfunction = model.decision_function(X)&#10;&#10;    support = pd.DataFrame(data =support,columns = ['support']) &#10;    alpha= pd.DataFrame(data = dual_coef,columns = ['alpha'])&#10;    abs_alpha  = pd.DataFrame(data = np.absolute(dual_coef),columns = ['abs(alpha)'])&#10;    alpha = alpha.join(abs_alpha,how = 'left')&#10;    alpha = alpha.join(support,how = 'left')&#10;    alpha = alpha.set_index('support')&#10;&#10;    dec_func = pd.DataFrame(data = decfunction,columns = ['decision function'])&#10;    dec_func = dec_func.join(y)&#10;    dec_func = dec_func.join([x1,x2],how = 'outer')&#10;    &#10;    dec_func =pd.concat([dec_func,alpha], axis = 1)&#10;    &#10;    weight = pd.DataFrame(data = [[w0,w1,w2]],columns = ['w0','w1','w2']) &#10;    weight = pd.concat([weight,dec_func])&#10;      &#10;    #weight.rm_metadata['w0']=(None,'w0')&#10;    #weight.rm_metadata['w1']=(None,'w1')&#10;    #weight.rm_metadata['w2']=(None,'w2')&#10;    #weight.rm_metadata['decision function']=(None,'decision function')&#10;    #weight.rm_metadata['label']=(None,'label')&#10;    &#10;&#10;    # connect 2 
output ports to see the results&#10;    return weight,model,model_calibre"/>
      </operator>
      <operator activated="true" class="read_csv" compatibility="7.6.002" expanded="true" height="68" name="Read_ScoreSet" width="90" x="313" y="187">
        <parameter key="csv_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Predictive_Analytics_and_Data_Mining\Dec 15 2014\score_test_SVM.csv"/>
        <parameter key="column_separators" value="\s+"/>
        <list key="annotations"/>
        <list key="data_set_meta_data_information"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.6.002" expanded="true" height="82" name="Apply Model" width="90" x="447" y="136">
        <list key="application_parameters"/>
      </operator>
      <operator activated="false" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="Read_ScoreSet (2)" width="90" x="179" y="493">
        <parameter key="script" value="import pandas as pd&#10;&#10;# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main():&#10;&#10;    # raw strings keep the backslashes literal (a plain '\U' is a syntax error in Python 3)&#10;    path = r'C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Predictive_Analytics_and_Data_Mining\Dec 15 2014'&#10;    data = pd.read_csv(path + '/score_test_SVM.csv',sep =r'\s+')&#10;&#10;    # connect 2 output ports to see the results&#10;    return data"/>
      </operator>
      <operator activated="false" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="124" name="Apply Model Python" width="90" x="447" y="391">
        <parameter key="script" value="import pandas as pd&#10;from sklearn.svm import SVC&#10;&#10;# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main(model,score, model_calibre):&#10;&#10;    X = score.iloc[:,0:2]&#10;   &#10;    pred = model.predict(X)&#10;    #conf = model.predict_proba(X)&#10;    conf = model_calibre.predict_proba(X)&#10;    dec_function = model.decision_function(X)&#10;&#10;    score['prediction (class)'] = pred&#10;    score['confidence(A)'] = conf[:,0]&#10;    score['confidence(B)'] = conf[:,1]&#10;    score['decision function'] = dec_function&#10;&#10;    score.rm_metadata['prediction (class)']=(None,'prediction (class)')&#10;    score.rm_metadata['confidence(A)']=(None,'confidence(A)')&#10;    score.rm_metadata['confidence(B)']=(None,'confidence(B)')&#10;    score.rm_metadata['decision function']=(None,'decision function')&#10;    &#10;    # connect 2 output ports to see the results&#10;    return score"/>
      </operator>
      <connect from_op="Read_TrainSet" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="SVM" to_port="training set"/>
      <connect from_op="SVM" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Read_TrainSet (2)" from_port="output 1" to_op="Build SVM Python" to_port="input 1"/>
      <connect from_op="Build SVM Python" from_port="output 1" to_op="Apply Model Python" to_port="input 1"/>
      <connect from_op="Build SVM Python" from_port="output 2" to_op="Apply Model Python" to_port="input 3"/>
      <connect from_op="Read_ScoreSet" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 2"/>
      <connect from_op="Apply Model" from_port="model" to_port="result 1"/>
      <connect from_op="Read_ScoreSet (2)" from_port="output 1" to_op="Apply Model Python" to_port="input 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
      <portSpacing port="sink_result 5" spacing="0"/>
    </process>
  </operator>
</process>
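As a side note on the deactivated "Build SVM Python" operator: it reads `w0` from `model.intercept_` and `w1`, `w2` from `model.coef_`. For a linear kernel, these three numbers fully describe the decision function, which is what `model.decision_function(X)` returns for each row. Here is a minimal sketch of that calculation. The weights are hypothetical placeholders, not values from my model, and whether the result matches the "function value" shown in the RapidMiner charts is an assumption I have not verified.

```python
def decision_value(w0, w1, w2, x1, x2):
    """Linear decision function f(x) = w1*x1 + w2*x2 + w0."""
    return w1 * x1 + w2 * x2 + w0

# Hypothetical weights for illustration only (not taken from the trained model).
W0, W1, W2 = -1.0, 0.5, 0.25

value_a = decision_value(W0, W1, W2, x1=2.0, x2=4.0)  # 1.0 + 1.0 - 1.0
value_b = decision_value(W0, W1, W2, x1=0.0, x2=0.0)  # only the intercept remains

# The sign of f(x) tells on which side of the separating line a point lies.
side_a = "positive" if value_a > 0 else "non-positive"
side_b = "positive" if value_b > 0 else "non-positive"
```

Points where the value is zero lie exactly on the separating line, so x1 against x2 is the natural pair of axes for looking at this model.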

Thank you for your explanations,

Regards,

Lionel

sgenzer · March 2018

bug in scatter plot function confirmed. Pushing to dev team.

SG

sgenzer · April 2018

fixed and scheduled for release.

Fixed and Released · Last Updated March 2019
