
R code works in the old extension and fails in the new one [Solved]

ammargh Member Posts: 27 Maven
edited June 2019 in Help
With the old R extension, the following code works perfectly in the old "Execute Scripts" component:

library(unbalanced)
res<-ubSmoteExs(data, "Diagnosis",1500)

However, with the new "Execute R" component, the same code returns an error (Invalid labels, length zero should be 1 or 2).

The code used in the "Execute R" component is:

rm_main = function(data)
{
  library(unbalanced)
  res <- ubSmoteExs(data, "Diagnosis", 1500)
  return(res)
}

Could you please help me?

Answers

  • David_A Administrator, Moderator, Employee-RapidMiner, RMResearcher, Member Posts: 297 RM Research
    Hi,

    Unfortunately, I cannot reproduce your error.
    I tried it with the iris data set and it works fine (see the example below).
    I'm not familiar with the unbalanced package or the format of your data, but I suspect that your result somehow cannot be converted into a data.frame.

    rm_main = function()
    {
      library("unbalanced")
      res <- ubSmoteExs(iris, "Species", 1500)
      return(res)
    }

    Best,
    David
  • ammargh Member Posts: 27 Maven
    Thank you very much for your efforts. However, your example loads iris from within R instead of passing the data in through the operator's input port.

    This is my process using the Sonar data. If you have the old R extension, you can enable the old R script operator and see that it works.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
     <context>
       <input>
       </input>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Sonar" width="90" x="112" y="120">
           <parameter key="repository_entry" value="//Samples/data/Sonar"/>
         </operator>
         <operator activated="true" class="x_validation" compatibility="6.4.000" expanded="true" height="112" name="Validation (5)" width="90" x="916" y="30">
           <parameter key="sampling_type" value="2"/>
           <parameter key="use_local_random_seed" value="true"/>
           <process expanded="true">
             <operator activated="true" class="subprocess" compatibility="6.4.000" expanded="true" height="76" name="Smote R (3)" width="90" x="112" y="120">
               <process expanded="true">
                 <operator activated="true" class="filter_examples" compatibility="6.4.000" expanded="true" height="94" name="Filter Examples (5)" width="90" x="246" y="165">
                   <list key="filters_list">
                     <parameter key="filters_entry_key" value="class.equals.Mine"/>
                   </list>
                 </operator>
                 <operator activated="true" class="r_scripting:execute_r" compatibility="6.4.000" expanded="true" height="76" name="Execute R (3)" width="90" x="581" y="120">
                   <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(data)&#10;{&#10;     library(unbalanced)&#10;     res&lt;-ubSmoteExs(data, &quot;class&quot;,1500)&#10;&#9;return(res)&#10;}&#10;"/>
                 </operator>
                 <operator activated="false" class="r:execute_script_r" compatibility="5.3.000" expanded="true" height="60" name="Execute Script (3)" width="90" x="581" y="30">
                   <parameter key="script" value="library(unbalanced)&#10;res&lt;-ubSmoteExs(data, &quot;class&quot;,1500)"/>
                   <enumeration key="inputs">
                     <parameter key="name_of_variable" value="data"/>
                   </enumeration>
                   <list key="results">
                     <parameter key="res" value="Data Table"/>
                   </list>
                 </operator>
                 <operator activated="true" class="set_role" compatibility="6.4.000" expanded="true" height="76" name="Set Role (4)" width="90" x="782" y="120">
                   <parameter key="attribute_name" value="class"/>
                   <parameter key="target_role" value="label"/>
                   <list key="set_additional_roles"/>
                 </operator>
                 <operator activated="true" class="append" compatibility="6.4.000" expanded="true" height="94" name="Append (3)" width="90" x="782" y="300"/>
                 <operator activated="true" class="shuffle" compatibility="6.4.000" expanded="true" height="76" name="Shuffle (3)" width="90" x="983" y="300"/>
                 <connect from_port="in 1" to_op="Filter Examples (5)" to_port="example set input"/>
                 <connect from_op="Filter Examples (5)" from_port="example set output" to_op="Execute R (3)" to_port="input 1"/>
                 <connect from_op="Filter Examples (5)" from_port="unmatched example set" to_op="Append (3)" to_port="example set 2"/>
                 <connect from_op="Execute R (3)" from_port="output 1" to_op="Set Role (4)" to_port="example set input"/>
                 <connect from_op="Set Role (4)" from_port="example set output" to_op="Append (3)" to_port="example set 1"/>
                 <connect from_op="Append (3)" from_port="merged set" to_op="Shuffle (3)" to_port="example set input"/>
                 <connect from_op="Shuffle (3)" from_port="example set output" to_port="out 1"/>
                 <portSpacing port="source_in 1" spacing="0"/>
                 <portSpacing port="source_in 2" spacing="0"/>
                 <portSpacing port="sink_out 1" spacing="0"/>
                 <portSpacing port="sink_out 2" spacing="0"/>
               </process>
             </operator>
             <operator activated="true" class="parallel_decision_tree" compatibility="6.4.000" expanded="true" height="76" name="Decision Tree (4)" width="90" x="380" y="120">
               <parameter key="maximal_depth" value="4"/>
               <parameter key="apply_pruning" value="false"/>
               <parameter key="apply_prepruning" value="false"/>
             </operator>
             <connect from_port="training" to_op="Smote R (3)" to_port="in 1"/>
             <connect from_op="Smote R (3)" from_port="out 1" to_op="Decision Tree (4)" to_port="training set"/>
             <connect from_op="Decision Tree (4)" from_port="model" to_port="model"/>
             <portSpacing port="source_training" spacing="0"/>
             <portSpacing port="sink_model" spacing="0"/>
             <portSpacing port="sink_through 1" spacing="0"/>
           </process>
           <process expanded="true">
             <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model (5)" width="90" x="45" y="30">
               <list key="application_parameters"/>
             </operator>
             <operator activated="true" class="performance_binominal_classification" compatibility="6.4.000" expanded="true" height="76" name="Performance (4)" width="90" x="246" y="30">
               <parameter key="AUC" value="true"/>
               <parameter key="sensitivity" value="true"/>
               <parameter key="specificity" value="true"/>
             </operator>
             <connect from_port="model" to_op="Apply Model (5)" to_port="model"/>
             <connect from_port="test set" to_op="Apply Model (5)" to_port="unlabelled data"/>
             <connect from_op="Apply Model (5)" from_port="labelled data" to_op="Performance (4)" to_port="labelled data"/>
             <connect from_op="Performance (4)" from_port="performance" to_port="averagable 1"/>
             <portSpacing port="source_model" spacing="0"/>
             <portSpacing port="source_test set" spacing="0"/>
             <portSpacing port="source_through 1" spacing="0"/>
             <portSpacing port="sink_averagable 1" spacing="0"/>
             <portSpacing port="sink_averagable 2" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Retrieve Sonar" from_port="output" to_op="Validation (5)" to_port="training"/>
         <connect from_op="Validation (5)" from_port="model" to_port="result 1"/>
         <connect from_op="Validation (5)" from_port="training" to_port="result 2"/>
         <connect from_op="Validation (5)" from_port="averagable 1" to_port="result 3"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
         <portSpacing port="sink_result 3" spacing="0"/>
         <portSpacing port="sink_result 4" spacing="0"/>
       </process>
     </operator>
    </process>

  • David_A Administrator, Moderator, Employee-RapidMiner, RMResearcher, Member Posts: 297 RM Research
    Hi,

    Unfortunately, I still cannot reproduce your error.
    On my system, your example process runs fine with the new R extension and the Sonar data set.

    Best,
    David
  • ammargh Member Posts: 27 Maven
    Thank you very much.

    Do you think it has something to do with the R version used?

    I am using R 3.2.1.

    By the way, I am having the same problem on both Windows and Mac machines.
  • David_A Administrator, Moderator, Employee-RapidMiner, RMResearcher, Member Posts: 297 RM Research
    Hi,

    I also use 3.2.1, so the R version should not be the cause.
    Your error message indicates that the labels and levels of the result data set somehow do not match.
    So my guess is that the result of the ubSmoteExs() function cannot be correctly converted into a data.frame.
    A solution might be to call factor() on your label to remove unused levels. See http://stackoverflow.com/questions/6506239/r-warning-mistake-in-factor for an explanation.

    Regards,
    David
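
The factor() suggestion above can be illustrated with a minimal sketch. The vector `label` and its values are made up for illustration and are not taken from the poster's data; the point is only that calling factor() on an existing factor rebuilds it with the levels that actually occur.

```r
# A factor that still carries a level ("c") that no longer occurs in the data.
# The values here are hypothetical, chosen only to show the behaviour.
label <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
print(levels(label))

# Re-applying factor() rebuilds the factor from the values that are present,
# so the unused level is dropped.
cleaned <- factor(label)
print(levels(cleaned))
```

Base R also provides droplevels(), which does the same for a single factor or for every factor column of a data.frame.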
  • ammargh Member Posts: 27 Maven
    Thank you very much.

    I don't think this is the reason, because the same code works perfectly with the old R extension, and you mentioned that the Sonar process I sent runs on your machine. So I think there is a different cause for this error.

    It might be my R configuration. I will check it.

    Thanks again
  • ammargh Member Posts: 27 Maven
    I found that in the old extension the label attribute arrives in R with class factor, while in the new one it arrives with class character.

    Converting it to a factor solved the problem.


    Thank you
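
The fix described in the last post might look roughly like the sketch below. The helper `ensure_factor_label` and the `demo` data frame are hypothetical names introduced here for illustration; the thread does not show the exact code the poster used. The `rm_main` definition reuses the ubSmoteExs() call from the posted process and assumes the unbalanced package is installed, so it is only defined here, not run.

```r
# Hypothetical helper: if the label column is a character vector,
# convert it to a factor; otherwise leave the data unchanged.
ensure_factor_label <- function(data, label_name) {
  if (is.character(data[[label_name]])) {
    data[[label_name]] <- factor(data[[label_name]])
  }
  data
}

# Small stand-in for the data the operator receives (values are made up).
demo <- data.frame(
  attribute_1 = c(0.1, 0.4, 0.3, 0.9),
  class = c("Rock", "Mine", "Rock", "Mine"),
  stringsAsFactors = FALSE
)
print(class(demo$class))
fixed <- ensure_factor_label(demo, "class")
print(class(fixed$class))

# Sketch of the Execute R script with the conversion added before the
# ubSmoteExs() call from the original process (requires the unbalanced package).
rm_main = function(data)
{
  library(unbalanced)
  data <- ensure_factor_label(data, "class")
  res <- ubSmoteExs(data, "class", 1500)
  return(res)
}
```

Checking class() of the label column at the top of rm_main is a quick way to confirm whether the column reaches R as character or as factor in a given setup.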