
piping in Execute R

MahdiP Member Posts: 9 Contributor II
edited November 2018 in Help

Hello everybody,

I am trying to use the dplyr package for data manipulation in the Execute R operator in RapidMiner. Almost everything works fine except for piping!

The chunk of code below does not produce an output dataset; instead I get "Memory buffered file" as the output message. I checked the log file and everything seems to run without errors. The code reads:

 

rm_main = function(data)
{
  library(datasets)
  library(dplyr)

  NewData <- (
    data %>%
      mutate(Product2 = Tot_Product * 2) %>%
      filter(Life_Phase != "Family") %>%
      group_by(City, P_PLUS, Life_Phase) %>%
      summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
      select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
  )
  return(NewData)
}

This code actually works in RStudio.

I appreciate your help.

Mahdi


Best Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    To build on what @mschmitz said, the Execute R operator allows you to pass R objects to another Execute R operator, but to output the results into RapidMiner you first have to convert them to a data frame inside your function.

  • MahdiP Member Posts: 9 Contributor II
    Solution Accepted

    Thank you so much for such a quick reply!

    It seems that the group_by step is where the result stops being a plain data frame. This can be solved simply by wrapping the final result in data.frame() before returning it to RapidMiner, like this:


    rm_main = function(data)
    {
      library(datasets)
      library(dplyr)
      # Wrap the whole pipeline in data.frame() so a plain data frame is returned
      NewData <-
        data.frame(
          data %>%
            mutate(Product2 = Tot_Product * 2) %>%
            filter(Life_Phase != "Family") %>%
            group_by(City, P_PLUS, Life_Phase) %>%
            summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
            select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
        )
      return(NewData)
    }

     

    Now it outputs a dataset to RapidMiner.
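For readers who prefer to keep everything inside the pipe, the same fix can be sketched by ending the chain with ungroup() and as.data.frame() instead of wrapping it in data.frame(). This is only a sketch: the sample_data values are made up so the function can be tried outside RapidMiner, and it has not been verified inside the Execute R operator.

```r
library(dplyr)

rm_main <- function(data) {
  data %>%
    mutate(Product2 = Tot_Product * 2) %>%
    filter(Life_Phase != "Family") %>%
    group_by(City, P_PLUS, Life_Phase) %>%
    summarise(
      AVG_Prem = mean(TOT_prem, na.rm = TRUE),
      AVG_Test = mean(Product2, na.rm = TRUE)
    ) %>%
    ungroup() %>%      # drop whatever grouping summarise() left behind
    as.data.frame()    # strip the dplyr/tibble classes
}

# Hypothetical data (not from the thread), only for a local try-out
sample_data <- data.frame(
  City        = c("A", "A", "B", "B"),
  P_PLUS      = c("yes", "no", "yes", "yes"),
  Life_Phase  = c("Single", "Family", "Single", "Couple"),
  Tot_Product = c(1, 2, 3, 4),
  TOT_prem    = c(100, 200, NA, 400),
  stringsAsFactors = FALSE
)

result <- rm_main(sample_data)
str(result)
```

The explicit ungroup() is not strictly needed for the conversion, but it makes it obvious that no grouping information is meant to leave the function.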

    Thank you so much again.

    Mahdi

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist

    Dear Mahdi,

     

    RapidMiner cannot interpret every type of object that R can produce. If you send back something it cannot interpret, it arrives in RapidMiner as a memory buffered file. That file cannot be used in RapidMiner directly, but it can be passed into another Execute R operator and used there. This is very useful, e.g., for modelling.

     

    In your case, NewData does not seem to be an R data frame.
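One way to check this outside RapidMiner is to look at class() of the pipeline result. The sketch below uses made-up sample data (sample_data and its values are assumptions; only the column names come from the question) and shows that a group_by() plus summarise() chain returns an object carrying extra dplyr classes, which as.data.frame() strips. How exactly the Execute R operator decides what counts as a data frame is not shown here.

```r
library(dplyr)

# Hypothetical sample data; only the column names follow the question
sample_data <- data.frame(
  City        = c("A", "A", "B", "B"),
  P_PLUS      = c("yes", "no", "yes", "yes"),
  Life_Phase  = c("Single", "Family", "Single", "Couple"),
  Tot_Product = c(1, 2, 3, 4),
  TOT_prem    = c(100, 200, NA, 400),
  stringsAsFactors = FALSE
)

grouped_result <- sample_data %>%
  filter(Life_Phase != "Family") %>%
  group_by(City, P_PLUS, Life_Phase) %>%
  summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE))

# The pipeline result is a grouped tibble, not a bare data.frame
print(class(grouped_result))

# Converting it leaves a single class, "data.frame"
plain_result <- as.data.frame(grouped_result)
print(class(plain_result))
```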

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany