piping in Execute R

MahdiP · July 2016

Hello everybody,

I am trying to use the dplyr package for data manipulation in the Execute R operator in RapidMiner. Almost everything works fine except piping!

The chunk of code below does not produce an output dataset; instead I get "Memory buffered file" as the output message. I checked the log file and everything seems to be working flawlessly. The code reads:

rm_main = function(data)
{
  library(datasets)
  library(dplyr)

  NewData <- (
    data %>%
      mutate(Product2 = Tot_Product * 2) %>%
      filter(Life_Phase != "Family") %>%
      group_by(City, P_PLUS, Life_Phase) %>%
      summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
      select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
  )
  return(NewData)
}

This code actually works in RStudio.

I appreciate your help.

Mahdi

Thomas_Ott · July 2016

To build on what @mschmitz said, the Execute R operator lets you pass R objects to another Execute R operator, but to output the results into RapidMiner you first have to convert them to a data frame inside your function.
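As a rough sketch of that conversion (not necessarily the only way to do it), one option is to end the pipeline with ungroup() and as.data.frame(). The toy input below is made up for illustration; only the column names come from the question. It also assumes the extra tibble classes that dplyr attaches are what the operator stumbles over.

```r
library(dplyr)

# Hypothetical toy input: column names mirror the question, values are invented.
data <- data.frame(
  City = c("A", "A", "B", "B"),
  P_PLUS = c("yes", "yes", "no", "no"),
  Life_Phase = c("Single", "Single", "Couple", "Family"),
  TOT_prem = c(100, 200, 300, 400),
  stringsAsFactors = FALSE
)

rm_main = function(data)
{
  result <- data %>%
    filter(Life_Phase != "Family") %>%
    group_by(City, P_PLUS, Life_Phase) %>%
    summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE)) %>%
    ungroup() %>%        # drop the grouping that summarise() leaves behind
    as.data.frame()      # strip the tibble classes, leaving a plain data.frame
  return(result)
}

out <- rm_main(data)
class(out)  # a plain "data.frame"
```

Calling ungroup() first is a cautious choice: it makes the intent explicit and avoids relying on how as.data.frame() treats grouped tibbles.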

MahdiP · July 2016

Thank you so much for such a quick reply!

It sounds like the group_by step is what breaks the creation of the data frame. It can be solved simply by wrapping the final result in data.frame() before returning it to RapidMiner, like this:

rm_main = function(data)
{
  library(datasets)
  library(dplyr)
  NewData <-
    data.frame(
      data %>%
        mutate(Product2 = Tot_Product * 2) %>%
        filter(Life_Phase != "Family") %>%
        group_by(City, P_PLUS, Life_Phase) %>%
        summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
        select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
    )
  return(NewData)
}

Now it outputs a dataset to RapidMiner.

Thank you so much again.

Mahdi

MartinLiebig · July 2016

Dear Mahdi,

RapidMiner cannot interpret every type of object you can create in R. If you send back something it cannot interpret, it arrives in RM as a memory buffered file. This cannot be used in RM directly, but it can be piped into another R operator and used there. This is very useful, e.g. for modelling.

In your case, NewData does not seem to be a plain R data frame.
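One way to check this yourself in RStudio is to look at class() of the piped result. The sketch below uses a made-up toy table (only the column names echo the question) and simply compares the class vector before and after as.data.frame().

```r
library(dplyr)

# Hypothetical toy data, invented for illustration.
toy <- data.frame(
  City = c("A", "A", "B"),
  Life_Phase = c("x", "y", "x"),
  TOT_prem = c(10, 20, 30),
  stringsAsFactors = FALSE
)

piped <- toy %>%
  group_by(City, Life_Phase) %>%
  summarise(AVG_Prem = mean(TOT_prem))

# The piped result carries tibble classes in addition to "data.frame".
class(piped)
is_plain <- identical(class(piped), "data.frame")

# Converting leaves only the base class.
plain <- as.data.frame(piped)
class(plain)
```

If is_plain comes out FALSE for your own result, that is a hint the object is a tibble rather than a bare data frame.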

~Martin
