Use previous value to calculate next value
Hi,
I'm trying to perform a calculation that is very easy to do in Excel, but I can't find a way to do it in RapidMiner. I expected this to be a little tricky in RM, but I also thought it would be a common question on the forum. I'm still new to RM, so it's very possible that I have overlooked something.
Basically I need the previous value of an attribute to calculate the next.
Here's an example of a similar calculation in Excel. To calculate B2:
B2 = (B1*2 + A2) / 10
To calculate B3:
B3 = (B2*2 + A3) / 10
And so on. So I'm using the current value of A, and the previous value of B to get the value of B.
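For readers who think in code rather than spreadsheets, the Excel fill-down described above amounts to a running loop in which each new B value is built from the B value just computed. Here is a minimal Python sketch of that idea. The function name `fill_down`, the sample A values and the starting value for B1 are all made up for illustration; none of them come from the original post.

```python
def fill_down(a_values, b_start):
    """Mimic the Excel fill-down: B[0] = b_start, B[i] = (B[i-1] * 2 + A[i]) / 10."""
    b = [b_start]
    for a in a_values[1:]:
        # The previous B is the value we just appended, not a shifted copy of A.
        b.append((b[-1] * 2 + a) / 10)
    return b


a_column = [1.0, 4.0, 7.0]                   # hypothetical A1..A3
b_column = fill_down(a_column, b_start=3.0)  # hypothetical B1
print(b_column)
```

The point of the sketch is only that the rows have to be processed in order: B3 cannot be computed until B2 exists.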
OK. I tried the lag operator, but it can't deliver the previous value of 'itself'. Do I need to use a loop operator? If someone could help me with an example I'd be very happy.
Answers
Ok, I think I'm a little closer to the solution, but I'm still stuck.
It seems like it can be done with a combination of Extract Macro and Loop Examples.
However I can't find a way to set the example index to be the previous value.
That's not hard to do at all, you were on the right track with the Lag operator. See below:
That code is very clever, but I actually need to use the previous value of the Math-attribute.
So..
Say we have the a1 attribute and the Math-attribute from the example.
The math-attribute should use the current value of a1, and the previous value of itself.
Something like this.
Row no.  a1   Math
1        1.5
2        1.2  (prev. value of Math * 9 + 1.2) / 10
3        1.6  (prev. value of Math * 9 + 1.6) / 10
4        1.3  (prev. value of Math * 9 + 1.3) / 10
I can't do that with the lag operator, right?
Any clue, @Thomas_Ott? I'm unable to move forward with my grand plan.
Hey,
Lag as Tom proposed in combination with Generate Attributes should do the trick.
~Martin
Dortmund, Germany
I have tried this, but it doesn't solve my problem. The lag operator can give me the previous value of att1, but that's not what I'm looking for here. I need to create a new attribute that takes the current value of att1, and the previous value of itself to get the current value.
Again, in Excel, to calculate cell B2 I would use A2 and B1. Cell B3 requires the A3 and B2 values, and so on. Lagging either attribute doesn't do the trick here (or at least not without some additional logic).
Hi,
this is exactly what you can do with Lag + Generate Attributes. See attached process.
Best,
Martin
Dortmund, Germany
I haven't been able to express myself clearly, and I'm sorry about that. Normally your solution would work, but it doesn't here, because the new attribute needs its own previous value in the calculation.
First let's start with this data set.
Row Id Gain
1 5
2 0
3 3
4 1
5 0
6 2
..
Now. We need to generate a new attribute. Let's just label it calc1.
calc1 is calculated like this:
( (Previous Value of calc1 * 13) + Current value of 'Gain') / 14
Our new data set then looks like this (calculations explained on the right side, values rounded to three decimals, previous value of calc1 taken as 0 for the first row)
Row Id Gain Calc1
1 5 0,357 <- (0*13+5)/14
2 0 0,332 <- (0,357*13+0)/14
3 3 0,522 <- (0,332*13+3)/14
4 1 0,556 <- (0,522*13+1)/14
5 0 0,517 <- (0,556*13+0)/14
6 2 0,623 <- (0,517*13+2)/14
..
As you can see this is different because the new attribute we are creating really needs to access the previous value of itself, to create the next. Because of the calculation we are performing, lagging is not appropriate (whereas if you simply needed to add current and previous values together, lagging would be perfect).
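To make the calc1 rule concrete outside of RapidMiner, here is a small Python sketch of the same recurrence, using the Gain values from the example and a starting value of 0 for the "previous" calc1 on row 1. The function name `running_calc` and its parameters are made up for illustration. This is not a RapidMiner process, just a way to check the numbers by hand.

```python
from itertools import accumulate


def running_calc(gains, weight=13, start=0.0):
    """calc1[i] = (calc1[i-1] * weight + gains[i]) / (weight + 1), with calc1[-1] = start."""
    def step(prev, gain):
        return (prev * weight + gain) / (weight + 1)

    # accumulate feeds each result back in as `prev`; drop the seed value itself.
    return list(accumulate(gains, step, initial=start))[1:]


gains = [5, 0, 3, 1, 0, 2]        # Gain column from the example
calc1 = running_calc(gains)

for row_id, (gain, value) in enumerate(zip(gains, calc1), start=1):
    print(row_id, gain, round(value, 3))
```

Because every row feeds on the result of the row before it, the column has to be built sequentially. A single shifted copy of Gain (which is what lagging produces) does not carry that running state.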
Hey,
have you found a solution to this?
Hi @amund,
I think I managed to do what you want:
The values depend on the first value (row 1) of att2; the others are actually replaced. I hope it helps!
Best regards,
Sebastian