Grouping profit and loss by time of day, putting in difference in valuation from period to period
Hi guys, need some help here.
I have data that is organised at 1 second intervals, for each second it keeps track of the user and the product that was traded and the value of that position.
i have multiple users and each user can trade various products.
how do i,
1. classify users into usergroups?
2. obtain the change in each user-product valuation
2. group that resultant valuation into customisable blocks of time?
sample data set:
timestamp|user|product|valuation
0:01|frank|seashells|$1
0:01|frank2|cockels|-$4
0:01|melissa|cockels|-$5
0:02|frank|seashells|$1
0:02|frank2|cockels|-$4
0:02|melissa|cockels|-$6
0:03|frank|seashells|$4
0:03|frank2|cockels|-$3
0:03|melissa|cockels|-$4
0:04|frank|seashells|$7
0:04|frank2|cockels|-$1
0:04|melissa|cockels|-$2
frank and frank2 would be grouped together as frankie
the first 2 seconds is 1 block.
the next 2 seconds is another block.
so i can see... in the first 2 seconds. frankie made 0 , melissa made -1
in the next 2 second block. frank seashells made 6, frank cockels made 3, melissa 4.
and also group the product together. so seashells/cockels would be grouped as seafood.
so seafood made -$1 in first 2 seconds.
and seafood made $10 in next 2 seconds.
thanks guys!
Answers
Hi blancharde,
i think the way to go is first to use a map or replace operator to unify the name and afterwards one aggregate with sum of value and group_by product timestamp and user.
~Martin
Dortmund, Germany
Hi Martin, thanks for your help.
What about finding the difference from one timestamp to another, between product timestamp and user,
can I create a new column called value_difference ?