Finding Peak Times in a timeseries dataset

pix123 · April 2018

Hi there,

I am working with a data series that has a date-time stamp in one column. I am looking for a way to identify the peak times over the duration of the collected date-time stamps. Is there a way to handle this in RapidMiner? If further details are needed, please let me know. Thanks.

Telcontar120 · April 2018

How do you define "peak" for this purpose? Finding a single maximum in a series is easily done using a number of different operators. But finding "peaks" might imply some kind of underlying periodic function or a variable definition of what exactly constitutes a peak. That kind of analysis is a bit trickier. You might want to check out the Series extension from the marketplace and look at some of the operators in there.

pix123 · April 2018

Thanks for the quick reply. By peak I mean the time of a given day at which usage is highest. I am trying to determine at what times of day usage is highest; the data has been recorded at 30-minute intervals over a 140-day period. I hope this clarifies things. Is there a particular operator in the time series extension package you would recommend?

Telcontar120 · April 2018

It sounds like you have many separate days' worth of data, so if you are looking for patterns, you can simply aggregate by time of day (with 30-minute intervals you should have 48 data points per day) and then calculate the average and variance of each one. This will give you a sense of which times are more likely to be higher than others. You can also get the minimum and maximum for each time of day to see how that compares to the average.
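For anyone who wants to sanity-check the idea outside RapidMiner, here is a minimal Python sketch of the same aggregation (standard library only). It is not a RapidMiner process; the function name `profile_by_time_of_day` and the sample data are made up for illustration, and it assumes the data is available as (timestamp, value) pairs.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pvariance


def profile_by_time_of_day(records):
    """Group (timestamp, value) pairs by clock time and summarise each slot."""
    slots = defaultdict(list)
    for ts, value in records:
        slots[ts.time()].append(value)
    return {
        slot: {
            "mean": mean(values),
            "variance": pvariance(values),  # population variance of the slot
            "min": min(values),
            "max": max(values),
        }
        for slot, values in sorted(slots.items())
    }


# Hypothetical sample: 3 days of 30-minute readings with a bump at 18:00.
start = datetime(2018, 4, 1)
records = []
for step in range(3 * 48):
    ts = start + timedelta(minutes=30 * step)
    bump = 5 if (ts.hour == 18 and ts.minute == 0) else 0
    usage = 10 + bump + step // 48  # baseline drifts up by 1 each day
    records.append((ts, usage))

profile = profile_by_time_of_day(records)

# The slot with the highest average is the typical peak time of day.
busiest = max(profile, key=lambda slot: profile[slot]["mean"])
print(busiest, profile[busiest])
```

Sorting the slots by mean instead of taking only the maximum would give a ranked list of busy periods, and the variance tells you how consistent each slot is from day to day.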

However, if you are looking to identify the specific time slot on each individual day that corresponds to the maximum value for that day, the process is going to be more complex. You'll have to aggregate by day to calculate each day's maximum, and then identify which particular time slot matches that value.
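The per-day version can be sketched in Python as well (again, plain standard library, not RapidMiner operators). The helper `daily_peak_slots` and the sample readings are hypothetical, and the sketch assumes the records arrive in chronological order so that ties go to the earliest slot of the day.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_peak_slots(records):
    """Return {date: (time_of_peak, peak_value)} for (timestamp, value) pairs."""
    peaks = {}
    for ts, value in records:
        day = ts.date()
        # Replace the stored slot only when a strictly higher value appears,
        # so the first slot to reach the day's maximum is the one kept.
        if day not in peaks or value > peaks[day][1]:
            peaks[day] = (ts.time(), value)
    return peaks


# Hypothetical sample: 2 days of 30-minute readings,
# peaking at 09:00 on day one and 17:30 on day two.
start = datetime(2018, 4, 1)
peak_step = {0: 18, 1: 35}  # slot index within each day that gets the high value
records = []
for step in range(2 * 48):
    day_index, slot_index = divmod(step, 48)
    usage = 20 if slot_index == peak_step[day_index] else 5
    records.append((start + timedelta(minutes=30 * step), usage))

peaks = daily_peak_slots(records)
for day, (slot, value) in peaks.items():
    print(day, slot, value)

# Count how often each time slot was the daily peak across all days.
peak_counts = Counter(slot for slot, _ in peaks.values())
```

With 140 days of data, `peak_counts.most_common()` would show which time slots are most often the daily peak.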

Neither of these processes would require the series extension, by the way. That's more useful if you are trying to do things like calculate moving averages, do smoothing of series data, or any time series forecasting such as ARIMA.

sgenzer · April 2018

You can also use "Generate Attributes" to create a new attribute that "gets" the hour of the timestamp. Then you can cleanly aggregate, etc.
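As a rough illustration of the same idea in Python rather than in RapidMiner's expression language: derive an hour attribute from each timestamp, then aggregate on it. The function `mean_by_hour`, the timestamp format and the rows below are all assumptions for the sake of the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_by_hour(rows, fmt="%Y-%m-%d %H:%M"):
    """Average the values per hour of day for (timestamp_string, value) rows."""
    by_hour = defaultdict(list)
    for stamp, value in rows:
        hour = datetime.strptime(stamp, fmt).hour  # the derived "hour" attribute
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


# Hypothetical rows; the real data would have 48 readings per day.
rows = [
    ("2018-04-01 08:00", 4), ("2018-04-01 08:30", 6),
    ("2018-04-01 18:00", 12), ("2018-04-01 18:30", 14),
    ("2018-04-02 08:00", 5), ("2018-04-02 18:30", 10),
]

hourly = mean_by_hour(rows)
peak_hour = max(hourly, key=hourly.get)
print(hourly, peak_hour)
```

Grouping by hour merges the two 30-minute slots in each hour, which gives 24 groups instead of 48 and a slightly smoother picture.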

[Attached screenshot: Screen Shot 2018-04-24 at 1.49.34 PM.png]

Scott
