Data Preprocessing Ideas
I am working with a relatively clean dataset: it has no missing values, and most of the attributes are numeric, with one being a date-time stamp recorded every 30 minutes. I need to apply some preprocessing techniques to it and have the ideas below, but I am also looking for other suggestions. Thanks.
- Rename some of the numeric attributes so they are easier to identify
- Set roles
Ultimately, I want to predict the temperature using regression models and the date-time stamp. The models will be trained and then tested.
Answers
Perhaps try windowing the time data, or adding a column that shows whether each numeric value is higher or lower than the value 30 minutes earlier?
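To make the two ideas concrete, here is a minimal sketch in plain Python. It is not the RapidMiner Windowing operator, only an illustration of the general idea: each row holds the last few readings as inputs and the next reading as the value to predict. The function names (`make_windows`, `direction_flags`), the window size of 3, and the sample temperatures are all made up for the example.

```python
from datetime import datetime, timedelta


def make_windows(values, window_size, horizon=1):
    """Turn a series into (window, target) rows for a regression learner.

    Each window holds `window_size` consecutive readings; the target is the
    reading `horizon` steps after the end of the window.
    """
    rows = []
    for end in range(window_size, len(values) - horizon + 1):
        window = values[end - window_size:end]
        target = values[end + horizon - 1]
        rows.append((window, target))
    return rows


def direction_flags(values):
    """Label each reading relative to the reading one step (30 minutes) earlier."""
    flags = [None]  # the first reading has no predecessor to compare with
    for prev, curr in zip(values, values[1:]):
        if curr > prev:
            flags.append("higher")
        elif curr < prev:
            flags.append("lower")
        else:
            flags.append("same")
    return flags


# Hypothetical sample data: six temperature readings, 30 minutes apart.
start = datetime(2024, 1, 1, 0, 0)
temps = [10.0, 10.5, 10.2, 10.2, 11.0, 11.4]
stamps = [start + timedelta(minutes=30 * i) for i in range(len(temps))]

windows = make_windows(temps, window_size=3)
flags = direction_flags(temps)

for stamp, temp, flag in zip(stamps, temps, flags):
    print(stamp.isoformat(), temp, flag)
for window, target in windows:
    print(window, "->", target)
```

In this form the temperature is the value being predicted, and the earlier readings in each window are the inputs; the date-time stamp only fixes the order of the rows.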
Hi Edward,
Thanks for the feedback. I am pretty new to RM. Can you explain a little more on how windowing works? Does the time-date attribute need to have the role of label? Thanks.
Hi Sammie,
First of all, you need the Time Series extension. You can find it in the Marketplace (in the menu Extensions -> Marketplace). Try experimenting with the operators and their tutorials.
I think that your question is more about Feature Generation than about RapidMiner. You will probably need to consult some Time Series literature. I can recommend the following:
Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer (2006)
Kind regards,
Sebastian