The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
First day of each week
Hello,
I want to do a time series forecast in data that I wish to group by week and I use PIVOT to do that. The problem is that then I lose the date type and I don't want it. Before Rapidminer I use to use Excel to transform year and week numbers in date (as first day of each week). By doing this I get a monotonic time series.
In Excel I used to calculate the first day of each week like below, but I want to perform all data prep in RM and honestly I'm getting nowhere.
=DATE([@YEAR],1,-2)-WEEKDAY(DATE([@YEAR],1,3))+[@WEEK]*7
I bet RM can do this easily with generate attributes operator, but I'm totally stucked.
Thanks for your help,
Pedro
I want to do a time series forecast in data that I wish to group by week and I use PIVOT to do that. The problem is that then I lose the date type and I don't want it. Before Rapidminer I use to use Excel to transform year and week numbers in date (as first day of each week). By doing this I get a monotonic time series.
In Excel I used to calculate the first day of each week like below, but I want to perform all data prep in RM and honestly I'm getting nowhere.
=DATE([@YEAR],1,-2)-WEEKDAY(DATE([@YEAR],1,3))+[@WEEK]*7
I bet RM can do this easily with generate attributes operator, but I'm totally stucked.
Thanks for your help,
Pedro
Tagged:
0
Best Answers
-
yyhuang Administrator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data ScientistBTW, I am not Excel expert so I did not understand the excel formula...
If you need just a monotonic time series, try the handy operator "create example set" for date series.<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000"> <context> <input/> <output/> <macros/> </context> <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process"> <parameter key="logverbosity" value="init"/> <parameter key="random_seed" value="2001"/> <parameter key="send_mail" value="never"/> <parameter key="notification_email" value=""/> <parameter key="process_duration_for_mail" value="30"/> <parameter key="encoding" value="SYSTEM"/> <process expanded="true"> <operator activated="true" class="utility:create_exampleset" compatibility="9.2.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="715" y="34"> <parameter key="generator_type" value="date series"/> <parameter key="number_of_examples" value="100"/> <parameter key="use_stepsize" value="true"/> <list key="function_descriptions"/> <parameter key="add_id_attribute" value="false"/> <list key="numeric_series_configuration"/> <list key="date_series_configuration"> <parameter key="value" value="2009-01-01 00:00:00."/> </list> <list key="date_series_configuration (interval)"> <parameter key="time" value="2009-01-01 00:00:00.1.day"/> </list> <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/> <parameter key="time_zone" value="SYSTEM"/> <parameter key="column_separator" value=","/> <parameter key="parse_all_as_nominal" value="false"/> <parameter key="decimal_point_character" value="."/> <parameter key="trim_attribute_names" value="true"/> </operator> <connect from_op="Create ExampleSet" from_port="output" to_port="result 1"/> <portSpacing port="source_input 1" spacing="0"/> <portSpacing port="sink_result 1" spacing="0"/> <portSpacing port="sink_result 2" spacing="0"/> </process> </operator> </process>
6 -
IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM FounderHi @OprickOk, now I got it :-)This can be done by first combining the two numerical columns into one nominal column. I am using the operator Generate Attributes for this so that I can also make sure that there is a leading zero for single-digit week counts. Then, I transform this new column into a date column with the operator Nominal to Date and the format yyyy/ww. If you set the locale parameter of this operator to the desired zone, you will get the correct first day of the week as date right away.If you don't do that (or use the wrong locale for other reasons), you can add another Generate Attributes to adjust the date by adding one day to change it for example from US standard to European standard. I left this operator in there for demonstration purposes.Process is below. Hope this helps,Ingo
<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000"><br> <context><br> <input/><br> <output/><br> <macros/><br> </context><br> <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process"><br> <parameter key="logverbosity" value="init"/><br> <parameter key="random_seed" value="2001"/><br> <parameter key="send_mail" value="never"/><br> <parameter key="notification_email" value=""/><br> <parameter key="process_duration_for_mail" value="30"/><br> <parameter key="encoding" value="UTF-8"/><br> <process expanded="true"><br> <operator activated="true" class="utility:create_exampleset" compatibility="9.2.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="34"><br> <parameter key="generator_type" value="comma separated text"/><br> <parameter key="number_of_examples" value="100"/><br> <parameter key="use_stepsize" value="false"/><br> <list key="function_descriptions"/><br> <parameter key="add_id_attribute" value="false"/><br> <list key="numeric_series_configuration"/><br> <list key="date_series_configuration"/><br> <list key="date_series_configuration (interval)"/><br> <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/><br> <parameter key="time_zone" value="SYSTEM"/><br> <parameter key="input_csv_text" value="Year, Week 2019, 10 2019, 9 2019, 8 2019, 8"/><br> <parameter key="column_separator" value=","/><br> <parameter key="parse_all_as_nominal" value="false"/><br> <parameter key="decimal_point_character" value="."/><br> <parameter key="trim_attribute_names" value="true"/><br> </operator><br> <operator activated="true" class="generate_attributes" compatibility="9.2.000" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34"><br> <list key="function_descriptions"><br> <parameter key="Year_Week" value="if ([Week] < 10, [Year] + "/0" + [Week], [Year] + "/" + [Week])"/><br> </list><br> <parameter key="keep_all" value="true"/><br> </operator><br> <operator activated="true" class="nominal_to_date" compatibility="9.2.000" expanded="true" height="82" name="Nominal to Date" width="90" x="313" y="34"><br> <parameter key="attribute_name" value="Year_Week"/><br> <parameter key="date_type" value="date"/><br> <parameter key="date_format" value="yyyy/ww"/><br> <parameter key="time_zone" value="SYSTEM"/><br> <parameter key="locale" value="English (United States)"/><br> <parameter key="keep_old_attribute" value="true"/><br> </operator><br> <operator activated="true" class="generate_attributes" compatibility="9.2.000" expanded="true" height="82" name="Generate Attributes (2)" width="90" x="447" y="34"><br> <list key="function_descriptions"><br> <parameter key="Adjusted_Year_Week" value="date_add([Year_Week],1,DATE_UNIT_DAY)"/><br> </list><br> <parameter key="keep_all" value="true"/><br> </operator><br> <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Attributes" to_port="example set input"/><br> <connect from_op="Generate Attributes" from_port="example set output" to_op="Nominal to Date" to_port="example set input"/><br> <connect from_op="Nominal to Date" from_port="example set output" to_op="Generate Attributes (2)" to_port="example set input"/><br> <connect from_op="Generate Attributes (2)" from_port="example set output" to_port="result 1"/><br> <portSpacing port="source_input 1" spacing="0"/><br> <portSpacing port="sink_result 1" spacing="0"/><br> <portSpacing port="sink_result 2" spacing="0"/><br> </process><br> </operator><br></process>
7
Answers
Ingo
It is a little different that Sunday is usually the starting of a week in USA. But if you need the first day of the week to be defined as Monday, it is a simple fix.
My process use "date to numerical" to extract the day relative to week first. If the date is already Sunday (day 1), I will keep it the same. Otherwise, the formula will generate a new date:
date_add(date,-rint(date_day-1),DATE_UNIT_DAY)
@IngoRM I believe that is due to different date settings, but what the formula does is the following.
DATE is built using YEAR and WEEK and Date is the first day of the corresponding week and year.
@yyhuang your reply is not exactly what I was after, but in fact that is a better approach. Since I also have date (day)... I can use your method.
And surely I'll take your advise and take a look to Create Example Set operator.
This solves my problem.
But if you still have a clue on how to build first day of week from week and year please I would like to know.
Thankful both to you
@IngoRM. Thank you!
Now seems too obvious
Oprick