New guy - learning how to clean up and parse a CSV file
I am new to data analysis and preparing a data set. I have a CSV file with the time stamp in the following format: "2018/05/14:12:00:00 PM."
What is a recommended way to "parse" the time stamp into Year, date, and time components to make it easier to sort and filter?
Thanks!
Best Answer
sgenzer (Community Manager)
Hi @WesCo2019 - you're going to want to first put it into a date-time data type.
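For anyone who wants to sanity-check the format pattern outside RapidMiner first, here is a minimal Python sketch of that conversion step. It assumes the raw values look exactly like the example in the question (`2018/05/14:12:00:00 PM`, 12-hour clock with an AM/PM marker); the names `TIMESTAMP_FORMAT` and `parse_timestamp` are made up for illustration and are not part of any RapidMiner API.

```python
from datetime import datetime

# Assumed layout, taken from the example in the question:
# year/month/day, then a colon, then a 12-hour time with an AM/PM marker.
# %I is the 12-hour clock hour and %p is the AM/PM marker.
TIMESTAMP_FORMAT = "%Y/%m/%d:%I:%M:%S %p"


def parse_timestamp(text: str) -> datetime:
    """Convert one raw CSV timestamp string into a datetime value."""
    # strip() only removes surrounding whitespace; any other stray
    # characters will raise ValueError, which is useful for spotting bad rows.
    return datetime.strptime(text.strip(), TIMESTAMP_FORMAT)


parsed = parse_timestamp("2018/05/14:12:00:00 PM")
print(parsed.isoformat())  # 2018-05-14T12:00:00
```

In RapidMiner itself the date format field takes Java-style patterns instead, so the equivalent there would likely be `yyyy/MM/dd:hh:mm:ss a`, but check that against your own data.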
As for parsing, I would use "Generate Attributes" to create separate attributes for each piece you want:
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.0.003" expanded="true" height="68" name="Retrieve Lake Huron" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/Time Series/data sets/Lake Huron"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="9.0.003" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34">
        <list key="function_descriptions">
          <parameter key="Year" value="date_get(Date,DATE_UNIT_YEAR)"/>
          <parameter key="Month" value="date_get(Date,DATE_UNIT_MONTH)"/>
          <parameter key="Week" value="date_get(Date,DATE_UNIT_WEEK)"/>
          <parameter key="Day" value="date_get(Date,DATE_UNIT_DAY)"/>
          <parameter key="Hour" value="date_get(Date,DATE_UNIT_HOUR)"/>
        </list>
      </operator>
      <connect from_op="Retrieve Lake Huron" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
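To make it clearer what the five `date_get` expressions produce, here is a rough Python analogue that splits one date-time value into the same named pieces. It is only a sketch: `split_components` is a hypothetical helper, and the week number uses the ISO 8601 definition, which may not match the numbering RapidMiner applies for `DATE_UNIT_WEEK` (the month numbering may differ as well).

```python
from datetime import datetime


def split_components(ts: datetime) -> dict:
    """Break a datetime into separate fields, one per generated attribute."""
    # isocalendar() returns (ISO year, ISO week number, ISO weekday);
    # index 1 is the week number. This is an assumed stand-in for
    # RapidMiner's week unit, not a guaranteed match.
    iso_week = ts.isocalendar()[1]
    return {
        "Year": ts.year,
        "Month": ts.month,  # 1-based: January is 1
        "Week": iso_week,
        "Day": ts.day,
        "Hour": ts.hour,    # 24-hour clock
    }


parts = split_components(datetime(2018, 5, 14, 12, 0, 0))
print(parts)
# {'Year': 2018, 'Month': 5, 'Week': 20, 'Day': 14, 'Hour': 12}
```

Once the pieces are separate numeric columns, sorting and filtering by year, month, or hour becomes a plain numeric comparison.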
Hope that helps.
Scott