How to extract YEAR from a string?
Hi,
I have this attribute in a dataset that is a string of text :
- Name: "Angove's 2006 Red Belly Black Shiraz (South Australia)"
How do I create a new attribute, say Vintage, that holds only the year (2006 in this case)?
- Vintage: 2006
I am using the Generate Attributes operator, but can't quite work out the correct syntax...
Thanks.
Best Answers
kayman Member Posts: 662 Unicorn
Use regex. If there are no other numbers in your string it is pretty easy; you can use something like
replaceAll([myField],"\\D","")
Read as 'remove everything that's not a digit', so what is left will be your year.
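For anyone who wants to try the idea outside RapidMiner, here is a rough Python sketch of the same "strip every non-digit" step. It uses Python's `re.sub` rather than the RapidMiner `replaceAll` function, and the second wine name is made up purely to show the weak spot.

```python
import re

def digits_only(text: str) -> str:
    # Python counterpart of replaceAll([myField], "\\D", ""):
    # delete every character that is not a digit.
    return re.sub(r"\D", "", text)

# The name from the question: only one number, so the year is what remains.
name = "Angove's 2006 Red Belly Black Shiraz (South Australia)"
vintage = digits_only(name)

# Hypothetical name with a second number: the digits get glued together.
mixed = digits_only("Hypothetical Bin 65 Chardonnay 2006")

print(vintage)  # 2006
print(mixed)    # 652006
```

So this approach is only safe when the year is the sole number in the field.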
If there are other numbers you can restrict the match to a range. Assuming your years all start with 20 (the pattern below matches 2000 to 2099), you could use something like
replaceAll([myField],"^.*?(20[0-9]{2}).*$","$1")
which reads as 'start at the beginning of the string, and if you find something starting with 20 followed by 2 other digits, store it and remove everything else'.
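The capture-and-replace pattern can be checked the same way in Python. This is only a sketch: Python writes the backreference as `\1` where the RapidMiner expression uses `$1`, and the second test string is a hypothetical name with an extra number in it.

```python
import re

# Same pattern as above: lazily skip a prefix, capture 20xx, swallow the rest.
YEAR_20XX = re.compile(r"^.*?(20[0-9]{2}).*$")

def extract_20xx(text: str) -> str:
    # Replace the whole string with the captured group.
    return YEAR_20XX.sub(r"\1", text)

first = extract_20xx("Angove's 2006 Red Belly Black Shiraz (South Australia)")
second = extract_20xx("Hypothetical Bin 65 Chardonnay 2006")

print(first)   # 2006
print(second)  # 2006
```

Unlike the digits-only version, the stray "65" no longer ends up in the result.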
If you can also have older years you could try as follows :
replaceAll([myField],"^.*?([12][0-9]{3}).*$","$1")
so now you look for a pattern starting with either a 1 or a 2, followed by 3 other digits.
Not foolproof depending on your data, but it might do what you need.
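To make "not foolproof" concrete, here is a small Python sketch of two cases where the broader pattern misbehaves, plus one possible way to tighten it. The wine names, the `find_year` helper and the 1800 to 2099 range are all assumptions for illustration, not something from the original data.

```python
import re
from typing import Optional

YEAR_ANY = re.compile(r"^.*?([12][0-9]{3}).*$")

def extract_year(text: str) -> str:
    return YEAR_ANY.sub(r"\1", text)

ok = extract_year("Hypothetical Estate 1998 Cabernet")          # a normal case
no_year = extract_year("Hypothetical NV Brut")                  # nothing matches
wrong = extract_year("Hypothetical Lot 12345 Shiraz 2006")      # earlier number wins

def find_year(text: str) -> Optional[int]:
    # Stricter variant (an assumption, adjust to your data): a standalone
    # 4-digit number between 1800 and 2099, or None when there is no such number.
    match = re.search(r"\b(1[89][0-9]{2}|20[0-9]{2})\b", text)
    return int(match.group(1)) if match else None

print(ok)        # 1998
print(no_year)   # Hypothetical NV Brut  (input comes back unchanged)
print(wrong)     # 1234
print(find_year("Hypothetical Lot 12345 Shiraz 2006"))  # 2006
print(find_year("Hypothetical NV Brut"))                # None
```

The point to watch is the no-match case: the replace-based version returns the full original string, so the new attribute may need a check before it is converted to a number.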
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
You may also want to consider trying to parse your text field even further to separate out other information such as the geography, the vineyard, etc. You can use Split or Tokenize for that purpose.
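As a rough illustration of that further parsing, the Python sketch below pulls several pieces out of the name in one pass. It is not the Split or Tokenize operator. It assumes every name follows the layout "producer, year, wine, (region)", which is a guess based on the single example in the question, and the field names are made up.

```python
import re
from typing import Dict, Optional

# Assumed layout: <producer> <4-digit year> <wine> (<region>)
NAME_LAYOUT = re.compile(
    r"^(?P<producer>.*?)\s+"
    r"(?P<vintage>[12][0-9]{3})\s+"
    r"(?P<wine>.*?)\s*"
    r"\((?P<region>[^)]*)\)\s*$"
)

def parse_name(text: str) -> Optional[Dict[str, str]]:
    # Returns a dict of the named parts, or None if the layout does not fit.
    match = NAME_LAYOUT.match(text)
    return match.groupdict() if match else None

parts = parse_name("Angove's 2006 Red Belly Black Shiraz (South Australia)")
print(parts)
# {'producer': "Angove's", 'vintage': '2006',
#  'wine': 'Red Belly Black Shiraz', 'region': 'South Australia'}
```

Names without a year or without a region in parentheses return None here, so real data would likely need a fallback.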