Text Processing - Cut Document - Similar entries separated by a number
I was having trouble finding the operator documentation that pertains to string matching or cutting documents in general.
I have a few different types of documents (.xml, .csv, .docx, .html) that list records, in order, separated by *Record (n)* in ascending numbers, starting with 1.
Each of these records has similar attributes but it's all unformatted other than the records and attributes being separated by asterisks*.
My hope was to cut the document by record, which I assumed I could do with a string matching query. I'm not sure how to do that, though, since each record is different and the only commonality is the record number, which itself varies, so I don't know how to write that expression.
Best Answer
kayman Member Posts: 662 Unicorn
Is each of your records on a new line?
Like:
Record 1*something*someting else*and again something else
Record 2*something*someting else*and again something else
Record 3*something*someting else*and again something else
Or is it more like:
Record 1*something*someting else*and again something else*Record 2*something*someting else*and again something else*Record 3*something*someting else*and again something else
In the first case you could simply use the Read CSV operator with * as the separator. Beware that * is a special character that needs to be escaped, so to use it correctly you need to enter \* instead of just *.
You could also use the Split operator; the same applies here. Use \* to make clear you want to split on the literal asterisk.
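For anyone who wants to check the escaping outside RapidMiner, here is a minimal Python sketch of the same idea. It only illustrates why the asterisk has to be written as \* in a regular expression; it is not what the operators run internally, and the sample line is adapted from the example above.

```python
import re

# Sample line in the "one record per line" layout (adapted from the example above).
line = "Record 1*something*something else*and again something else"

# A bare * is a regex quantifier, so it must be escaped to match a literal asterisk.
fields = re.split(r"\*", line)

print(fields[0])   # the record label
print(fields[1:])  # the attributes of that record
```

The first field is the record label and the rest are its attributes, which is roughly what you get as columns when * is used as the separator.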
If everything is on one line, I recommend using the Split Document into Collection operator from the Toolbox extension.
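To show how the variable record number can be handled in the expression, here is a rough Python sketch of the single-line case. It assumes the layout from the second example above (every record starts with "Record", a space, and a number, followed by an asterisk); the pattern and the variable names are illustrative, not taken from the attached samples.

```python
import re

# Everything on one line, records marked by "Record <number>" (assumed layout).
text = "Record 1*a*b*c*Record 2*d*e*f*Record 3*g*h*i"

# \d+ matches any record number, so the expression does not depend on the value.
# The lazy (.*?) stops right before the next "*Record <number>*" or at the end.
pattern = re.compile(r"Record (\d+)\*(.*?)(?=\*Record \d+\*|$)", re.DOTALL)

# Map each record number to the list of its attributes.
records = {
    int(match.group(1)): match.group(2).split("*")
    for match in pattern.finditer(text)
}

for number, attributes in records.items():
    print(number, attributes)
```

The key part is \d+ in place of a fixed number; the same idea should carry over to a regular expression entered in a split operator.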
I've attached some samples to play around with, hope they get you started.
Answers
Is the number of rows the same each time? For instance, as in your example, 7 rows, then the next 7 for a new topic, and so on?
If so, you could also use loop logic and pick out each block of 7 rows by applying a modulo to the row index. That sounds far more complex than needed, by the way :-)
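A small Python sketch of that modulo idea, assuming every record really does occupy exactly 7 rows; the row contents and the names used here are made up for illustration.

```python
# Hypothetical input: 14 rows, i.e. two records of 7 rows each.
rows = [f"row {i}" for i in range(1, 15)]

ROWS_PER_RECORD = 7  # assumption: fixed number of rows per record

# A row starts a new record when its index is a multiple of 7.
starts = [i for i in range(len(rows)) if i % ROWS_PER_RECORD == 0]

# Integer division of the index gives the record each row belongs to.
groups = {}
for index, row in enumerate(rows):
    record_no = index // ROWS_PER_RECORD + 1
    groups.setdefault(record_no, []).append(row)

print(starts)
print(groups[2])
```

This only works while the row count per record stays constant; if it varies, splitting on the record marker is the safer route.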