[SOLVED] Splitting nominal attribute values by unparenthesized commas
tennenrishin
Member Posts: 177 Contributor II
Hi
I would like to split a nominal attribute into multiple attributes. The nominal values need to be split by all the internal commas, except for those commas that are inside parentheses. The same way one would split a function argument list into the arguments (which may themselves contain function calls).
Does anyone have any ideas for what regex I could use to match those commas, or any other way to perform this split?
Answers
I'm thinking of...
1. Replacing all commas and parentheses within substrings that match \([^\(\)]*\) with respective special tokens.
2. Repeating step 1 until there are no more parentheses (or simply max_depth times).
3. Splitting by commas.
4. Replacing the special tokens with their original characters.
But step 1 requires a capability to search and replace within all substrings that match some given regex. Is there a way to do this?
Any help appreciated.
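For illustration, steps 1 to 4 can be sketched in Python, where re.subn accepts a function as the replacement and so provides the "search and replace within each match" capability asked about. This is only a sketch under assumptions: the thread does not name a language, the placeholder characters are hypothetical and must never occur in the data, and split_by_masking is a made-up helper name.

```python
import re

# Hypothetical placeholder tokens; any characters absent from the data would do.
COMMA, LPAREN, RPAREN = "\x00", "\x01", "\x02"

# A parenthesized group that contains no further parentheses (an innermost group).
INNERMOST = re.compile(r"\([^()]*\)")

HIDE = str.maketrans({",": COMMA, "(": LPAREN, ")": RPAREN})
RESTORE = str.maketrans({COMMA: ",", LPAREN: "(", RPAREN: ")"})


def _mask(match):
    # Step 1: hide the commas and parentheses of one innermost group.
    return match.group(0).translate(HIDE)


def split_by_masking(value):
    # Step 2: repeat step 1 until no parenthesized group is left.
    while True:
        value, replaced = INNERMOST.subn(_mask, value)
        if replaced == 0:
            break
    # Step 3: split on the commas that are still visible.
    parts = value.split(",")
    # Step 4: put the original characters back.
    return [part.translate(RESTORE) for part in parts]


print(split_by_masking("a,f(b,c),g(h(i,j),k)"))
```

Because the loop runs until nothing is left to mask, this variant needs no max_depth, at the cost of one pass over the string per nesting level.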
,(?!([^\(\)]*\(([^\(\)]*\(([^\(\)]*\([^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))
with the assumption that nesting does not exceed a depth of 3 levels.
It seems to be working but of course it is not easy to test comprehensively. Can you spot any obvious mistakes? Is it unnecessarily complicated?
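One way to test it more systematically is to compare it against a plain depth-counting splitter, which also answers the "any other way" part of the question. The following is a Python sketch, not a definitive test: the language, the helper names split_by_regex and split_by_depth, and the sample values are all assumptions. The regex is used exactly as posted; finditer is used rather than re.split because re.split would also return the contents of the capturing groups.

```python
import re

# The regex exactly as posted in the thread.
POSTED = re.compile(
    r",(?!([^\(\)]*\(([^\(\)]*\(([^\(\)]*\([^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))"
)


def split_by_regex(value):
    """Split value at every comma matched by the posted regex."""
    parts, start = [], 0
    for match in POSTED.finditer(value):
        parts.append(value[start:match.start()])
        start = match.end()
    parts.append(value[start:])
    return parts


def split_by_depth(value):
    """Reference splitter: split at commas seen while the parenthesis depth is 0."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(value):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(value[start:i])
            start = i + 1
    parts.append(value[start:])
    return parts


# Hypothetical sample values that stay within the assumed depth limit.
SAMPLES = ["a,b,c", "f(a,b),c", "a,f(b,c),g(h(i,j),k)", "f(a,g(h(i(j)))),x", ""]
for sample in SAMPLES:
    assert split_by_regex(sample) == split_by_depth(sample), sample

# The group after the comma nests four levels deep, one more than the regex
# allows for, so the two splitters disagree on this value.
TOO_DEEP = "f(a,g(h(i(j(k)))))"
print(split_by_regex(TOO_DEEP))
print(split_by_depth(TOO_DEEP))
```

In this sketch the depth limit of the regex applies to the parenthesized groups that follow a comma inside its enclosing parentheses. When such a group nests more than three levels, the lookahead no longer finds the enclosing closing parenthesis and the regex splits at a comma it should have skipped.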
Here is a more readable version:
,(?!
  (
    [^\(\)]*
    \(
    (
      [^\(\)]*
      \(
      (
        [^\(\)]*
        \(
        [^\(\)]*
        \)
      )*
      [^\(\)]*
      \)
    )*
    [^\(\)]*
    \)
  )*
  [^\(\)]*
  \)
)
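A multi-line layout like this can be used directly in regex engines that offer a free-spacing mode. As a sketch, assuming Python (the thread does not name a language), re.VERBOSE ignores the whitespace and allows comments. The comments and the helper comma_positions below are additions for illustration, and the loop only checks a few hypothetical samples for agreement with the one-line form.

```python
import re

# The one-line form, exactly as posted.
ONE_LINE = re.compile(
    r",(?!([^\(\)]*\(([^\(\)]*\(([^\(\)]*\([^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))*[^\(\)]*\))"
)

# The same pattern in free-spacing mode; whitespace and comments are ignored.
READABLE = re.compile(
    r"""
    ,(?!                        # a comma that is NOT followed by
      (                         #   any number of balanced groups (level 1)
        [^\(\)]* \(
        (                       #     nested groups (level 2)
          [^\(\)]* \(
          (                     #       nested groups (level 3, no deeper)
            [^\(\)]* \( [^\(\)]* \)
          )*
          [^\(\)]* \)
        )*
        [^\(\)]* \)
      )*
      [^\(\)]* \)               #   and then an unmatched closing parenthesis
    )
    """,
    re.VERBOSE,
)


def comma_positions(pattern, value):
    """Return the indexes of the commas that the pattern selects as split points."""
    return [match.start() for match in pattern.finditer(value)]


for sample in ["a,b,c", "f(a,b),c", "a,f(b,c),g(h(i,j),k)", ""]:
    assert comma_positions(READABLE, sample) == comma_positions(ONE_LINE, sample)

print(comma_positions(READABLE, "a,f(b,c),g(h(i,j),k)"))
```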
Best,
Marius