The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here

Can I convert a table in a text file to Excel using RM?

kimjkkimjk Member Posts: 19 Maven
edited March 2023 in Help
For example,

   name              age               job                                          
--------------- ---------------- --------------- 
   aaaa               bbbb             ccc        

   des1               des2             des3        
--------------- ---------------- --------------- 
      1                    2                3


                                      

   dddd               eeee             fff        

   des1               des2             des3        
--------------- ---------------- --------------- 
      4                    5                6


There is a text file with many tables of this type, and these to..



   name              age               job            des1               des2             des3
   aaaa               bbbb             ccc              1                    2                3
   dddd               eeee             fff               4                    5                6 

I want to convert it to an Excel file of this type.

I tried filter with regular expression but I can't get the idea out.

Any help or even a little hint is appreciated.

Tks

Best Answer

  • BalazsBaranyBalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Solution Accepted
    Hi!

    Your tables are formatted for viewing, not processing. But you can still import them.

    I would try Read CSV with a weird separator character that is not in the tables, like § or *. This gives you the entire rows in one column.
    Then apply Trim to remove spaces in the beginning and the end of the rows. Then Split with a regular expression like "[\t ]+", this will split up the column on the variable number of spaces (or tabulators if they are in the data) . If the Name or some other attribute have (single) spaces in the contents, try a regular expression with a quantifier:  {2,}|\t

    Then you can rename the columns as you need. 

    When you're done, you can use Write Excel to export the results.

    Regards,
    Balázs

Answers

  • kimjkkimjk Member Posts: 19 Maven
    tks! I've used over 30 operators, but it worked anyway.
Sign In or Register to comment.