read csv file skip first n lines

Telcontar120 · November 2017

The Read CSV operator should be given a parameter option to skip the first n lines (often header lines).
While there is already an option to allow for skipping comments, if the lines do not have a comment indicator, that requires users to manually go in and modify the lines in the file, which is not efficient for automated processing of large numbers of files.
Instead, if the operator could automatically skip the first n lines, take the header from row n+1, and read all data normally thereafter, it would drastically improve the efficiency of working with CSV files.
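For readers who want to see the requested behavior spelled out, here is a minimal Python sketch of "skip the first n lines, then read the header from row n+1". It is only an illustration of the idea, not how Read CSV is implemented; the function name `read_csv_skip` and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice


def read_csv_skip(stream, n_skip, delimiter=","):
    """Skip the first n_skip lines, use the next row as the header,
    and return (header, rows) with each row as a dict."""
    # Drop the first n_skip physical lines before the CSV parser sees them.
    remaining = islice(stream, n_skip, None)
    reader = csv.reader(remaining, delimiter=delimiter)

    # The first row after the skipped lines is treated as the header.
    header = next(reader, None)
    if header is None:
        return [], []

    # Every following row is ordinary data.
    rows = [dict(zip(header, row)) for row in reader]
    return header, rows


if __name__ == "__main__":
    # Hypothetical file content: two free-text lines, then header and data.
    sample = io.StringIO(
        "Exported by some tool\n"
        "Second free-text line\n"
        "id,value\n"
        "1,10\n"
        "2,20\n"
    )
    header, rows = read_csv_skip(sample, n_skip=2)
    print(header)
    print(rows)
```

Because the skipped lines never reach the parser, they need no comment character and may contain any number of delimiters.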

sgenzer · November 2017

MartinLiebig · November 2017

Brian,

you can set the first n lines to "Comment".

That should do the trick.

Telcontar120 · November 2017

@mschmitz Thanks, this actually pointed me to the answer. If you don't want to run the wizard (which I wanted to avoid, since the operator was going to run in a loop using the "file" input rather than pointing to a specific file), I think you can still accomplish the same thing by using the "Annotations" parameter and setting the first lines to "Comment".

I was getting hung up before because there is a separate parameter for a comment character, which I didn't want to have to add manually, but I tested using this method and it appears to work, starting the import on the specified line and taking the correct number of columns from that. So thanks for the pointer!
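To make the annotation idea concrete, here is a rough Python emulation of it: a mapping from row index to an annotation, where rows marked "Comment" are dropped and the row marked as the header supplies the column names. This is only a sketch of the concept, not RapidMiner code; the function name `read_with_annotations` is hypothetical, and using "Name" as the header annotation is an assumption for this example.

```python
import csv
import io


def read_with_annotations(stream, annotations, delimiter=","):
    """Read CSV rows, applying per-row annotations.

    annotations maps a 0-based row index to "Comment" (row is dropped)
    or "Name" (row is used as the header; assumed label for this sketch).
    Rows without an annotation are treated as data.
    """
    header = None
    data = []
    for index, row in enumerate(csv.reader(stream, delimiter=delimiter)):
        role = annotations.get(index)
        if role == "Comment":
            continue  # ignored, no comment character needed in the file
        if role == "Name":
            header = row  # column names and column count come from this row
            continue
        data.append(row)
    return header, data


if __name__ == "__main__":
    # Hypothetical file content with two leading free-text lines.
    sample = io.StringIO(
        "free text line one\n"
        "free text line two\n"
        "id,value\n"
        "1,10\n"
    )
    header, data = read_with_annotations(
        sample, {0: "Comment", 1: "Comment", 2: "Name"}
    )
    print(header)
    print(data)
```

The same annotation mapping can be reused for every file in a loop, which is what makes this approach workable without running a wizard per file.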

sgenzer · November 2017

workaround available

Fixed and Released · Last Updated May 2019
