Deleting files based on file size
I'd like to get RapidMiner to look in a specific directory and automatically delete files based on their file size (i.e., delete files below a minimum size), but I don't see any options around checking file size in the Loop File or related operators. I also see there is a Delete File operator but it seems to require you to point to a specific file by name.
Is this functionality to use file size present elsewhere, or am I missing some other way of handling it? Or is this not an option within RapidMiner? Thanks!
Best Answer
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @Telcontar120,
I think it will be very difficult to perform this task with RapidMiner's native operators.
So once again I propose a solution using a Python script (and a Loop Files operator).
To run this process, you need to set:
- the size threshold (in bytes) for the files you want to delete, in the Set Macros operator.
- the path where your files are stored, in the Loop Files operator parameters.
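For readers without the attached process, the core of such a script can be sketched in plain Python. This is a minimal standalone sketch, not the attached process itself: the function name `delete_small_files`, its parameters, and the `dry_run` safety switch are all hypothetical, and wiring it to the Set Macros value and the Loop Files path is left out.

```python
from pathlib import Path


def delete_small_files(directory, min_size_bytes, dry_run=True):
    """Delete regular files in `directory` smaller than `min_size_bytes`.

    Hypothetical helper for illustration. Returns the paths that were
    deleted (or, when dry_run is True, the paths that would be deleted).
    """
    removed = []
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and anything that is not a regular file.
        if not path.is_file():
            continue
        # st_size is the file size in bytes.
        if path.stat().st_size < min_size_bytes:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Calling it with `dry_run=True` first (the default here) shows which files fall below the threshold before anything is actually removed. It only looks at the top level of the directory and does not descend into subfolders.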
I hope it helps,
Regards and happy deleting
Lionel
Answers
Hello @Telcontar120,
Is your machine running Linux or UNIX, or does it otherwise have access to findutils? If so, you can execute this command:
find /home/telcontar120/.RapidMiner/path/for/your/data -type f -size +900k -size -1000k -iname "*.csv"
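Here -type f restricts the search to regular files, -size +900k matches files larger than 900k, -size -1000k matches files smaller than 1000k, and -iname "*.csv" matches names ending in .csv regardless of case. As written, the command only lists the matching files; it does not delete them.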
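On a machine without findutils, the same kind of filter can be approximated in Python. This is a rough sketch rather than an exact port: the function name `find_files_by_size` and its parameters are hypothetical, and it compares plain byte counts, whereas GNU find rounds sizes up to whole units before comparing.

```python
import fnmatch
import os


def find_files_by_size(root, min_bytes, max_bytes, pattern="*.csv"):
    """List regular files under `root` (recursively) whose size is strictly
    between `min_bytes` and `max_bytes` and whose name matches `pattern`
    without regard to case. Hypothetical helper for illustration.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Lowercase both sides so the match ignores case on any platform.
            if not fnmatch.fnmatch(name.lower(), pattern.lower()):
                continue
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue
            size = os.path.getsize(full)  # size in bytes
            if min_bytes < size < max_bytes:
                matches.append(full)
    return sorted(matches)
```

For the size range in the command above, the bounds would be roughly `900 * 1024` and `1000 * 1024` bytes. Like the find command, this only lists files; deleting them is a separate step.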
Hope this helps.
Hi again @Telcontar120,
I forgot to attach the process to my last post:
Regards,
Lionel
@rfuentealba unfortunately no Linux here, just a simple Windows machine.
@lionelderkrikor thanks for the Python script, that should do the trick!
Too bad there is no native RapidMiner operator for handling this.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
P.S. I've added this as a new product idea in that forum, so if you think being able to deal with file size inside RapidMiner natively would be a helpful feature, please go over there and vote for that idea!
Yep, I already opened it for voting and created an internal ticket for the dev team. Thanks @Telcontar120 for the suggestion!
Scott