How to delete a file from an S3 Bucket folder
Hi,
We use Amazon S3 to load data into an Amazon Redshift database. After the data is loaded, we want to clean up the S3 text files. I see the Read, Loop, and Write S3 operators. Does RapidMiner (RM) allow deleting files from S3, given that I have delete permission in S3? If not, are there any workaround suggestions?
Thank You
Best Answer
sgenzer, Community Manager
Hi @PBM, yep, that is pretty much the only way to do it. You can also install the AWS CLI and use the Execute Program operator to run shell script commands.
It would be a logical feature request...
Scott
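For anyone who wants to try the AWS CLI route, here is a minimal sketch of the command you would hand to Execute Program, built and run from Python. It assumes the AWS CLI is installed and already configured with credentials that can delete from the bucket. The function names and the bucket/prefix values are made up for illustration. Note that `aws s3 rm ... --recursive` removes everything under the prefix, nested folders included, so the sketch defaults to `--dryrun`, which only prints what would be deleted.

```python
import subprocess


def build_s3_rm_command(bucket, prefix, dry_run=True):
    """Build an `aws s3 rm` command that removes every object under a prefix."""
    cmd = ["aws", "s3", "rm", f"s3://{bucket}/{prefix}", "--recursive"]
    if dry_run:
        # --dryrun lists the deletions without performing them
        cmd.append("--dryrun")
    return cmd


def delete_prefix_with_cli(bucket, prefix, dry_run=True):
    """Run the command; raises CalledProcessError if the CLI exits non-zero."""
    cmd = build_s3_rm_command(bucket, prefix, dry_run)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical usage (needs the AWS CLI on PATH):
# print(delete_prefix_with_cli("my-bucket", "loaded/", dry_run=True))
```

Joining the list returned by `build_s3_rm_command` with spaces gives the one-line command you could paste into the Execute Program operator instead.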
Answers
Ok, it seems this is a case where RapidMiner does not have a built-in operator, but you can extend it with external scripts. In this case I used the Execute Python operator with the boto library.
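from boto.s3.connection import S3Connection

AWS_ACCESS_KEY = 'myAccesskey'
AWS_SECRET_KEY = 'mySecretKey'
S3_BUCKET_NAME = 'myBucketName'  # placeholder: replace with your bucket name
path_to_file = 'mysubFolderPath/'  # folder prefix; end it with '/' so the files inside are listed

# The Execute Python operator calls rm_main() itself, so define it instead of calling it
def rm_main():
    # Create connection
    conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
    # Connect to my bucket
    bucket = conn.get_bucket(S3_BUCKET_NAME)
    # Get subdirectory info and delete files (except the subdirectory itself)
    for key in bucket.list(prefix=path_to_file, delimiter='/'):
        if key.name != path_to_file:
            bucket.delete_key(key.name)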
Thank You
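The script above uses the legacy boto library. If you are on boto3 instead, the same idea can be sketched roughly as below. This is not from the thread, just an assumption-laden illustration: `delete_files_in_folder` is a made-up helper name, the bucket and prefix are placeholders, and the client is passed in so the function does not depend on how you create it. It uses the boto3 S3 client calls `list_objects_v2` and `delete_objects`, lists only the objects directly under the prefix, and skips the folder marker itself.

```python
def delete_files_in_folder(client, bucket, prefix):
    """Delete objects directly under `prefix`, keeping the folder key itself.

    `client` is expected to behave like boto3.client("s3").
    Returns the list of keys that were sent for deletion.
    """
    deleted = []
    # Delimiter="/" keeps nested subfolders out of "Contents"
    kwargs = {"Bucket": bucket, "Prefix": prefix, "Delimiter": "/"}
    while True:
        page = client.list_objects_v2(**kwargs)
        keys = [obj["Key"] for obj in page.get("Contents", [])
                if obj["Key"] != prefix]
        if keys:
            # One page holds at most 1000 keys, the limit for delete_objects
            client.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": k} for k in keys]},
            )
            deleted.extend(keys)
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return deleted


# Hypothetical usage inside Execute Python (requires boto3 and credentials):
# import boto3
# def rm_main():
#     delete_files_in_folder(boto3.client("s3"), "myBucketName", "mysubFolderPath/")
```

The sketch does not inspect the response of `delete_objects`, so per-key failures would go unnoticed; check the returned `Errors` entry if that matters for your cleanup.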