# general
b
We are looking to move just under 3 TB of files from the NetSuite file cabinet to Amazon S3. We are looking to build a script that would move each file to S3, grab the new proof image URL, and then update the customer record with the new URL. At roughly 10 seconds per image, our estimates show the script taking about 2.5 years to process all 8 million images we have. Does anyone have a recommendation or tip to speed this process up? Ideally we would like to accomplish this task in 60-180 days.
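For context on the arithmetic: 8 million images at 10 s each is about 80 million seconds, roughly 925 days of serial work, which matches the 2.5-year estimate; 32 concurrent uploads would bring that to roughly 29 days. A minimal sketch of the parallel-upload idea, assuming the files are already exported to local storage; the bucket name, work list, and URL format are placeholders, and boto3 clients are documented as safe to share across threads:

```python
# Sketch: parallelize the per-image work instead of processing serially.
import concurrent.futures

import boto3

s3 = boto3.client("s3")
BUCKET = "proof-images"  # hypothetical bucket name


def migrate_one(local_path: str, key: str) -> str:
    """Upload one file and return the resulting S3 URL (assumed format)."""
    s3.upload_file(local_path, BUCKET, key)
    return f"https://{BUCKET}.s3.amazonaws.com/{key}"


# Placeholder work list of (local path, S3 key) pairs.
jobs = [("/tmp/img_001.jpg", "proofs/img_001.jpg")]

# 32 workers turns ~10 s/image serial throughput into ~0.3 s/image effective.
with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    futures = {pool.submit(migrate_one, p, k): k for p, k in jobs}
    for fut in concurrent.futures.as_completed(futures):
        print(futures[fut], "->", fut.result())
```

The uploads parallelize cheaply because they're I/O-bound; the NetSuite-side record updates would still need their own batching (the CSV-import route discussed below avoids per-record API calls).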
d
I'd probably start with pulling 'some' files down to local storage first and see what kind of time estimates that gives you vs. going straight into S3 (just to see if there is a serious bottleneck there). Also, maybe log the record IDs and filenames to multiple CSVs; then you can do a map/reduce with multiple processors to update the records. I'm interested to see what the best option turns out to be for this.
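A minimal sketch of that shard-the-manifest idea, assuming the (record ID, filename) manifest has already been split into per-worker CSVs; the shard names, column layout, and the work done per row are all placeholders:

```python
# Fan pre-split manifest shards out across worker processes.
import csv
import multiprocessing


def process_shard(shard_path: str) -> int:
    """Process every (record_id, filename) row in one shard; return the count."""
    done = 0
    with open(shard_path, newline="") as f:
        for record_id, filename in csv.reader(f):
            # The real upload + record update for this row would go here.
            done += 1
    return done


if __name__ == "__main__":
    # Assumes manifest_00.csv .. manifest_07.csv already exist on disk.
    shards = [f"manifest_{i:02d}.csv" for i in range(8)]
    with multiprocessing.Pool(processes=8) as pool:
        counts = pool.map(process_shard, shards)
    print("records processed:", sum(counts))
```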
p
Where is the time being spent? Are you doing it serially? I'd have thought the best thing to do would be some sort of bulk upload to S3 in one go, then a CSV import (with scripts disabled) to update the customer records. I guess it depends on how the resulting URL is mapped once the file is plonked onto S3 (is it a UUID in a bucket?)
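One way that URL-mapping question resolves cleanly: if the S3 key is derived from the record ID rather than a generated UUID, every URL is knowable up front, so the import CSV can be produced in a single pass independently of the upload. A sketch under that assumption; the bucket name, URL format, and NetSuite field IDs here are hypothetical:

```python
# Sketch: deterministic S3 keys make the record-update CSV trivial to generate.
import csv

BUCKET = "proof-images"  # hypothetical bucket name


def proof_url(record_id: str, ext: str = "jpg") -> str:
    # Key is fixed by the record ID alone, so the URL needs no lookup.
    return f"https://{BUCKET}.s3.amazonaws.com/proofs/{record_id}.{ext}"


record_ids = ["12345", "12346"]  # placeholder internal IDs

with open("customer_url_update.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Column names are assumed; match them to the actual custom field ID.
    writer.writerow(["internalid", "custentity_proof_url"])
    for rid in record_ids:
        writer.writerow([rid, proof_url(rid)])
```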
m
You could make this really easy on yourself and consider using an integration to move the data. Celigo is pretty good for this. https://www.celigo.com/integrations/amazon-s3/