Hello People, I have a map reduce script that crea...
# suitescript
a
Hello People, I have a Map/Reduce script that creates and merges a bunch of PDFs and stores them in the File Cabinet. The PDF creation happens in the map stage and the merging happens in the reduce stage. Here is the thing: I now want to upload these merged PDF files to an SFTP server. Which is more efficient? Creating the SFTP connection once in the getInputData stage, passing the connection object to map and reduce, and uploading the files in reduce, OR creating the SFTP connection in the reduce stage? Creating the connection first means the map stage might take too long to execute, so I risk a connection timeout. Creating the connection in the reduce stage means I'll be creating a connection roughly 700-900 times. Which approach is better?
v
I would be very surprised if the two stages persisted the connection. There's probably no guarantee that the next stage even runs on the same server/pod, so the socket would not persist. I have personally written something similar, and I created a completely separate process to do all of the file uploads and had that run after the first M/R.
s
I'd tend to agree with @verikott and take the separate process approach
a
That's very interesting @verikott Thank You
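A minimal sketch of the separate upload step suggested above, assuming the merged PDFs are already in the File Cabinet and their internal IDs are known. The function name `uploadMergedPdfs`, the `exampleConnectionOptions` object, and every value in it are hypothetical placeholders. The `sftp` and `file` arguments stand for the `N/sftp` and `N/file` modules, which the calling script would pass in.

```javascript
/**
 * Sketch of a standalone upload step, meant to run after the Map/Reduce finishes.
 * `sftp` and `file` are the N/sftp and N/file modules supplied by the caller, e.g.
 * from inside define(['N/sftp', 'N/file'], function (sftp, file) { ... }).
 */
function uploadMergedPdfs(sftp, file, fileIds, connectionOptions) {
    // One sftp.Connection object is created for the whole batch.
    var connection = sftp.createConnection(connectionOptions);
    var uploadedNames = [];

    fileIds.forEach(function (fileId) {
        // Load each merged PDF from the File Cabinet and push it to the server.
        var pdf = file.load({ id: fileId });
        connection.upload({
            file: pdf,
            filename: pdf.name,
            replaceExisting: true
        });
        uploadedNames.push(pdf.name);
    });

    return uploadedNames;
}

// Hypothetical connection options; every value here is a placeholder.
var exampleConnectionOptions = {
    username: 'sftp_user',
    passwordGuid: 'REPLACE_WITH_PASSWORD_GUID',
    url: 'sftp.example.com',
    port: 22,
    directory: 'inbound/pdfs',
    hostKey: 'REPLACE_WITH_HOST_KEY'
};
```

A real script would also have to stay within its governance limits, so a batch of several hundred files may need to be split across runs.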
b
wouldn't matter, the sftp connection that is used to check the credentials is not the same sftp connection that is used for any of the operations
a
@battk can you please elaborate?
b
an ssh connection needs to be created and later closed to use sftp
theoretically, if you were being efficient, you would open the ssh connection, use the same connection to upload all your files, and then close it when you are done
netsuite takes the opposite approach and opens and then closes the connection for each upload
a
so I would need to create a connection each time I upload a single file?
b
keep in mind, despite the similar name, an ssh connection is not the same as an sftp.Connection
you are in charge of creating that sftp.Connection, netsuite is in charge of maintaining that ssh connection
netsuite's choice on how it maintains those ssh connections means that there isn't much point in trying to reuse the sftp.Connection
the sftp server won't notice since netsuite is constantly creating and closing ssh connections
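Going by the description above, creating the sftp.Connection inside the reduce stage is a workable option. A rough sketch, assuming the reduce stage has just saved the merged PDF and has its File Cabinet ID: the helper name `uploadFromReduce` and the `connectionOptions` argument are hypothetical, and `sftp` and `file` again stand for the `N/sftp` and `N/file` modules passed in by the script.

```javascript
/**
 * Sketch: upload one merged PDF from inside a reduce invocation.
 * Intended to be called from reduce(context) after the merged file is saved, e.g.
 *   uploadFromReduce(sftp, file, mergedFileId, connectionOptions);
 */
function uploadFromReduce(sftp, file, mergedFileId, connectionOptions) {
    // A fresh sftp.Connection per reduce invocation; nothing is shared across stages.
    var connection = sftp.createConnection(connectionOptions);

    // Load the merged PDF that this reduce invocation stored in the File Cabinet.
    var mergedPdf = file.load({ id: mergedFileId });

    connection.upload({
        file: mergedPdf,
        filename: mergedPdf.name,
        replaceExisting: true
    });

    return mergedPdf.name;
}
```

With this shape, a timeout during a long map stage is not a concern, because the connection object only exists for the duration of one reduce call.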