# suitescript
d
Hi all, looking to bounce an idea off everyone. I've got an MR script that is processing a large data set. The reduce is iterating an array of objects .. that produces timebills. I'm running out of units in the reduce. What if I moved the record creation out of the reduce into a RESTlet ... and called the RESTlet (asynchronously) from the reduce? Think that might gain me some units? @creece? (most valued opinion)
c
If you're running out of governance I would look at how you are getting the data and whether you can pare it down in a map phase... is this possible? I would assume you've already taken a look at this, but just a thought.
d
Ya, looking to optimize that a bit too .. but I'm using the map to group records for the reduce
c
I would start there in your optimizing and see if you can group it into smaller chunks based
Like if you know you can only process X, you should be able to see how many keys/values you are processing/have already processed in the map phase, and maybe write an updated key based on the set, like "keyA-set1", "keyA-set2", ... You may have to get out some paper and calculate the governance requirement per iteration to get it right.
The RESTlet would buy you 5000 extra units, but that's not scalable if you happen to have even more data. That's why I suggest trying to optimize your reduce data set in map.
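A rough sketch of the "get out some paper" step and the "keyA-set1" style keys, in plain JavaScript so it runs outside NetSuite. The helper names, the 30 units per timebill and the 200 unit reserve are all assumptions for illustration; measure the real per-record cost in your own account.

```javascript
// Hypothetical helpers. The budget and per-record cost are assumptions
// to be replaced with values measured in the actual script.

/**
 * How many records one reduce invocation can afford after holding
 * back reserveUnits for anything else the invocation has to do.
 */
function maxRecordsPerReduce(unitBudget, unitsPerRecord, reserveUnits) {
  if (unitsPerRecord <= 0) {
    throw new Error('unitsPerRecord must be positive');
  }
  return Math.max(0, Math.floor((unitBudget - reserveUnits) / unitsPerRecord));
}

/**
 * Key for the n-th value (0-based) under baseKey:
 * "keyA-set1", "keyA-set2", ...
 */
function setKey(baseKey, valueIndex, recordsPerSet) {
  const setNumber = Math.floor(valueIndex / recordsPerSet) + 1;
  return baseKey + '-set' + setNumber;
}

// Assumed numbers: 5000 unit budget, 30 units per timebill, 200 held back.
const perSet = maxRecordsPerReduce(5000, 30, 200);
const keys = [0, 159, 160, 400].map(function (i) {
  return setKey('keyA', i, perSet);
});
// perSet -> 160
// keys   -> ['keyA-set1', 'keyA-set1', 'keyA-set2', 'keyA-set3']
```

The index passed to `setKey` has to come from somewhere stable (for example assigned while building the input), since separate map invocations should not be assumed to share a counter.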
d
Ya, I've done something similar when chunking up CSV data (I'll review that) using modulo
s
RESTlet is 5000 IIRC
a
yeah modulo chunking for the win.. just get the last digit of an internal id or something and split it into 10 chunks.... or if you really need to, get the last 2 digits and make 100 chunks 🙂
d
Well, in some (if not most) cases you can use mapContext.key for the modulo calculation for chunking. It's pretty sweet
a
right, and you add the lastDigit variable to make more keys 🙂 ... it's a good "pseudo random" way to evenly distribute to any number of chunks, you don't have to think about whether this grouping is going to end up bigger than that grouping... with a good size dataset you can be pretty sure the queues will be somewhat evenly split
technically you can use modulo any number you like, but ppl like base 10 🙂
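For anyone reading later, a minimal sketch of the modulo chunking described above. It assumes `mapContext.key` holds a numeric internal id (true for some inputs, not all); the `chunkKey` helper and the `{ employee, id }` value layout are hypothetical. The context object at the bottom is only a stand-in so the snippet runs outside NetSuite.

```javascript
/**
 * Build a chunked key from a grouping key and a numeric internal id.
 * chunkCount = 10 uses the last digit, 100 the last two digits.
 */
function chunkKey(baseKey, internalId, chunkCount) {
  const id = parseInt(internalId, 10);
  if (isNaN(id) || chunkCount < 1) {
    throw new Error('need a numeric id and a positive chunk count');
  }
  return baseKey + '-' + (id % chunkCount);
}

// Shape of a map stage using it. The value layout is hypothetical.
function map(mapContext) {
  const row = JSON.parse(mapContext.value);
  mapContext.write({
    key: chunkKey(row.employee, mapContext.key, 10),
    value: row
  });
}

// Stand-in for the real map context, only so the sketch can be run locally.
const written = [];
['1041', '1042', '1051'].forEach(function (id) {
  map({
    key: id,
    value: JSON.stringify({ employee: 'emp7', id: id }),
    write: function (pair) { written.push(pair.key); }
  });
});
// written -> ['emp7-1', 'emp7-2', 'emp7-1']
```

Each original group now fans out into up to `chunkCount` reduce keys, so the per-reduce workload drops roughly by that factor when the ids are evenly spread.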
c
Yeah it's 5000. I haven't dealt with governance concerns in a while and was thinking Suitelets for some reason. I'll edit my comment.
m
Another option is to implement the data fetching and grouping yourself in getInputData and free up the map for processing.
c
If you're gonna go that route you should use reduce as it has more governance and not use the map phase for processing
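A sketch of what grouping in getInputData could look like, again in plain JavaScript. `groupIntoSets` and the sample rows are hypothetical, and it assumes the platform hands each top-level property of the returned object to the next stage as one key/value pair; check that against the Map/Reduce docs before relying on it.

```javascript
/**
 * Group rows by a key field and cut each group into sets of at most
 * setSize rows. Returns a plain object such as
 * { 'emp7-set1': [...], 'emp7-set2': [...] }.
 */
function groupIntoSets(rows, keyField, setSize) {
  const seen = Object.create(null); // rows counted so far per base key
  const grouped = {};
  rows.forEach(function (row) {
    const base = String(row[keyField]);
    const index = seen[base] || 0;
    seen[base] = index + 1;
    const key = base + '-set' + (Math.floor(index / setSize) + 1);
    if (!grouped[key]) {
      grouped[key] = [];
    }
    grouped[key].push(row);
  });
  return grouped;
}

// Hypothetical rows; in getInputData these would come from a search or query.
const sample = [
  { employee: 'emp7', hours: 2 },
  { employee: 'emp7', hours: 3 },
  { employee: 'emp9', hours: 1 },
  { employee: 'emp7', hours: 4 }
];
const input = groupIntoSets(sample, 'employee', 2);
// Object.keys(input) -> ['emp7-set1', 'emp9-set1', 'emp7-set2']
```

The fetching and grouping then has to fit inside getInputData's own governance limit, so this only helps when building the groups is cheap compared with creating the records.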