# suitescript
r
Just to confirm: the only risk of increasing the Concurrency Limit on a Map/Reduce script deployment would be that it could cause us to hit our account concurrency limit and queue other scripts, correct? The other scripts would still run later once the M/R finishes? I'm reasonably sure the increase I want to implement won't cause us to hit our limit (I've been monitoring APM and we're nowhere close to it now), but I want to make sure there aren't unintended consequences. Thanks!
e
Be aware that the Map/Reduce will always reserve that many processors, whether it actually needs/uses them or not. If this is something that's running frequently, you might be unnecessarily blocking a bunch of processors. Ran into this on a previous team where multiple scripts that were set to run every 15 minutes were all set to 5 or 10 Concurrency Limit but weren't taking advantage of them. Our processors were constantly backed up.
👀 1
r
Thanks for the reply @erictgrubaugh. These scripts are set to run once a day overnight, when we have little to no other processing going on, and in general our environment is very lightly scripted apart from the scripts in question. I would be changing this setting on a bunch of Map/Reduces, but they're all set up not to start until the previous one in the chain is done, so there's never more than one running at once. Just to confirm...

> Our processors were constantly backed up

In the event what I'm proposing would back something up, everything would still run eventually once the M/R is done, right? I'm just worried about stopping something completely from running. Thanks!
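For anyone reading along, the chained setup described above is usually done by having each script's summarize stage submit the next Map/Reduce via `N/task`, so the next one only queues after the current one finishes. This is a minimal sketch, not the poster's actual code: the script/deployment IDs are hypothetical, and NetSuite's `define` and the `N/task` module are stubbed here only so the sketch is self-contained outside NetSuite.

```javascript
// Stubs standing in for what the NetSuite platform provides.
// In a real SuiteScript 2.x file you would NOT define these yourself.
const submitted = [];
const task = {
    TaskType: { MAP_REDUCE: 'MAP_REDUCE' },
    create: (opts) => ({
        submit: () => { submitted.push(opts.scriptId); }
    })
};
const define = (deps, factory) => factory(task);

// The sketch of the Map/Reduce module itself (IDs are hypothetical):
const mrScript = define(['N/task'], (task) => {
    const summarize = (summary) => {
        // Submit the next script in the chain only once this one is done.
        const next = task.create({
            taskType: task.TaskType.MAP_REDUCE,
            scriptId: 'customscript_next_mr',     // hypothetical script ID
            deploymentId: 'customdeploy_next_mr'  // hypothetical deployment ID
        });
        next.submit();
    };
    return { summarize };
});

// Simulate the summarize stage firing at the end of a run.
mrScript.summarize({});
console.log(submitted); // → ['customscript_next_mr']
```

Because each link in the chain is kicked off from summarize, at most one of the chained scripts holds processors at any given time, regardless of their individual Concurrency Limits.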
e
The others will eventually run if the processors ever free up. If you're doing this on one script which runs once a day, overnight, it seems extremely unlikely to clog things up.
r
Appreciate the input, thanks!
✅ 1
w
> Be aware that the Map/Reduce will always reserve that many processors, whether it actually needs/uses them or not. If this is something that's running frequently, you might be unnecessarily blocking a bunch of processors. Ran into this on a previous team where multiple scripts that were set to run every 15 minutes were all set to 5 or 10 Concurrency Limit but weren't taking advantage of them. Our processors were constantly backed up.
@erictgrubaugh Wouldn't the scripts complete faster using 5-10 processors instead of one (given that a big part of the work was in the map or reduce stages)? Why did it get better by lowering the limit? I guess you could have multiple scripts running at the same time, each with lower throughput? M/Rs don't reserve all processors (up to the limit) when getInputData is put into the queue, right? getInputData only uses one processor; map/reduce will use the ones that are available (considering priority) but will always split the work across all 5/10; summarize only uses one processor.
e
They were set to run every 15 minutes, so every 15 minutes, something like 10 scripts all tried to reserve 5-10 processors. We only had 15 processors, so all of those competing scripts were backing things up for everything else. Reducing both the Concurrency Limit and the frequency of those 10 scripts eliminated most of that constant traffic jam. Additionally, if your script is set to "Submit All Stages At Once", then it will attempt to reserve all the processors up front.
Generally a script that's running every 15 minutes isn't a very long-running script, and that was true for these scripts as well, so dropping them down to 1 Concurrency might have made them take slightly longer individually, but the system bandwidth was improved far more by keeping those processors free.
w
Ok, I can see that they probably execute faster if there are very few records to process, as it seems to take some time to spin up a processor. I don't think it reserves the processors even when submitting all stages at once, though. If I launch one M/R script that uses all 5 (in my case) processors and its getInputData takes a while to complete, and another M/R fired off in the meantime finishes its getInputData first, the second one will run its map instances before the first.
e
Interesting; perhaps we just happened to have fast-executing GID stages in this instance. Regardless, though, I think the larger point is to be mindful with your Concurrency Limit choices. Don't just set them all to the max and think that will make everything run fast and smooth.
w
Completely agree when it comes to frequently executed scripts.