# suitescript
f
Curious about M/R script overhead. We're doing a scrubbing project on a system. Using a Map/Reduce script, I was able to send data to a Business Logic module and update ~3600 records an hour (about 1/sec). Using the same Business Logic module and calling it from a Restlet, I was able to update 50K records an hour. I've disabled everything on the system. It makes no sense why it would be faster via the Restlet. Wondering if anyone else has had a similar experience. The way I counted was by the number of records updated.
e
Curious when you say Business Logic module, is that an external system that you're making an API call to?
Also, what is the concurrency limit on the M/R script?
m
On the records you're updating, if you have user event scripts that run in the Map/Reduce context but those same scripts don't run in the Restlet context, then the Restlet method is doing way less work. It's still updating the records, but all of the work for the user event scripts tied to those records may not be firing.
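For illustration, here is a minimal sketch of how that difference can happen in code (as opposed to deployment context filtering). The script and its gating are hypothetical, not from the original thread; it just shows a user event script that skips its heavy work when the update comes from a Restlet but still runs it for Map/Reduce updates.

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType UserEventScript
 */
// Hypothetical user event script illustrating how context gating can make the
// same record update cheaper from a Restlet than from a Map/Reduce script.
define(['N/runtime'], (runtime) => {
    const afterSubmit = (context) => {
        // Skip the expensive logic when the update comes from a Restlet,
        // but let it run for Map/Reduce (and UI) updates.
        if (runtime.executionContext === runtime.ContextType.RESTLET) {
            return;
        }
        // ...expensive recalculations, integrations, etc. would fire here...
    };
    return { afterSubmit };
});
```

The same effect can also come from the deployment's Context Filtering subtab, so both are worth checking when comparing throughput between entry points.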
f
@eblackey they're just module scripts to keep the logic isolated from the entry points and data access. Makes for easier testing and better readability, in my opinion. We access the business module file by calling it from any entry point (Restlet, Scheduled, Map/Reduce). Concurrency on the M/R is 15, but you can oversubscribe them to 30.
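A minimal sketch of that shared-module pattern, with a hypothetical file name and function (not the poster's actual code): a custom module holds the update logic, and the Restlet and Map/Reduce entry points just require it.

```javascript
/**
 * @NApiVersion 2.1
 * @NModuleScope SameAccount
 */
// Hypothetical shared business logic module (e.g. /SuiteScripts/lib/bl_scrub.js)
// required by both the Restlet and the Map/Reduce entry points.
define(['N/record'], (record) => {
    // Apply the scrub/update to a single record and return the saved id.
    const scrubRecord = (recordType, recordId, values) => {
        return record.submitFields({
            type: recordType,
            id: recordId,
            values: values,
            options: { enableSourcing: false, ignoreMandatoryFields: true }
        });
    };
    return { scrubRecord };
});
```

Each entry point would then do something like `define(['./lib/bl_scrub'], (bl) => { ... bl.scrubRecord(type, id, values); ... })`, keeping the data-access logic identical across script types.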
@Mike Robbins I think I've scoured the UE scripts already, but you're making me think I may need to take a closer look. Thanks for the nudge.
s
Well, it has been years since I did my own performance test / comparison, but back in 2014 I determined that Restlet scripts did indeed perform faster than just about any other script or integration method at the time (faster than scheduled scripts, SuiteTalk Web Services, CSV imports, etc.). Of course, at that time it was before SuiteScript 2.x came out, so Map/Reduce wasn't an option and SuiteTalk REST wasn't available yet either. But over the years we have created many Restlets and the throughput we see with them is impressive, so this doesn't surprise me.

However, we also use Map/Reduce scripts a lot, and they can perform very well, but we have noticed that the getInputData stage can actually end up being the bottleneck for a Map/Reduce script when the data to retrieve is very large, like over 30,000 records. For some reason, the time it takes to get all of the data and send it to the next phase does not scale well above a certain point, and you may find that 45,000 records could take more than twice the time of 30,000 records in the getInputData phase (that's just an example, it is going to depend on a lot of factors specific to your account and your data). But I will say if you are dealing with tens of thousands of records or more, it is worth experimenting with limiting the GID phase to a certain amount, and seeing where the sweet spot is for that script. I have to limit most of mine to the 30-40 thousand range, as that seems to be where we get the best throughput before it degrades.

As mentioned by others, workflows, user events, and even client scripts (yes, client scripts can run server-side!) can all fire for certain contexts that might only affect one script type but not another, so that's worth a look too, to make sure you are doing a real comparison of the scripts, and not of other customizations being triggered by them.
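One way to cap the getInputData stage, sketched below under assumptions (the saved search id and the batch-size script parameter are hypothetical): instead of returning the search object directly, materialize results page by page and stop at the configured limit, so each execution only feeds a bounded batch into map.

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType MapReduceScript
 */
// Sketch of capping getInputData to a configurable batch size.
// 'customsearch_scrub_candidates' and 'custscript_scrub_batch_limit' are assumptions.
define(['N/search', 'N/runtime'], (search, runtime) => {
    const getInputData = () => {
        const batchLimit = parseInt(
            runtime.getCurrentScript().getParameter({ name: 'custscript_scrub_batch_limit' }),
            10
        ) || 30000;

        const pagedData = search.load({ id: 'customsearch_scrub_candidates' })
            .runPaged({ pageSize: 1000 });

        const results = [];
        for (const pageRange of pagedData.pageRanges) {
            if (results.length >= batchLimit) break;
            pagedData.fetch({ index: pageRange.index }).data.forEach((result) => {
                if (results.length < batchLimit) {
                    results.push({ id: result.id, type: result.recordType });
                }
            });
        }
        return results; // bounded array instead of the full search
    };

    const map = (context) => {
        // JSON.parse(context.value) -> { id, type }; hand off to the shared
        // business logic module here.
    };

    return { getInputData, map };
});
```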
f
From what I am seeing, the M/R is bottlenecking on the overhead of managing a large dataset (> 1M records). Running and managing this scheduling and coordination outside of NetSuite works well.
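A rough sketch of what external coordination could look like, under assumptions (Node.js 18+, a placeholder Restlet URL, and a placeholder Authorization header; a real call needs OAuth 1.0 / Token-Based Authentication signing): slice the id list into batches and post each batch to the Restlet.

```javascript
// Hypothetical external coordinator that batches record ids and posts them
// to the Restlet. URL and Authorization header are placeholders.
const RESTLET_URL = 'https://ACCOUNT.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=123&deploy=1';

async function postBatch(ids) {
    const response = await fetch(RESTLET_URL, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            Authorization: 'OAuth ...signed TBA header goes here...'
        },
        body: JSON.stringify({ recordIds: ids })
    });
    if (!response.ok) throw new Error(`Restlet call failed: ${response.status}`);
    return response.json();
}

async function run(allIds, batchSize = 200) {
    for (let i = 0; i < allIds.length; i += batchSize) {
        const batch = allIds.slice(i, i + batchSize);
        // Sequential here; could be parallelized up to the account's concurrency limit.
        await postBatch(batch);
    }
}
```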
s
Exactly, this tracks with my experience as well. While the later stages perform very well with any amount of data, the first stage needs to be tuned/optimized. One thing to do is trim out any columns or fields from your search or query that aren't used. Alternatively, running multiple executions with fewer results each is what works for us. We use a control script to manage that in some cases.
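For reference, a control script of that kind could look roughly like the sketch below. The script ids, deployment id, and parameter name are assumptions, not the poster's actual setup; it just shows a scheduled script re-submitting the Map/Reduce with a capped batch via N/task.

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType ScheduledScript
 */
// Hypothetical control script that submits the Map/Reduce with a capped batch,
// so each execution only handles part of the dataset.
define(['N/task'], (task) => {
    const execute = () => {
        const mrTask = task.create({
            taskType: task.TaskType.MAP_REDUCE,
            scriptId: 'customscript_scrub_mr',
            deploymentId: 'customdeploy_scrub_mr',
            params: { custscript_scrub_batch_limit: 30000 }
        });
        const taskId = mrTask.submit();
        log.audit('Submitted scrub batch', taskId);
        // A real control script would also check task.checkStatus(...) before
        // submitting the next batch, or let the M/R summarize stage re-trigger it.
    };
    return { execute };
});
```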