# suitescript
e
How do y'all handle low throttling limits for external APIs in large integrations? I've got a M/R processing hundreds or thousands of records every morning, making one call to an external service per record. The external service allows 120 requests per minute, and I'm constantly smashing into the limit, even when limiting my M/R to one processor.
j
Create a 'queue' custom record instead of hitting the API directly, and use a separate process to manage the queue. The problem you can always run into is backlog... if the NS process works much faster than the external service, and you generate records in NS faster than you can push them out, you'll never get on top of the backlog.
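A minimal sketch of that queue pattern, assuming a hypothetical custom record (customrecord_api_queue) with status and payload fields and a made-up external endpoint; a scheduled script drains up to 100 entries per run so each pass stays under the 120/minute limit:

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType ScheduledScript
 * Drains a hypothetical API queue custom record; record/field IDs and the
 * external URL are placeholders, not a real implementation.
 */
define(['N/search', 'N/record', 'N/https'], (search, record, https) => {
    const execute = () => {
        // Grab up to 100 pending queue entries so one run stays under 120/minute
        const pending = search.create({
            type: 'customrecord_api_queue',
            filters: [['custrecord_queue_status', 'is', 'PENDING']],
            columns: ['custrecord_queue_payload']
        }).run().getRange({ start: 0, end: 100 });

        pending.forEach((entry) => {
            // One external call per queue entry
            const response = https.post({
                url: 'https://example.com/api/endpoint', // hypothetical
                body: entry.getValue({ name: 'custrecord_queue_payload' })
            });

            // Mark the entry so it drops out of the next run's search
            record.submitFields({
                type: 'customrecord_api_queue',
                id: entry.id,
                values: {
                    custrecord_queue_status: response.code === 200 ? 'SENT' : 'FAILED'
                }
            });
        });
    };

    return { execute };
});
```

The deployment schedule then controls the overall drain rate, which is exactly where the backlog concern above comes in.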
m
Quick and dirty (and admittedly crappy) idea: add an otherwise-pointless load/save of a record that you know takes at least a second to process right after that API call to slow that phase down.
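If anyone did resort to that, it is literally just a throwaway load/save in the map stage right after the call; the record type and ID here are only placeholders:

```javascript
// Inside map(), immediately after the external API call: a deliberately
// pointless load/save that exists only to burn ~a second and pace the stage.
// Assumes `record` is N/record; `poId` and the type are whatever cheap record you pick.
const throwaway = record.load({ type: record.Type.PURCHASE_ORDER, id: poId });
throwaway.save(); // no changes made; the round trip is the point
```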
e
I'm driving the M/R with a search of Drop Ship POs for a specific Vendor which are Pending Fulfillment, and checking the status of the corresponding PO in the external system to see if they've fulfilled it. So I'm pulling data down from the external API, and there's nothing for the script to do without that data. There is a request to pull a list of POs from the external system, but I can't filter it in the request for specific orders, so I'd always be searching through the entirety of this ever growing list, sending more and more requests just to get the complete list. I'd prefer of course to avoid injecting artificial delays.
b
the delay is not really artificial, your external api demands it
m
Agreed on your preference for not wanting to inject artificial delays - I definitely wouldn't want to do what I suggested if I could avoid it, and I almost didn't even mention it for that reason. But it should work and wouldn't take very long to implement, so that's something at least. There's probably a better/cleaner way to slow down a script, but I thankfully haven't run up against that need in our instance so I personally don't have any other suggestions.
s
How about, instead of a full M/R, syncing into a workflow which initiates on demand or on a schedule (same workflow per record), with workflow action scripts which execute the call? That might segregate things enough that your request rate drops below 120 per minute, depending on how many records you have.
s
I think the workflow scheduler has a minimum 30-minute delay?
s
it does, but that's not what I'm suggesting. I'm saying: have something which triggers the workflow (could be a create/update), and it'll trigger a workflow + WFAS that starts the process per record, instead of doing it via batch
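Roughly, the per-record piece would be a workflow action script on that workflow, something like the sketch below (the external URL and response shape are made up):

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType WorkflowActionScript
 * Checks the external status for the single PO the workflow is running on.
 * URL and response shape are hypothetical.
 */
define(['N/https'], (https) => {
    const onAction = (context) => {
        const po = context.newRecord;

        // One call for just this record, instead of a batch of hundreds
        const response = https.get({
            url: 'https://example.com/api/po-status?tranid=' +
                po.getValue({ fieldId: 'tranid' }) // hypothetical endpoint
        });

        // Return a value the workflow can branch on (e.g. to transition states)
        return JSON.parse(response.body).fulfilled ? 'FULFILLED' : 'PENDING';
    };

    return { onAction };
});
```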
s
I've never condoned hacks for this that burn NS resources... however, one thing that pops to mind is creating an external webhook that takes a delay parameter (perhaps an integer of ms)
s
I agree, a webhook was the first thing that came to mind, but it depends on the technology allowed
s
I have to presume that an outbound N/https call is blocking/async from the NS server perspective, and hence should not consume server resources whilst waiting for the response (well, only the TCP handle and such).
I think that would accommodate short delays, up to 5 minutes (isn't that the max time NS will wait for an HTTP request to respond?)
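A sketch of that webhook-delay idea, with the endpoint entirely hypothetical (it would just sleep for the requested number of milliseconds before responding); because the N/https call is synchronous, it blocks the map stage for that long without burning extra governance:

```javascript
// Assumes `https` is the N/https module already loaded by the script.
// The endpoint is hypothetical: it sleeps `ms` milliseconds, then returns 200.
function pace(ms) {
    https.get({ url: 'https://example.com/delay?ms=' + ms });
}

// In map(), right after the real external call:
// pace(500); // ~2 calls/second keeps you at roughly 120/minute
```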
s
I feel like if there's a real-time throttling limit and it's being hit, I can only suppose the process can't run in real time, and that's why the M/R was deployed as such? The NS concurrency limit is quite pricey, so I guess I understand if that's the case. Is exploring running real-time updates a possibility?
s
the throttling is more likely to avoid abuse and to even out load
IIRC some Amazon APIs have this 120/minute limit
s
I feel like the problem then is the load balancing in NS. What you're saying is that if you could limit the M/R to running 120 records or fewer, you'd be okay, and that assumes you're calling the same URL with an M/R concurrency of 1.
I was gonna say something stupid like, if you could hit an https .promise call, that might give you a chance, but I'm probably not hitting the question properly haha
s
@erictgrubaugh another rough approach could be to run in small batches - small enough that you know the time between executions would be sufficient to keep you under the 120/minute limit. e.g. run 120, exit. run again, exit, rinse and repeat?
e
I've capped the search at 100 and shifted to run every 4 hours instead of once per day
will continue reducing the schedule interval if necessary
Luckily it's a relatively low-volume integration, but this would not be practical for anything high-volume (thousands or 10s of thousands per day)
d
Hi Eric, have you tried using a script parameter so that you can pass in a list of "pending" record IDs? This way, when the script runs for the first time each morning, you could use your saved search to get the initial list of pending records (check the parameter value and if empty use the search), then in the map stage query the external service. When you run into the limit, pass the failure result to the summarize stage, compile the failed IDs into a list, and re-queue the script, passing the failed ID list via the parameter.
You might have the same IDs fail multiple times, but the script should be able to re-queue itself until the "failure" list eventually empties.
You could also consider setting a "try" limit for each ID, just so that the script doesn't go into an infinite loop (if the external server goes down, for example)
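A sketch of that re-queue pattern, with the script, deployment, and parameter IDs all made up; the summarize stage collects the keys that errored in map (e.g. on a 429) and resubmits the same M/R with them in the parameter:

```javascript
/**
 * summarize: gather PO IDs whose map() call failed and re-queue this M/R
 * with those IDs in a script parameter. Assumes `task` is N/task; all
 * script/deployment/parameter IDs are placeholders.
 */
const summarize = (summary) => {
    const failedIds = [];

    summary.mapSummary.errors.iterator().each((key, error) => {
        failedIds.push(key); // key is the PO internal ID from getInputData
        return true;         // keep iterating
    });

    if (failedIds.length > 0) {
        task.create({
            taskType: task.TaskType.MAP_REDUCE,
            scriptId: 'customscript_po_status_mr',
            deploymentId: 'customdeploy_po_status_mr',
            params: { custscript_pending_po_ids: JSON.stringify(failedIds) }
        }).submit();
    }
};
```

The "try" limit could live in that same parameter (e.g. an ID-to-attempt-count map) so an order that keeps failing is eventually dropped.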
e
It seems like that's what already happens:
1. getInputData - Search finds Pending Purchase Orders
2. map - For each PO from 1, query the external service for status; pass Fulfilled PO IDs on
3. reduce - For each Fulfilled PO from 2, transform the corresponding Sales Order to an Item Fulfillment
Any that either fail or simply are not fulfilled yet during map will remain as Pending POs, so the next time the script runs, it will find them again. For the time being, capping my getInputData at 100 results (i.e. under the limit) and running more often allows me to run in parallel and should keep up with the volume. At a larger volume, perhaps your solution is better
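For anyone following along later, the overall shape of that flow looks roughly like the sketch below; the saved search ID, external URL, column names, and response shape are placeholders, not the actual script:

```javascript
/**
 * @NApiVersion 2.1
 * @NScriptType MapReduceScript
 * Rough shape of the flow described above; all IDs/URLs are placeholders.
 */
define(['N/search', 'N/record', 'N/https'], (search, record, https) => {

    // 1. Search finds Pending drop-ship POs (capped at 100 per run)
    const getInputData = () =>
        search.load({ id: 'customsearch_pending_dropship_pos' })
            .run()
            .getRange({ start: 0, end: 100 });

    // 2. One external call per PO; only pass fulfilled POs on to reduce
    const map = (context) => {
        const result = JSON.parse(context.value);
        const response = https.get({
            url: 'https://example.com/api/po-status?po=' + result.values.tranid // hypothetical
        });
        if (JSON.parse(response.body).fulfilled) {
            // createdfrom is a select column, serialized as { value, text }
            context.write({ key: result.id, value: result.values.createdfrom.value });
        }
    };

    // 3. Transform the linked Sales Order into an Item Fulfillment
    const reduce = (context) => {
        const salesOrderId = context.values[0];
        const fulfillment = record.transform({
            fromType: record.Type.SALES_ORDER,
            fromId: salesOrderId,
            toType: record.Type.ITEM_FULFILLMENT
        });
        fulfillment.save();
    };

    return { getInputData, map, reduce };
});
```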
d
Ok, I see now, I didn't realize that you were transforming the records, and therefore that those records wouldn't appear in subsequent runs. Are you also considering the possibility that the same transaction might fail multiple times for unrelated reasons? For example, say you query 100 results, and for whatever reason the first 10 are "blocked" in pending fulfillment (e.g. out of stock, etc.). The remaining 90 are fulfilled, so you transform those. Next time the script runs, the same 10 might still be pending, and this time another 10 of the 90 "new" records are blocked. Depending on real-world conditions, eventually you might end up querying the same 100 orders multiple times if they all contain the same out of stock item, while other orders (not included in the 100 you're getting as search results each time) are being fulfilled.
e
Yeah all that is taken care of by the search
(i hope) 😄
The search is responsible for figuring out what to query the external platform for
the Map/Reduce is just a dumb transformer 🙂
All good thoughts though