# suitescript
m
If I have a map/reduce script with one deployment and I "schedule" it via N/task, is there a limit on how many instances of it I can submit via the task?
c
You can specify the concurrency in the deployment
m
Are you sure that is what that concurrency means? I thought it meant something else, but even if so, I believe there is an upper limit of 5 on that concurrency in the environment I'm working in, and that is still an issue. I thought that when you "schedule" it, it just goes into a queue: only one processes at a time and the rest remain in the queue.
"Use this field to set the number of processors available to the deployment. This value equates to the number of jobs submitted for the map and reduce stages." - I think this dictates whether you get one or more values processed at a time in the map/reduce stages
c
I've just been working on one. I submit a task using N/task and a named deployment. If I try to schedule it when it's already running I get a
MAP_REDUCE_ALREADY_RUNNING
exception
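For reference, the call that triggers it looks roughly like this (a sketch; the script and deployment ids are placeholders):
```js
// Sketch: submit a map/reduce task against a specific deployment.
// Script and deployment ids are placeholders.
require(['N/task', 'N/log'], (task, log) => {
    const mrTask = task.create({
        taskType: task.TaskType.MAP_REDUCE,
        scriptId: 'customscript_invoice_wo_mr',
        deploymentId: 'customdeploy_invoice_wo_mr'
    });
    // submit() throws MAP_REDUCE_ALREADY_RUNNING if this exact
    // deployment is already queued or executing
    const taskId = mrTask.submit();
    log.debug('submitted', taskId);
});
```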
m
exactly, that is the issue I'm running into
m
You can create multiple deployment records for the script. When you use the N/task module to schedule it, don’t specify a deploymentId and NetSuite will use the next available script deployment. If you have 5 deployments, you can execute 5 instances of the M/R script.
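Roughly like this (sketch, same placeholder script id as above):
```js
// Sketch: omit deploymentId and NetSuite picks the next available
// (not already queued/running) deployment of this script.
// Assumes N/task is already loaded as task, as in the previous snippet.
const mrTask = task.create({
    taskType: task.TaskType.MAP_REDUCE,
    scriptId: 'customscript_invoice_wo_mr' // placeholder id
});
mrTask.submit();
```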
m
is there a limit to the number of deployments I can have on the MR script? This is used to process invoices for work orders (custom record), so I am not exactly certain how many work orders will need to be invoiced at the same time (not talking about batch invoicing)
So is it a true statement that if I create 50 deployments, technically that script can be queued/run 50 times concurrently?
c
Depends on the number of SuiteCloud Plus licenses you have
It's a map/reduce, it'll parallelise anyway
m
yes, I don't need it to process 50 instances at the same time, I just need it to queue up and process one at a time
what I don't want to get is that error that you got, which stops the invoicing in my case
s
Having 50 deployments of the same MR is bad design IMO, but yes you could technically do that
m
Why is it bad design? My use case is as follows: I have a custom record (work order) that can generate up to 10 invoices (different billing customers, billing types, etc.); that process takes 30s at best and minutes at worst. I don't want it to be a user event / suitelet because the user clicks Invoice and then has to wait. I moved all that logic to an MR script, so the experience for the user is instant: they click Invoice, it triggers the MR script passing the work order id as a parameter, and the script runs. The same script can handle an array of work order ids (future batch invoicing). Now, two users can try to invoice 2 work orders at similar times, which currently with 1 deployment will run into issues. Having multiple deployments seems like it will prevent the error from happening. I am absolutely open to better design. Although the invoicing doesn't have to be "immediate/realtime", I cannot have a scheduled script that runs at midnight, for example, to mass invoice work orders.
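The trigger side looks roughly like this (a sketch only; the script id and parameter id are placeholders, and it is sketched here as a user event even though in practice it is whatever server-side script handles the Invoice button):
```js
/**
 * @NApiVersion 2.1
 * @NScriptType UserEventScript
 */
// Sketch of the "click Invoice -> kick off the MR" handoff.
// Script id and parameter id are placeholders.
define(['N/task', 'N/log'], (task, log) => {
    const afterSubmit = (context) => {
        const workOrderId = context.newRecord.id;
        const mrTask = task.create({
            taskType: task.TaskType.MAP_REDUCE,
            scriptId: 'customscript_invoice_wo_mr',
            params: {
                // MR script parameter holding the work order id(s) to invoice
                custscript_wo_ids: JSON.stringify([workOrderId])
            }
        });
        log.audit('invoicing MR queued', mrTask.submit());
    };
    return { afterSubmit };
});
```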
s
If the process does not need to happen immediately, you could implement some sort of queue and just have the MR check the queue for which ids need to be invoiced. MRs can run way more frequently than once per day
"Now, two users can try to invoice 2 work orders at similar times, which currently with 1 deployment will run into issues. Having multiple deployments seems like it will prevent the error from happening."
I feel like this exact problem would be why you would not want multiple deployments. 2 different people click the button and now you have 2 deployments trying to create duplicate things
m
well no, it won't be duplicate things; once you click Invoice on a work order it marks that work order as processing and cannot be clicked a second time until it's invoiced. I'm trying to prevent the error happening when two people try to invoice 2 different work orders at similar times
although invoicing doesn't have to be "realtime", it is a customer facing process, meaning a minute or two (maybe 5?) is acceptable; making the customer wait up to 15 min for an invoice (I think that is the smallest interval you can have an MR run on) will not be acceptable to the customer. Everything you said though is the correct design for non-time-sensitive processes and I use that for other cases. Maybe I miscommunicated the need for timely invoicing lol
And just to make sure we are both on the same page on why this is a bad design: the issue is that it doesn't scale well, meaning I can create 10 deployments and in some instances, if more than 10 people try to invoice a work order at the same time, I will still get the error. Also, the overhead of queueing up those instances and having to wait for one to finish before another starts is worse than having 1 instance process multiple work orders at the same time. Right?
s
I would think so, yes; it's minimal overhead, but it also depends on how much other stuff is already running in your system. There are usually lots of processes/jobs trying to run in the background, so queueing up a bunch of the same MR to clog the queue is generally not ideal
m
yes I agree, and I have seen that happen first hand in an environment; it wasn't pretty. If the invoicing process wasn't time sensitive I would definitely not use the queue like that, but forcing a user to wait 1-5 min on the screen for the invoices to be created is also not an acceptable solution. I do appreciate all of the help received here, and most of all the fact that not only do y'all answer my direct question but also question my question lol
b
I vote queue
making a list of records to process is much better for error handling
m
as in a queue, and then run through that queue every 15 min?
b
as in you make your map/reduce do a search or query to make a list of records to process
then process all those records
m
I agree, but I feel like making a customer wait at worst 15 min before it even tries to process their invoice is a good design for us and a bad one for the customer
b
you don't need to make the user wait 15 minutes
m
as in have both a scheduled and an on demand deployment?
b
you can have the user event add the task to the queue
m
correct, but if the queue processes every 15 min, isn't that the same as making the customer wait 15 min (worst case)? or are you saying have a queued process every 15 min and an on demand one
b
depends on how you want to structure it
the most robust usually means having both a scheduled and an on demand deployment
which means you need to have a plan to make sure that they both don't process the same record at the same time
m
yes, but then that would require me to make sure that if an on demand one runs, it removes the record from the criteria that would match it on the scheduled one, and also to check that the two processes don't write over each other
yeah what you said lol
b
usually through a checkbox on the record that says it is being processed, which you check and uncheck
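something like this (record type and field id are made up):
```js
// Sketch: lock/unlock a work order with a "processing" checkbox.
// Record type and field id are placeholders.
require(['N/record'], (record) => {
    const setProcessing = (workOrderId, isProcessing) => {
        record.submitFields({
            type: 'customrecord_work_order',
            id: workOrderId,
            values: { custrecord_wo_processing: isProcessing }
        });
    };
    setProcessing(123, true);  // check it before/while invoicing
    setProcessing(123, false); // uncheck it when invoicing finishes or fails
});
```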
if you don't care about the error handling aspect of it, you can just have a single on demand deployment
m
Ok, so let's run through that situation. The only way to "mark" a work order as ready to be invoiced is to push the Invoice button, which then adds it to the on demand queue, which negates the need for the scheduled queue. What you are saying is I should have Invoice Now and Invoice "Later" options: for time sensitive ones it runs on demand, and for non-sensitive ones it just flags it as ready to be invoiced and the scheduled queue picks it up
b
I didn't really say there is a now-or-later aspect to this
m
well then which work order will hit the 15min scheduled queue if all of them get added to the on demand queue?
b
use the scheduled queue for error handling
when the on demand fails
I actually recommend a single on demand deployment
that does a search or query for whatever records it needs to process
have it reschedule itself when it's done
make it stop rescheduling itself when its search or query says there are no more records to process
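a rough skeleton of that pattern (all record/field ids are made up, and the actual invoicing logic is omitted):
```js
/**
 * @NApiVersion 2.1
 * @NScriptType MapReduceScript
 */
// Sketch of a self-rescheduling "queue worker" map/reduce.
// Record type and field ids are placeholders.
define(['N/search', 'N/task', 'N/runtime', 'N/log'], (search, task, runtime, log) => {

    // anything still flagged ready-to-invoice is the work list
    const pendingSearch = () => search.create({
        type: 'customrecord_work_order',
        filters: [['custrecord_wo_ready_to_invoice', 'is', 'T']],
        columns: ['internalid']
    });

    const getInputData = () => pendingSearch();

    const map = (context) => {
        const workOrderId = JSON.parse(context.value).id;
        // ... create the invoices for this work order,
        // then clear the ready-to-invoice flag ...
        log.debug('processed work order', workOrderId);
    };

    const summarize = (summary) => {
        // reschedule this same script/deployment if records are still pending
        if (pendingSearch().runPaged().count > 0) {
            const script = runtime.getCurrentScript();
            task.create({
                taskType: task.TaskType.MAP_REDUCE,
                scriptId: script.id,
                deploymentId: script.deploymentId
            }).submit();
        }
    };

    return { getInputData, map, summarize };
});
```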
m
but then isn’t that a possible infinite loop, lets say something stupid happens during the invoicing process (that i didn’t know about, think data quality crap) and the invoicing fails and fails to “uncheck” the work order from “to be invoiced” it will indefinitely reschedule itself and fail
now we need to way to have a field on the work order that counts how many invoicing attempts happened and lets say ignore anything after 3 attempts
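something like this maybe (sketch; the field id is made up and would default to 0 on the record):
```js
// Sketch: cap retries with a numeric "invoice attempts" field on the work order.
// Record type and field id are placeholders.
require(['N/search', 'N/record'], (search, record) => {
    const FIELD_ID = 'custrecord_wo_invoice_attempts';

    // bump the counter whenever invoicing a work order fails
    const recordFailedAttempt = (workOrderId) => {
        const current = parseInt(search.lookupFields({
            type: 'customrecord_work_order',
            id: workOrderId,
            columns: [FIELD_ID]
        })[FIELD_ID], 10) || 0;

        record.submitFields({
            type: 'customrecord_work_order',
            id: workOrderId,
            values: { [FIELD_ID]: current + 1 }
        });
    };

    // and in getInputData, skip anything already retried 3 times, e.g.
    // filters: [[FIELD_ID, 'lessthan', 3], 'AND', [...]]
});
```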
b
it's not actually required to have a field that counts, theoretically you can use system notes
but it's much prettier to have a field that tracks how long a record has been locked
but again, the easiest is to have one script deployment for normal use
and one for error handling
and not actually mix the two
if you mix them, you need that locking mechanism
m
and when you say error handling, are we talking about unexpected errors such as data quality, or about the other error that started this whole thread, two people invoicing two different work orders at the same time?
b
you actually ignore
MAP_REDUCE_ALREADY_RUNNING
it means that your on demand is already running
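i.e. just swallow that one error when you submit (sketch, assuming N/task is already loaded as task):
```js
// Sketch: treat MAP_REDUCE_ALREADY_RUNNING as "fine, the running
// instance will pick the record up on its next search pass".
try {
    task.create({
        taskType: task.TaskType.MAP_REDUCE,
        scriptId: 'customscript_invoice_wo_mr' // placeholder id
    }).submit();
} catch (e) {
    if (e.name !== 'MAP_REDUCE_ALREADY_RUNNING') {
        throw e; // anything else is a real problem
    }
}
```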
m
which then means that it will get picked up by the 15min one
b
and will already reschedule itself
you have one map/reduce script, that is for on demand
make it do a search so it processes all the records that need to be processed
when it's done, make it reschedule itself to do the search again
don't make it reschedule itself if there are no records to process
m
which then eliminates the need for an actual scheduled instance of the MR, because it gets triggered from on demand and keeps going until nothing else is there to invoice
and the only modification needed is a way to know when to stop trying if the work order will just always fail to invoice
b
I personally like a separate scheduled map/reduce for records that fail once
the on demand will quickly burn through its retry attempts before the cause of the problem can be fixed
m
so have 2 on demand scripts: the first one attempts, and if it fails it reschedules the second on demand (error handler) to keep trying separately
b
have the second one for errors, and have it on an actual schedule
m
so the on demand goes through, tries, and if it fails it fails; the failed ones get picked up by the scheduled one
b
anytime you mix multiple deployments processing the same records, you need logic to make sure you don't double process the records
m
well that wouldn’t be true if you can only queue up a work order once until that process is successful or fails which allows you to queue it up again
which means i can only create 1 running instance PER work order, and i dont have to worry about double processing
only thing i have to worry is having enough deployments to handle multiple queues
that is my understanding from this thread going towards the beginning
b
there is basically no concurrency related to a script deployment
a script deployment can only be queued or not queued
there is no multiple queueing of the same script deployment
m
so what was said above, having multiple deployments and not specifying a deployment id, will not queue a non-running deployment automatically?
b
however, that script deployment can use multiple script processors to run its map and reduce entry points
N/task has an optional deploymentId parameter
if you set it, it tries to schedule that particular deployment
it will error if it's already queued
if you don't set the deploymentId, then N/task will try to pick an unscheduled script deployment to schedule
I use the word try because, as far as I can tell, N/task doesn't use thread-safe algorithms to determine which unscheduled script deployment to schedule
m
yes, that is what was said above, so my understanding is that if I have 10 deployments for that script and don't specify a deployment id, and let's say deployment1 is running, NS will schedule deployment2, deployment3, etc., and those will wait for 1 to finish, then 2 starts, then 3 after 2 finishes, etc. In other words, an actual queue
b
if by finish, you also mean yield, then yes
it's possible for deployment 1 to yield and let deployments 2, 3, 4 and so on run before continuing
m
that's fine though, because each deployment is for its own work order
they never mix the same work order
b
it's also possible for script deployments 2, 3, 4 and so on to run at the same time as 1
your account has multiple script processors
and each one will process things waiting in the queue
m
2, 3, 4 running at the same time still shouldn't be a problem since each one only has 1 work order to process, correct?
say process 1 has 100 invoices from that work order and yields at 50, then process 2 starts and creates its 10 invoices, then process 1 resumes and finishes the remaining 50
but since each process has only 1 work order and there is no way for 2 processes to have the same work order, it still shouldn't cause an issue
b
if you still want one map/reduce deployment per record
then your assumption is fine unless two people try to queue the same record at the same time
m
correct, that is not possible unless somehow literally they click on it at the exact same second and NS’s built in “record has changed” functionality fails me
I have learned a lot about how the scheduling/queue/processing happens in NS from this convo, thank you a lot!