you get automagic load balancing across your processors/queues, time limits, and better performance and efficiency (and the logic is cleaner to write)
s
stalbert
02/07/2019, 1:01 AM
I must respectfully disagree on that last point. MR logic is invariably more complex in our use cases.
stalbert
02/07/2019, 1:03 AM
(noting that our logic for a scheduled script is often a single elegant chain of a few methods on a lazy sequence).
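For readers unfamiliar with the style, here is a minimal sketch of what "a chain of a few methods on a lazy sequence" might look like. It is plain JavaScript using generators, not the poster's actual code and not a NetSuite API; the helper names (`lazyFilter`, `lazyMap`, `lazyTake`), the record shape, and `fakeSearchResults` are all hypothetical stand-ins.

```javascript
// Hypothetical lazy helpers built on generators; nothing here is a NetSuite API.
function* lazyFilter(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) yield item;
  }
}

function* lazyMap(iterable, fn) {
  for (const item of iterable) yield fn(item);
}

function* lazyTake(iterable, count) {
  if (count <= 0) return;
  let taken = 0;
  for (const item of iterable) {
    yield item;
    taken += 1;
    if (taken >= count) return; // stop pulling from the source early
  }
}

// Stand-in for search results (assumed shape); a real script would read
// these from a search instead of a hard-coded generator.
function* fakeSearchResults() {
  yield { id: 1, status: 'open', amount: 100 };
  yield { id: 2, status: 'closed', amount: 250 };
  yield { id: 3, status: 'open', amount: 75 };
  yield { id: 4, status: 'open', amount: 40 };
}

// The whole job reads as one pipeline: filter, transform, limit.
function processOpenOrders(results, limit) {
  const open = lazyFilter(results, (r) => r.status === 'open');
  const summaries = lazyMap(open, (r) => ({ id: r.id, doubled: r.amount * 2 }));
  return Array.from(lazyTake(summaries, limit));
}
```

Because each stage pulls one item at a time, `processOpenOrders(fakeSearchResults(), 2)` stops reading the source once it has two matches, and the logic stays in a single function rather than being split across stages.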
c
creece
02/07/2019, 1:31 AM
so you're saying a 1.0 scheduled script is cleaner to write than a 2.0 map/reduce script w/ load balancing?
s
stalbert
02/07/2019, 2:16 PM
I'm saying the actual code is much simpler (imho) in the 1.0 script. If parallel processing isn't needed, I don't see a reason to take on the multi-stage complexity and other shortcomings of the MR script. As on any platform, a normal scheduled script is single-threaded, which generally makes it easier to reason about, debug, and write than multithreaded code, even if NS provides us a framework for multithreading.
j
jkabot
02/07/2019, 3:56 PM
If your work does not benefit from being split up into independent chunks, then there is no advantage in using the Map/Reduce architecture.
For example, I have a job that runs a few times a day. It runs one search, computes some results, and makes one HTTP request. There are no independent chunks, so there is no parallelization and no load balancing.
A map/reduce script would just add unnecessary overhead. A scheduled script is simpler.
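As a rough illustration of the kind of job described above, the sketch below runs one search, computes one result, and makes one request, all in a single pass. It is plain JavaScript, not the actual script; `runSearch` and `sendRequest` are hypothetical injected functions standing in for whatever search and HTTP calls a real scheduled script would make, and the `amount` field is an assumed row shape.

```javascript
// Sketch only: runSearch and sendRequest are hypothetical stand-ins,
// passed in so the example runs without any NetSuite modules.
function runJob(runSearch, sendRequest) {
  const rows = runSearch(); // one search
  // one computation over the full result set
  const total = rows.reduce((sum, row) => sum + row.amount, 0);
  const payload = { count: rows.length, total: total };
  return sendRequest(payload); // one outbound request
}

// Example wiring with fake dependencies.
const sent = runJob(
  () => [{ amount: 10 }, { amount: 32 }],
  (payload) => payload // a real implementation would send this over HTTP
);
```

Nothing in this flow splits into independent units of work: the single request depends on the total over every row, so there is nothing for separate map or reduce stages to process in parallel.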