# suitescript
k
i use file.lines.iterator() to iterate through all the lines of a 10 MB (~50,000 line) csv file in my map/reduce script, but it is significantly slow, i.e. taking more than 20 minutes at least. is that expected, or does that sound specific to my NetSuite account?
b
how long would you expect it to take?
How many lines would you expect to process per second?
k
i don't have much experience with netsuite, but 1 ms per line sounds fair
b
i would only expect millisecond-level performance if your script doesn't use any suitescript apis
except maybe a log
s
Are you seeing slowness in processing each line, or is it a long delay between getInputData and map?
k
the job was created at ~2:45 PM and it is still running. i haven't written a log from within the lines.iterator().each function to visualize the elapsed time; i can try that
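(A rough sketch of what that timing log could look like inside getInputData, assuming a SuiteScript 2.x map/reduce; the file path and the 5,000-line logging interval are made up for illustration.)

```javascript
/**
 * @NApiVersion 2.x
 * @NScriptType MapReduceScript
 */
define(['N/file', 'N/log'], function (file, log) {

    function getInputData() {
        var csvFile = file.load({ id: 'SuiteScripts/import.csv' }); // hypothetical file path
        var start = Date.now();
        var lineCount = 0;

        csvFile.lines.iterator().each(function (line) {
            lineCount++;
            if (lineCount % 5000 === 0) {
                // log elapsed time every 5,000 lines to see where the time is going
                log.audit('progress', lineCount + ' lines in ' + (Date.now() - start) + ' ms');
            }
            return true; // returning true keeps the iterator going
        });

        // ...build and return the input data here...
    }

    return { getInputData: getInputData };
});
```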
s
Yes, it appears that getInputData() is serial - as in it gathers up all your results before it sends anything to map()
That is, map() runs in parallel, but not until getInputData() has finished entirely. Would love to hear advice to the contrary, but that's been my empirical observation.
that can be a pain, and indeed I've sometimes found a scheduled script outperforms a MR because it can start making progress immediately.
k
does that mean that due to the large amount of data returned from the getInputData stage, there is a long delay between getInputData and map?
s
Yes - I had a similar problem where I had a large search - the search itself would eventually time out when run in getInputData in a MR script, so it never even reached map.
I expected MR scripts to feed map() data in parallel and incrementally while the data was being returned from getInputData(), but it doesn't seem to operate in that sort of 'streaming' fashion.
In my case, it seemed clear that even though my getInputData returned a search reference, behind the scenes it must have been executing the search and trying to load ALL the results from the entire search. Perhaps the same is happening with ALL the lines from the file you're iterating?
k
i simply -- 1. have a global array variable to accumulate the records, 2. load the file and iterate over each line, converting it to JSON and pushing it into the array, 3. return the array at the end
i'm surprised map/reduce doesn't work well for that size, since it's only 50,000 lines. what i have found is that people are satisfied with the results when they process > 100,000 records within one map/reduce instance
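(A minimal sketch of the approach described above, for clarity; the file path and the id/amount column names are assumptions, not from the original script. Note everything is accumulated in memory before getInputData returns.)

```javascript
/**
 * @NApiVersion 2.x
 * @NScriptType MapReduceScript
 */
define(['N/file', 'N/log'], function (file, log) {

    function getInputData() {
        var records = [];                                            // accumulator held in memory
        var csvFile = file.load({ id: 'SuiteScripts/import.csv' });  // hypothetical file path

        csvFile.lines.iterator().each(function (line) {
            var cols = line.value.split(',');
            // convert each CSV line into an object and accumulate it
            records.push({ id: cols[0], amount: cols[1] });          // made-up column names
            return true;                                             // keep iterating
        });

        return records;  // the entire array is returned at once
    }

    function map(context) {
        // each array element arrives serialized as a string
        var record = JSON.parse(context.value);
        log.debug('map', record.id);
        // ...process a single record...
    }

    return { getInputData: getInputData, map: map };
});
```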
b
unfavorable memory-wise
you max out at 50 MB of memory
and your processing is probably not helping
k
hmm the file size itself is only 10 MB, even though it has 50,000 lines
b
return the file object in getInputData and do your processing in map
you are basically generating a large object in memory and the garbage collector is probably panicking
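(A hedged sketch of that suggestion: return the file object from getInputData and parse each line in map. My understanding is that the framework then passes each line of the file to map as context.value; the file path and column names below are placeholders.)

```javascript
/**
 * @NApiVersion 2.x
 * @NScriptType MapReduceScript
 */
define(['N/file', 'N/log'], function (file, log) {

    function getInputData() {
        // Return the file object itself; each line is handed to map()
        // by the framework, so nothing is accumulated here.
        return file.load({ id: 'SuiteScripts/import.csv' });  // hypothetical file path
    }

    function map(context) {
        // context.value is one raw CSV line
        var cols = context.value.split(',');
        var record = { id: cols[0], amount: cols[1] };  // made-up column names
        log.debug('map', record.id);
        // ...process the single record here...
    }

    return { getInputData: getInputData, map: map };
});
```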
k
ah i see, interesting, let me try it out
🙏