AFAIK the getInputData output is good and returns a single result with the right number of SOs. Also, this seems to happen when multiple Map/Reduce scripts are running at the same time.
What I think is that the duplicate happens whenever the Map/Reduce yields and then starts again, but I've had no luck reproducing it; it is a very, very inconsistent problem...
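
If the yield/restart theory is right, one way to test it would be to make the map step idempotent and log whenever it gets re-invoked for the same key. Below is a minimal sketch in TypeScript that only simulates the idea outside NetSuite. `MapContextLike`, `processedKeys`, `createdFor`, `restartLog` and `processSalesOrder` are hypothetical names made up for illustration. The `isRestarted` flag is modeled on the property of the same name on the SuiteScript map context, but treat how it behaves here as an assumption. In a real script the "already processed" check would have to read something persistent (for example a field on the SO), because in-memory state would not survive a restart.

```typescript
// Hypothetical stand-in for the map-stage context; only the fields used here.
interface MapContextLike {
  key: string;          // e.g. the Sales Order id
  value: string;        // serialized search result (unused in this sketch)
  isRestarted: boolean; // assumed: true when map is invoked again for the same key
}

// Hypothetical marker store. In NetSuite this would need to be persistent,
// since a plain in-memory Set is lost when the script restarts.
const processedKeys = new Set<string>();

// Records which keys actually had a record created, so duplicates are visible.
const createdFor: string[] = [];

// Records every re-invocation, to confirm or rule out the restart theory.
const restartLog: string[] = [];

function processSalesOrder(ctx: MapContextLike): "created" | "skipped" {
  if (ctx.isRestarted) {
    restartLog.push(`map re-invoked for key ${ctx.key}`);
  }

  // Idempotency guard: skip the work if this key was already handled.
  if (processedKeys.has(ctx.key)) {
    return "skipped";
  }

  createdFor.push(ctx.key);   // placeholder for the real record creation
  processedKeys.add(ctx.key); // mark as done right after creating
  return "created";
}

// Simulate a normal run followed by a re-invocation for the same SO.
const first = processSalesOrder({ key: "SO-1001", value: "{}", isRestarted: false });
const second = processSalesOrder({ key: "SO-1001", value: "{}", isRestarted: true });
```

With a guard like this, the second call for the same key is skipped instead of creating a duplicate. If the restart log stays empty while duplicates still show up, the yield theory is probably wrong and the concurrent runs would be the next thing to look at.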