# suitescript
e
I'm analyzing the best approaches to handle duplicate data. A little bit of context: we receive CSV data daily, but the CSV could contain repeated data that was already processed on previous days. The final result is sales orders/invoices created in NetSuite. How do you typically handle/identify duplicate rows? I'm thinking of creating some custom records like this:
• Parent record (job/batch/import record) to work as a container
• Child record (individual batch or group record) to keep a history of the data processed (here I'm grouping the sales orders/invoices by date/customer)
So I'm curious how others have implemented mechanisms to make sure you're not reprocessing duplicate data, e.g. by creating a unique key or something like that. Note: there are some map/reduces involved.
e
Use external IDs where possible.
e
Let me read a little bit more about it. Is the external ID something that should come with the CSV from the source of the data, or is it something I create in the script?
a
external id should be the primary key for the data source
it has enforced uniqueness on the NS side so if it tries to duplicate it will fail
e
It doesn't matter much as long as you're consistent. Also note, once an external id is set on a record, it can't be removed.
šŸ‘ 1
a
you can also potentially concatenate a few fields to ensure uniqueness too if all you have from the source system is a simple number
so if they provide you with just an order num which increments 123, 124, 125 you can make the externalid be something like sourceSystem + _ + tranType + 123
shopify_salesorder_123
or w/e makes sense in your situation
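That concatenation scheme can be sketched as a tiny helper. This is a minimal illustration in plain JavaScript, not code from the thread; the function name and normalization are assumptions:

```javascript
// Sketch of a composite external id builder (names are illustrative).
// Combining source system, transaction type, and the source's order
// number yields a key that stays unique across systems and record types.
function buildExternalId(sourceSystem, tranType, orderNum) {
  // Normalize parts to avoid accidental mismatches like "Shopify" vs "shopify".
  return [sourceSystem, tranType, String(orderNum)]
    .map(function (part) { return String(part).trim().toLowerCase(); })
    .join('_');
}

// Example from the thread: an incrementing order number from Shopify.
// buildExternalId('shopify', 'salesorder', 123) -> 'shopify_salesorder_123'
```

Since NetSuite enforces external id uniqueness per record type, including the transaction type in the key mainly guards against collisions when the same source number feeds multiple integrations.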
e
Hey, this is really useful, guys! Is there anything I should keep in mind related to the fact that an external id can't be removed? I'm assuming it cannot be edited either.
e
They can be updated (e.g. via CSV import), you just can't clear it.
e
Great! thanks for the clarification and your insights @ehcanadian @Anthony OConnor, I'm gonna dive deeper on this.
šŸ‘ 1
c
you can run server-side scripting w/ CSV imports, so you could also write some dupe-detection logic and bail before it submits if it's a dupe.
e
Hey @creece, just to clarify, I'm not using the native CSV Import tool here, it is an in-house import integration using SuiteScript.
c
does the in-house solution group the data at all? sounds like that would solve it too
make a pass over the data to get the unique rows and then process.. i'd solve it at the lowest level possible.
e
We are getting rows of sales and grouping the data in different ways (there are settings that allow grouping by category, service, etc.), but the data does not come grouped. However, I'm checking whether we are receiving an ID that could be useful. There are probably more things to take into account here, but my main doubt was how others were implementing mechanisms to detect duplicates. In my case I'm getting sales order lines and then grouping them in different ways based on settings.
e
Belt and suspenders: filter the data before processing it, and set an external id to hopefully help with weird edge cases
💯 1
✔️ 1
b
you need to find some unique column in your csv to store in a NetSuite field so you can do a search/query to see if the transaction already exists
šŸ‘ 1
it's also what you would need to do if you want to support updates, so now is probably the time to look into adding that
šŸ‘ 2
šŸ’Æ 1
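One way to sketch that create-vs-update split in plain JavaScript. In NetSuite, `existingKeys` would be populated beforehand from an `N/search` or `N/query` over the custom field that stores the unique column; that lookup is assumed here and not shown:

```javascript
// Sketch: partition incoming rows into creates vs updates, given the
// set of keys already stored in NetSuite (fetched via search/query --
// that lookup is an assumption, not shown here).
function partitionRows(rows, existingKeys) {
  const result = { create: [], update: [] };
  rows.forEach(function (row) {
    (existingKeys.has(row.key) ? result.update : result.create).push(row);
  });
  return result;
}
```

Doing one bulk search up front and partitioning in memory keeps the governance cost down compared to searching once per row.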