Has anyone found a way to read a PDF and convert i...
# suitescript
c
Has anyone found a way to read a PDF and convert it into text, a JavaScript object, or JSON within a SuiteScript routine?
c
Crikey, no. PDFs are insane
Actually laughed that you asked
(Sorry)
There are things like https://www.npmjs.com/package/pdf2json but you'd essentially have to rewrite them and stub out a lot.
Happy to be proved wrong
s
Yeah, I have seen pdf-parse which is “cross-platform”, for whatever that’s worth. The readme links to several similar projects that may be worth checking out
c
Did you ever manage to get it to work?
s
I never tried. I’d have to agree that PDFs are insane. There is a non-zero chance that a given PDF will fail to parse, even if you get it working.
c
Is this from NetSuite to somewhere else, or into NetSuite? If it's from NetSuite, there's the N/render module, which has a few different rendering options like PDF/HTML. Really depends on what you're doing.
c
I need to open a PDF that is stored in the File Cabinet, read the text, and put it in a custom record.
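For reference, the flow being asked about could be outlined roughly as below. This is only a sketch under assumptions: `storePdfText`, `deps.extractText`, the record type `customrecord_pdf_text`, and both field IDs are hypothetical names made up for illustration, and the text-extraction step (the hard part) is left as a placeholder. The `file` and `record` objects are passed in rather than loaded through `define`, so the outline can be exercised outside NetSuite.

```javascript
// Outline only: load a PDF from the File Cabinet, extract its text, and store
// it on a custom record. The N/file and N/record modules are passed in via
// `deps` so the outline stays testable outside NetSuite.
function storePdfText(deps, pdfFileId) {
    var pdfFile = deps.file.load({ id: pdfFileId });

    // Assumption: for binary files such as PDFs, getContents() returns the
    // content as a base64-encoded string rather than raw bytes.
    var base64Pdf = pdfFile.getContents();

    // HYPOTHETICAL: extractText is not a NetSuite API. It stands in for
    // whatever parser or external service turns the PDF into plain text.
    var text = deps.extractText(base64Pdf);

    // HYPOTHETICAL record type and field IDs; substitute your own.
    var rec = deps.record.create({ type: 'customrecord_pdf_text' });
    rec.setValue({ fieldId: 'custrecord_pdf_source_file', value: pdfFileId });
    rec.setValue({ fieldId: 'custrecord_pdf_text', value: text });

    // save() returns the internal ID of the new record.
    return rec.save();
}
```

In a real script, `deps.file` and `deps.record` would be the modules injected by `define(['N/file', 'N/record'], ...)`; everything about `extractText` is where the difficulty discussed in this thread lives.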
c
Yeah, that is gonna be a huge pain, if it's possible at all. I'd ask whether whoever is sending you the PDFs has options for different formats. It may take the burden off of you.
b
you really don't want them to give you a PDF; asking for a different format should be your first approach
pdf-parse will fail you since it's designed to work on Node.js, however its approach is somewhat doable
internally it uses pdf.js, which is designed to work in the browser or on Node.js
and libraries like that have a chance of working in SuiteScript
keep in mind that things that do binary operations are barely workable in SuiteScript; expect to need to know how to use require configuration and how to shim in missing globals
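As a small illustration of the "missing globals" point: a browser-oriented PDF library generally wants the document as bytes (for example a `Uint8Array`), while a binary file read in SuiteScript is, as far as I know, handed back as a base64 string. A minimal sketch of a decoder that avoids `atob` and `Buffer`, assuming neither is available in the SuiteScript runtime and assuming typed arrays are. The name `base64ToBytes` is made up for this example.

```javascript
// Decode a base64 string into a Uint8Array without relying on atob or Buffer,
// neither of which is assumed to exist in the SuiteScript runtime.
var B64_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

function base64ToBytes(b64) {
    // Drop padding, line breaks, and anything else outside the base64 alphabet.
    var clean = String(b64).replace(/[^A-Za-z0-9+/]/g, '');
    var out = new Uint8Array(Math.floor(clean.length * 3 / 4));
    var buffer = 0; // holds bits that have not been emitted yet
    var bits = 0;   // number of pending bits in buffer
    var pos = 0;

    for (var i = 0; i < clean.length; i++) {
        // Append 6 bits per character; keep only the low 16 bits to avoid overflow.
        buffer = ((buffer << 6) | B64_CHARS.indexOf(clean.charAt(i))) & 0xffff;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[pos++] = (buffer >> bits) & 0xff;
        }
    }
    return out;
}
```

Something like `base64ToBytes(file.load({ id: pdfFileId }).getContents())` would then produce the bytes to hand to a parser, where `pdfFileId` is a placeholder. Whether a given build of pdf.js actually runs in SuiteScript after that is a separate question.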
s
My honest opinion is that you need an OCR solution. PDFs come in too many varieties: generated from Excel/XML, or scanned. I'd almost always suggest a third-party solution (there are many out there) that does the OCR, with the OCR output then fed to the ERP. How many templates (varieties) you have, and whether they are computer-generated or scanned, will influence which solution you need. The third-party solution can have machine or human validation, depending on budget.
The only exception is probably electronic invoicing. I think that could be covered by SS...?