I’ve got a very big PDF file full of bibliographies that I copied from many different papers. Is it somehow possible to automatically create a BibTeX file from it, or at least to batch-find the DOIs or other identifiers of all the sources inside this PDF?
Of course this PDF doesn’t only contain published sources with DOIs, but extracting even the main portion of the sources would already help a lot.
I originally thought JabRef’s Plain References Parser would be able to do something similar, but it seems to be really unreliable for my use case.
In theory, what you want is to import the PDF together with all the references it contains.
Import into current library > Filetype: PDFcontent. If that does not work, which I fear, you can simply try to drag and drop the file into JabRef, which will use our Grobid feature (Extract information from PDF import). Grobid uses AI technologies that are inherently probabilistic, so at some point you WILL end up with hallucinations. Doing it that way will only give you a first lead, and you will have to cross-check or update the information with correct data from the net.
Have a look at the menu "Update references" too, which will show you how to update references.
JabRef cannot update references in bulk yet, and I am not sure how Grobid handles multiple references. I think it ignores the references in the text and bibliography and only parses the “main” reference of the PDF instead. By main reference, I mean the author, editor, journal, title, etc. of the PDF itself.
I tried it, but this only extracts the first reference it can find. Still, thanks!
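For the "batch-find all DOIs" part of the question, a small script outside JabRef may already get you most of the way. The following is only a minimal sketch, not a JabRef feature: it assumes the text of the PDF has already been extracted to a plain string (for example with a separate PDF-to-text tool), and the names `DOI_PATTERN` and `find_dois` are made up for this illustration. The regular expression follows the commonly cited pattern for modern DOIs and will miss older or unusual ones. References without a DOI are not found at all.

```python
import re

# Heuristic pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a suffix of common DOI characters. This is an assumption
# that covers most, but not all, DOIs in circulation.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_dois(text):
    """Return the unique DOIs found in text, in order of first appearance."""
    seen = set()
    result = []
    for match in DOI_PATTERN.finditer(text):
        # Strip punctuation that usually belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;:)")
        key = doi.lower()  # DOIs are case-insensitive
        if key not in seen:
            seen.add(key)
            result.append(doi)
    return result


if __name__ == "__main__":
    # Made-up sample references, for illustration only.
    sample = (
        "Smith, J. (2020). Title. Journal. https://doi.org/10.1000/xyz123.\n"
        "Doe, A. (2019). Other. doi:10.1234/abc.def-45, pp. 1-10.\n"
        "Smith again 10.1000/XYZ123"
    )
    for doi in find_dois(sample):
        print(doi)
```

The resulting list of DOIs can then be used to look up the entries one by one, and anything retrieved that way should still be cross-checked against the original reference.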