Parsing references from the PDF

koppor · April 8, 2024, 9:32am

JabRef can extract references from the bibliography of a PDF using GROBID. This works OKish, but has one major drawback: the data is sent from JabRef to a service hosted by the JabRef developers. When reviewing papers, one does not want to share the PDFs with third parties.

Therefore, we added offline parsing to JabRef. It currently works well for IEEE papers, but not for others.
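To give a feeling for why IEEE papers are the easy case, here is a minimal sketch of how a numbered, IEEE-style reference list could be split into single references. This is not JabRef's actual parser. The names `split_ieee_references` and `guess_title` are hypothetical, and the sketch assumes that every reference starts with a bracketed number such as "[1]" and that the title is enclosed in double quotes.

```python
import re

# Hypothetical helper, NOT JabRef's implementation.
# Assumption: every reference starts with a bracketed number like "[1]".
MARKER = re.compile(r"\[(\d+)\]\s*")


def split_ieee_references(text):
    """Return a list of (number, reference_text) tuples."""
    # Collapse line breaks and repeated whitespace so that a reference
    # wrapped over several lines becomes one string.
    flat = " ".join(text.split())
    matches = list(MARKER.finditer(flat))
    entries = []
    for i, m in enumerate(matches):
        # A reference runs until the next marker, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(flat)
        entries.append((int(m.group(1)), flat[m.end():end].strip()))
    return entries


def guess_title(reference):
    """Assumption: the title is the first double-quoted part. Returns None if absent."""
    m = re.search(r'["“](.+?),?["”]', reference)
    return m.group(1) if m else None
```

Calling `split_ieee_references` on the plain text of a reference section yields one tuple per bracketed number. Reference styles without such markers (author-year styles, for example) give the sketch nothing to split on, which hints at why each style needs its own handling.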

To check whether it works for you, too, download the latest development version from https://builds.jabref.org/main/.

Ensure that online services are disabled

You need to scroll down in the “Web search” preferences.

Start parser

Have the PDF attached to an entry.
Right-click on the entry in the entry table.
Select the item “Extract references from file (offline)”.

Make sure the menu item says “offline”. If it says “online”, the GROBID services are still enabled.

A popup appears with all parsed references.

First, click “Select all new entries”, then deselect “Download linked online files”. Finally, press “Import entries” to add the entries to your library.

Result

The entries are now imported into the main table.

The original entry has links to the cited entries in the “Other fields” section:

Links

The PDF used for testing is at jabref/src/test/resources/org/jabref/logic/importer/fileformat/tua3i2refpage.pdf on the main branch of the JabRef/jabref repository on GitHub.

Older functionality is discussed in the topic “Import from Reference text”.

Outlook

We are happy to hear feedback on this functionality and whether it works for your use case, too. If it does not, please search for an open-access PDF - or create a new example PDF for us. Then, we can work on parsing your variant of references, similar to pull request 11156.

gittibit · February 7, 2025, 7:46am

Hey there - it seems the GROBID server from JabRef is currently down. Are there any plans to continue this service? This is really what is missing for me to fully embrace JabRef.
