How to read bib data for/from a PDF file

I am entirely new to bibliographies and references. I wanted to try JabRef, but there is not much help info that speaks in 5.2 terms, at least as far as I can tell.
I have a PDF on my hard drive. All I want to do is get JabRef to create an entry in a library with the reference data for the PDF. I need some help with this.

I have a few basic questions as well:

  1. Does the reference data reside in the PDF itself, or does JabRef get it from the web?
  2. Is it possible to have JabRef write the data into the PDF file and save the file for later extraction?
    Many thanks.
    Lou

Hi,

Thanks for trying JabRef! JabRef stores all of its data in the BibTeX file format, which is just a simple plain text file. PDFs are only linked to the entry, not stored inside the library. You can find more info in our help page.
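To make the "plain text file with a link" idea concrete, here is a minimal sketch that renders one entry as BibTeX text. The key `Doe2021`, the field values, and the path `papers/Doe2021.pdf` are made up for illustration, and the `description:path:type` layout of the `file` field reflects my understanding of how JabRef records links, so treat it as an assumption rather than a specification.

```python
def bibtex_entry(key, fields):
    """Render one @article entry as plain BibTeX text."""
    lines = [f"@article{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical entry: the PDF itself is not embedded, only its path is recorded.
entry = bibtex_entry("Doe2021", {
    "author": "Doe, Jane",
    "title": "An Example Paper",
    "year": "2021",
    "file": ":papers/Doe2021.pdf:PDF",  # assumed description:path:type layout
})
print(entry)
```

The library file holds only text like this; the PDF stays wherever it is on disk.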

  1. Both are possible. You can try to import a PDF; JabRef will then search it for XMP metadata. If none is found, it runs a heuristic search on the first page of the PDF for some metadata. However, that is usually not very accurate.
    It is usually more accurate to create a new entry from an identifier such as a DOI, arXiv ID or ISBN. JabRef will then fetch the data for that identifier from the web.

  2. Yes! Tools → Write XMP metadata to PDF.
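If you want to check for yourself whether a given PDF already carries XMP metadata, a rough byte-level scan can tell you. This is only a sketch and not how JabRef reads the data: it assumes the XMP packet is stored uncompressed and unencrypted, so a negative result does not prove the metadata is absent. The function names are made up for this example.

```python
from pathlib import Path

# Byte sequences that normally open an XMP packet.
XMP_MARKERS = (b"<x:xmpmeta", b"<?xpacket begin")


def has_xmp_packet(data: bytes) -> bool:
    """Return True if the raw bytes contain something that looks like XMP."""
    return any(marker in data for marker in XMP_MARKERS)


def pdf_has_xmp(path) -> bool:
    """Read a file from disk and scan it for an XMP packet marker."""
    return has_xmp_packet(Path(path).read_bytes())


# In-memory demonstration, so no real PDF is needed to try it out.
with_xmp = b"%PDF-1.7 ... <x:xmpmeta xmlns:x='adobe:ns:meta/'> ... </x:xmpmeta>"
without_xmp = b"%PDF-1.4 ... no metadata stream here ..."
print(has_xmp_packet(with_xmp), has_xmp_packet(without_xmp))
```

Running `pdf_has_xmp` on a file before and after using Tools → Write XMP metadata to PDF is one way to see whether the file was actually changed.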

Thanks for choosing JabRef!
Have fun!
Best regards

Interesting response. Thank you.
Please let me know if the following is wrong altogether.
I imported my PDF into Docear. It came up with BibTeX data, I assume from the web. JabRef creates an entry but doesn’t populate it when I try to import the PDF, so I must be doing something wrong. How do I get the DOI, arXiv or ISBN IDs?

The import produced an interesting problem: the PDF’s file name was made blank or perhaps became a dash. Is that normal, and if so, what does it mean?

The PDF is on my hard drive in a folder. I just dragged it to the list of entries. With the disappearing filename, the term ‘link’ is peculiar. I wonder if the PDF is moved or copied?

Lou.

Hi,

There have recently been some bug fixes regarding the import, so I suggest you try the latest development version.

“I imported my PDF into Docear. It came up with Bibtex data, I assume from the web.”
If you still have the BibTeX data, you can simply copy and paste it into JabRef.

Please also see the docs: https://docs.jabref.org/collect/findunlinkedfiles

“The PDF is on my hard drive in a folder. I just dragged it to the list of entries. With the disappearing filename, the term ‘link’ is peculiar. I wonder if the PDF is moved or copied?”

This depends on the key you pressed.
https://docs.jabref.org/advanced/entryeditor#drag-and-drop-behavior-settings

I really suggest you read the “Getting started” section in the help page.

Thanks for the new version.
Copying bib info is not the issue. Docear finds the bib info automatically. The point was to get JabRef to find the bib info on its own because only it can write the info into the PDF file.
The section you pointed me to contains this:

For one PDF file

The simplest way to create a new entry based on a PDF file is to drag & drop the file onto the table of entries (between two existing entries). JabRef will then analyze the PDF and create a new entry.

Note: there is no mention of drag-and-drop keys, and this is the only software I have ever seen with this ‘feature’. And why does the Move or Copy note change depending on where I position the dragged PDF icon on the screen? This forces a sudden critical decision on a new user.
The help info on import seems to require two existing entries in order for the import to work. But how can I tell if it worked at all? The two entries I created are blank, and the import created a new blank one. If there were more blank ones, which is mine? And where did the file go if I didn’t specify any directories? For a quick run-up, it’s a lot of unknowns.

The Getting Started section states what to do, but doesn’t explain very much. For example: a library contains references. Does it also contain files? What is in the database that is often referred to in other sections?
The incomplete sections leave little confidence that the help files are current.
Sending me to different sections is fine, but don’t assume that I haven’t read them. In fact, I have spent hours doing so. I just may not understand them.

Regards,
Lou.

That doesn’t necessarily mean JabRef will manage to find the same info. If all fields are blank, it is likely that JabRef, for one reason or another, couldn’t read the data it needed from the PDF. Using JabRef 5.3–2021-04-19, plain drag-and-drop mostly works for the PDF files I find through Google Scholar, but there are ongoing efforts to make the PDF import better.
If you are only getting blank entries it might be better to get the DOI/ISBN from some other source than JabRef. I have never used Docear so I don’t know if you can export what it finds to JabRef somehow.
If there are only a few entries the web-search feature might work for you.
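Once you have a DOI (it is often printed on the first page of the paper), the bibliographic data can also be requested directly from the DOI resolver via content negotiation. The sketch below is not what JabRef does internally as far as I know; it only shows the general idea. The function names are made up, and `10.1000/182` is just a placeholder DOI. `fetch_bibtex` needs network access and will only return BibTeX for DOIs whose registration agency supports this format.

```python
import urllib.request


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Prepare a request that asks doi.org for BibTeX instead of a web page."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
request = build_bibtex_request("10.1000/182")
print(request.full_url, request.get_header("Accept"))
```

The text returned by `fetch_bibtex` can be pasted into JabRef like any other BibTeX entry.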

Most likely it means that JabRef is trying to autogenerate a filename based on the empty entry. I believe the default auto-generated filename is the BibTeX key and a shortened version of the title separated by a dash. If there’s no BibTeX key and no title → -.pdf. This is the ‘Linked file name conventions’ in Preferences → Linked Files.
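To illustrate why an empty entry ends up as -.pdf, here is a tiny sketch of a "key, dash, title" naming rule. It is a simplified stand-in, not JabRef's actual code, and the function name and sample values are hypothetical.

```python
def linked_file_name(citation_key: str, title: str, extension: str = "pdf") -> str:
    """Join key and title with a dash, then trim the leftover whitespace."""
    stem = f"{citation_key} - {title}".strip()
    return f"{stem}.{extension}"


filled = linked_file_name("Doe2021", "An Example Paper")
empty = linked_file_name("", "")
print(filled)
print(empty)
```

With both fields empty, only the separator survives, which matches the file name you are seeing.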

A regular drag-n-drop only links in my case, i.e. the file name does not change (JabRef 5.3 Mac OS X). Does this happen for all .pdf files? If it happens for only one file or a few, would you mind posting the names/paths of those files (to rule out a bug)?

  1. I have changed to v5.3. Things are much better. Glad to see it.

  2. The change in file name results in exactly “-.pdf”. Not something you want to see happen in my all-PDF folder (which is not part of Docear or JabRef).

  3. My real objective is to create a symlink to a PDF that has been altered to hold its own bib data. I want to see if Docear can read that bib data through the symlink. This is a suggested operating method. (See Sustainable Research… Part II by Saul Albert, later on the web page.) However, PDFs do not usually contain their bib data, and only JabRef can write it into the PDF. I’m still trying that, but I don’t know where JabRef puts the altered PDF. I imported the PDF by drag and drop in Copy mode. The original PDF still has no bib data, so JabRef didn’t write it there. This version of JabRef is portable and resides on my flash drive.

  4. Where is the ‘current directory’ referred to in the page entitled ‘Entry Editor’ section ‘Drag and Drop Behaviour settings’?
    Regards,
    Lou.
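On the symlink idea in item 3: at the file-system level, reading through a symlink returns the bytes of the target file, so any metadata written into the target should be visible through the link. The sketch below demonstrates only that; whether Docear follows symlinks is not something it can show. The file names are placeholders, the "PDF" is a few fake bytes, and on Windows creating a symlink may require extra privileges.

```python
import os
import tempfile
from pathlib import Path


def read_through_symlink(target: Path, link: Path) -> bytes:
    """Create a symlink pointing at target and read the data via the link."""
    os.symlink(target, link)
    return link.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a PDF that already has XMP metadata written into it.
    original = Path(tmp) / "paper.pdf"
    original.write_bytes(b"%PDF-1.7 <x:xmpmeta>example</x:xmpmeta>")

    link = Path(tmp) / "paper-link.pdf"
    data = read_through_symlink(original, link)

    same_bytes = data == original.read_bytes()
    is_link = link.is_symlink()

print(same_bytes, is_link)
```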

This should only happen if JabRef fails to read information from the PDF (the entry in the main table is empty after import).

If you just want to try the “workflow”, I’d suggest using the web search feature in the lower-left corner and downloading a couple of entries to experiment with.

Regarding location, see your other thread.

No thanks on the workflow. The goal is stated in item 3 of the last post.

Ahhh. I found that Docear actually does the ‘write XMP data’ function, so I won’t need JabRef for now.
Thanks for your help. It was a hoot? :smiley: