Does JabRef have any file size or record limitations?

astronomertom · June 30, 2023, 5:33pm

I have a BibTeX file originally generated in BibDesk that I’ve converted for JabRef, changing some tags, etc. The file has almost 12,000 bibliography records and is about 20MB in size.

However, the resulting file always fails to load into JabRef, failing at the same line of the file with an ‘unexpected EOF’ error. Cleaning up non-printing and non-ASCII characters gives the same result.

Is there some type of limit on file size, number of records, size of fields (I have some very long ‘annote’ entries)?

Thanks,
Tom

ThiloteE · June 30, 2023, 7:55pm

Not that I know of, but currently huge libraries will (unfortunately) run into severe performance problems eventually, as some users have reported. I believe 12,000 records should still be manageable, though.

With regard to the EOF error, it is more likely that there is a faulty line in your library than that you have reached a maximum file size. It would be immensely helpful if you could post the full entry containing the line that triggers the error. You can also try removing that line completely and see if you can import the rest.

If there are multiple errors in the file, half-splitting (repeatedly cutting the file in half and testing each half) usually yields results very fast.
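For what it's worth, the half-splitting idea could be scripted roughly like this. It is only a sketch under some assumptions: every entry starts on a line beginning with `@`, and the `braces_balanced` check is a stand-in for "does this load?" (an unclosed brace is a common cause of an unexpected EOF, but it is not necessarily what is wrong in your file). The function names are made up for illustration; you could pass any other check in its place.

```python
from typing import Callable, List, Optional


def split_entries(text: str) -> List[str]:
    """Cut BibTeX text into chunks, starting a new chunk at each line that begins with '@'.

    Assumption: no field value contains a line that itself starts with '@'.
    """
    entries: List[str] = []
    current: List[str] = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith("@") and current:
            entries.append("".join(current))
            current = []
        current.append(line)
    if current:
        entries.append("".join(current))
    return entries


def braces_balanced(chunk: str) -> bool:
    """Stand-in check: True if every '{' has a matching '}' in order."""
    depth = 0
    for ch in chunk:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def first_failing_entry(entries: List[str],
                        loads_ok: Callable[[str], bool]) -> Optional[int]:
    """Binary search for the first entry whose inclusion makes the prefix fail.

    Assumes the empty prefix passes and that a prefix keeps failing once it
    contains the bad entry. Returns the entry index, or None if all is fine.
    """
    if loads_ok("".join(entries)):
        return None
    lo, hi = 0, len(entries)  # entries[:lo] passes, entries[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if loads_ok("".join(entries[:mid])):
            lo = mid
        else:
            hi = mid
    return hi - 1
```

With about 12,000 entries this needs only around 14 checks to locate the first bad entry. After fixing it, run it again to find the next one.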

Also, as general advice, please make sure to keep a backup of your library before you use the cleanup functions and advanced features of JabRef. Depending on what exactly you did, some of this stuff is hard to reverse automatically and may require manual intervention. You may run into pitfalls and only realize it much later.

Siedlerchr · July 1, 2023, 1:45pm

Have you checked the encoding? Maybe something is wrong with the encoding; JabRef uses UTF-8 where possible. We recently fixed some issues regarding this.

astronomertom · July 1, 2023, 6:04pm

I’m using the Python bibtexparser package to map tags from BibDesk to JabRef. I think I’ve successfully identified multi-byte Unicode characters, converted them to LaTeX commands, and written the output as UTF-8. Grepping the resulting BibTeX file for ‘unprintable’ characters seems to come back clean.
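A rough Python equivalent of that kind of check, in case it is useful to anyone reading along. It is a sketch only, and the function names are hypothetical: one helper reports the byte offset of the first invalid UTF-8 sequence, the other lists non-ASCII and control characters with their line and column. Tabs and carriage returns are deliberately not flagged, which is an assumption about what counts as harmless.

```python
import unicodedata
from typing import List, Optional, Tuple


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def find_suspect_chars(text: str) -> List[Tuple[int, int, str, str]]:
    """List (line, column, char, unicode_name) for non-ASCII or control characters.

    Lines are split on '\n' only, so other control characters are reported
    rather than silently treated as line breaks.
    """
    allowed_controls = {"\t", "\r"}  # assumption: these are harmless here
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            is_control = (unicodedata.category(ch) == "Cc"
                          and ch not in allowed_controls)
            if ord(ch) > 127 or is_control:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, ch, name))
    return hits
```

To use it, read the file in binary mode first (`open(path, "rb").read()`, where `path` is your .bib file), run the UTF-8 check on the bytes, and only then decode and scan the text.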

Viewing the file at the line reported for the EOF shows that it is the actual end of the file, but JabRef reports about 400 fewer entries than were in the original input files and processed by bibtexparser.
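One way to see which entries go missing, independent of the order they appear in, would be to pull the citation keys out of two versions of the file and compare them as sets. The sketch below assumes entries look like `@type{key,` with the key on the same line as the `@`; the regular expression and function names are illustrative, not anything from JabRef or bibtexparser. It also lists duplicated keys, since those are one possible (unconfirmed) reason for a lower entry count.

```python
import re
from collections import Counter
from typing import Dict, List

# Assumption: each entry starts with "@type{key," (or "@type(key,") on one line.
ENTRY_START = re.compile(r"^\s*@(\w+)\s*[{(]\s*([^,\s]+)\s*,", re.MULTILINE)
SKIP_TYPES = {"comment", "string", "preamble"}  # not bibliography records


def citation_keys(text: str) -> List[str]:
    """Return the citation keys in file order, skipping non-record blocks."""
    return [key for kind, key in ENTRY_START.findall(text)
            if kind.lower() not in SKIP_TYPES]


def compare_keys(before: str, after: str) -> Dict[str, List[str]]:
    """Compare two BibTeX texts by citation key, ignoring entry order."""
    keys_before = Counter(citation_keys(before))
    keys_after = Counter(citation_keys(after))
    return {
        "missing": sorted(set(keys_before) - set(keys_after)),
        "added": sorted(set(keys_after) - set(keys_before)),
        "duplicated_before": sorted(k for k, n in keys_before.items() if n > 1),
    }
```

Here `before` could be the text of the original export and `after` the text of whatever the later step writes out.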

Thanks for the info. I’ll take a crack at splitting the file to see if that provides more insight.

Tom

ThiloteE · July 17, 2023, 6:17pm

Thinking about this a little more, I can recommend Meld, with which you can compare both files side by side. 400 entries is a lot, and while JabRef already has the tools to compare and fix this, it may be more convenient to use a program that specializes in comparing files.

astronomertom · July 18, 2023, 11:12pm

I had tried Meld, but it is only effective if my preprocessing does not alter the order of the entries.

However, a new version of bibtexparser recently came out; it handled the syntax problems more robustly and provided sufficient info for me to track down the problem.

My next big challenge is to restore the links from the BibTeX record to the PDF files in my archive.

Thanks for your assistance.
Tom

Siedlerchr · July 19, 2023, 6:03pm

Hi,

for restoring the file links, you can try JabRef’s ‘Find unlinked files’, etc.
