I am maintaining a rather large .bib file for our group at work.
We use bibtex (due to old habits, and also because all journals’ LaTeX styles I have seen use bibtex, typically with natbib).
For this reason, I have kept the encoding at ‘windows-1252’, to get a warning if one pastes, for example, author names with non-ASCII characters.
However, I see that JabRef defaults to UTF-8, even for bibtex databases.
What is the reasoning behind this, given that bibtex is an 8-bit-only program?
Or, to put it differently, what am I missing when using an 8-bit encoding?
The only thing I am aware of is warnings about bad characters when I paste an abstract, typically caused by “wrong” dashes or hyphens. But this could be better addressed by auto-correcting these (a feature I would really appreciate, by the way)…
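To illustrate the kind of auto-correction meant here, a minimal Python sketch follows. It is not JabRef code; the function name `normalize_dashes` and the choice of replacements are assumptions made for this example, and a real tool would probably cover more characters.

```python
# Hypothetical mapping from "wrong" Unicode dashes to plain-ASCII
# equivalents that an 8-bit BibTeX workflow accepts.
DASH_REPLACEMENTS = {
    "\u2010": "-",    # hyphen
    "\u2011": "-",    # non-breaking hyphen
    "\u2212": "-",    # minus sign
    "\u2013": "--",   # en dash, written as two hyphens in LaTeX
    "\u2014": "---",  # em dash, written as three hyphens in LaTeX
}

# str.maketrans accepts single-character keys with string values.
_DASH_TABLE = str.maketrans(DASH_REPLACEMENTS)


def normalize_dashes(text: str) -> str:
    """Replace the Unicode dashes listed above; leave everything else as is."""
    return text.translate(_DASH_TABLE)
```

Running a pasted abstract through such a function before saving would avoid the encoding warning for these particular characters.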
We want to give users the choice between LaTeX encoding and UTF-8. Our analysis of users' files shows that, for a few years now, Unicode encoding has been used more than LaTeX encoding. Moreover, there is no worldwide dominant encoding. For instance, windows-1252 is Western European and cannot be used in Greece. Therefore, we default to UTF-8.
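The point about windows-1252 can be checked with a few lines of Python. This is only an illustration; `can_encode` is a helper name made up for this sketch, and `cp1252` is the name of Python's codec for windows-1252.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Sample author names (chosen for this example only).
WESTERN_NAME = "M\u00fcller"  # "Müller"
GREEK_NAME = "\u03a0\u03b1\u03c0\u03b1\u03b4\u03cc\u03c0\u03bf\u03c5\u03bb\u03bf\u03c2"  # "Παπαδόπουλος"
```

With these samples, `can_encode(WESTERN_NAME, "cp1252")` is true, `can_encode(GREEK_NAME, "cp1252")` is false, and both names can be encoded as `"utf-8"`.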
Replacing all Unicode characters with LaTeX equivalents is possible with our cleanup functionality: Cleanup entries | JabRef. In the screenshot, see the last entry under “field formatters”, “All-text-fields”. You need to choose “Unicode to LaTeX” for it.
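For readers unfamiliar with what such a conversion does, here is a rough Python sketch of the idea. It is not JabRef's formatter: the tables cover only a handful of characters, the function name `unicode_to_latex` is made up for this example, and the exact LaTeX form JabRef writes may differ.

```python
import unicodedata

# Combining marks mapped to LaTeX accent commands (small illustrative subset).
ACCENTS = {
    "\u0300": "`",  # grave
    "\u0301": "'",  # acute
    "\u0302": "^",  # circumflex
    "\u0303": "~",  # tilde
    "\u0308": '"',  # diaeresis
    "\u0327": "c",  # cedilla
}

# Characters without a canonical decomposition need explicit entries.
SPECIAL = {
    "\u00df": r"{\ss}",  # sharp s
    "\u00f8": r"{\o}",   # o with stroke
    "\u00d8": r"{\O}",   # O with stroke
    "\u00e6": r"{\ae}",  # ae ligature
}


def unicode_to_latex(text: str) -> str:
    """Convert a few accented characters to LaTeX; leave unknown ones as is."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in SPECIAL:
            out.append(SPECIAL[ch])
        else:
            # Split into base letter plus combining mark, e.g. u + diaeresis.
            parts = unicodedata.normalize("NFD", ch)
            if len(parts) == 2 and ord(parts[0]) < 128 and parts[1] in ACCENTS:
                out.append("{\\" + ACCENTS[parts[1]] + "{" + parts[0] + "}}")
            else:
                out.append(ch)  # not covered by this sketch
    return "".join(out)
```

For example, `unicode_to_latex("M\u00fcller")` yields `M{\"{u}}ller`, which stays pure ASCII and is therefore safe for an 8-bit bibtex.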
This conversion can also be applied automatically, which has been possible for about 10 years:
The documentation is at Save actions | JabRef. We very much welcome contributions to that page to make it more discoverable and more user-centered. Currently, it is more of a draft.
Thanks a lot for the quick and detailed answer.
I really have to have a look at the field formatters - I should have noticed them a long time ago…
Small problem: in both my .bib files, “Enable field formatters” is currently disabled. When I enable it and click “Apply”, JabRef shows that the file has changed - but when I save it, there is no change in the file, and the properties show that “Enable field formatters” is disabled again.
Am I missing something, or is this a bug? (I am on JabRef 5.15 on Windows)
We did not advertise them properly. You know, first the documentation, right? And then other feature requests come in and it gets forgotten…
I just realized that JabRef 5.15 is more than a year old…
I tried it here with a minimal .bib file - and it worked (with both 5.15 and the latest dev version). Would you mind sharing your .bib file with me so that I can investigate what’s happening?
It is still the latest version, as far as I can see. I have the version that winget calls 5.15.60000; its ‘About’ window shows:
JabRef 5.15--2024-07-10--1eb3493
Windows 10 10.0 amd64
Java 21.0.2
JavaFX 22.0.1+7
Winget also shows version 6.0-alpha.60000, but that sounds scary…
I don’t think the .bib files are to blame. I tried with a fresh .bib file with a single entry taken from Wikipedia and it still does not work…
Maybe I have some esoteric combination of Preferences, or something like that?
When I save it like this, the change is not saved and the formatters remain deactivated.
When I delete one entry, it is saved correctly.
After that, I can add new entries and it works.
So there seems to be something weird about the initial selection…
Side note: is “Replace Unicode ligatures” (which is not mentioned in the documentation, but is included in the default list) covered by “Unicode to LaTeX”?
Thank you for the feedback. Investigation on our side takes time. I have put it on my TODO list and will try to come back to it this month.
I don’t know about the versions distributed by WinGet. We offer “instant” builds at https://builds.jabref.org/main/. While I always use that version to work with my .bib file, our usual recommendation is: use the stable version.
We have a high number of pull requests - and some need feedback from power users.
Please open an issue on GitHub so that we can track it in our normal issue handling. I think your steps are enough to post in the issue. It would be good to also have the error as a screenshot. That way, it is approachable for newcomers, and we can label it as “good second issue”.
The difference between “Replace Unicode ligatures” and “Unicode to LaTeX” should be a separate question. Someone needs to go into JabRef’s code and then update the documentation accordingly.