Best practices regarding one big .bib file for all papers

Hello everyone,

The "Getting started" page of the JabRef documentation links to a YouTube video by “JoshTheEngineer”, who says that he keeps a common folder for his different papers, with multiple .bib files split by topic.

I would, however, like to have a single big common .bib file that is used from different directories, each holding multiple .tex files on different topics. Each paper would then cite a selected subset of this central large .bib file.

One downside is that entries needed by different papers may share the same bibkey. Suppose a paper written in 2022 uses this central .bib file and cites the bibkey abc2021. Later, in 2023, I write a different paper that also needs the bibkey abc2021, but for an entirely different cited paper. On entering this entry, JabRef would warn me that there are duplicates that I need to resolve. If I then change the original bibkey abc2021 to, say, abc2021x, I would no longer be able to compile my earlier .tex file, because it would refer to a nonexistent bibkey.

Are there any best practices surrounding how to avoid this situation?

My current solution is to generate a custom bibkey for each entry in this single big .bib file, using the following pattern:

[authorsAlpha:lower][shortyear][title:abbr][journal:abbr][volume][number][firstpage][lastpage]

(The length of this bibkey is not a problem, because the autocompletion engine in my LaTeX editor, vim, makes it easy to cite such entries by typing the first few characters of the authors' names or the title.)

One would imagine that the probability of two different papers producing the same bibkey under this pattern is vanishingly small.

Am I thinking in the right direction, or is this overkill? Is there a better way to avoid such clashes?

Thanks.

Maybe @AEgit or @mlep can share some ideas on how they handle it

For duplicate keys you can check the preferences; e.g. the standard would be to add a letter (e.g. abc2021a) to the new entry.
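
As a rough illustration of the letter-suffix idea (a sketch only, not JabRef's actual implementation; `make_unique` is a hypothetical helper name), the new entry gets the suffix while the older entry keeps its key, so older .tex files keep compiling:

```python
import string


def make_unique(key, existing_keys):
    """Return key unchanged if it is unused, else key plus the first free letter a..z."""
    if key not in existing_keys:
        return key
    for letter in string.ascii_lowercase:
        candidate = key + letter
        if candidate not in existing_keys:
            return candidate
    raise ValueError(f"no free single-letter suffix for {key!r}")


# The 2022 paper already cites abc2021, so that key stays as it is.
existing = {"abc2021"}
new_key = make_unique("abc2021", existing)  # key for the 2023 entry
existing.add(new_key)
```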

For sure, it is more than convenient to have no duplicate keys.
For your older .tex files, you could keep using your old .bib files as they are, and start using a single big .bib file for your new .tex files. In this new file, you will first have to clean up the duplicated entries (JabRef will be very helpful), and then alter all duplicated keys (or homogenize the key pattern). Whatever pattern you choose, it has to ensure that no duplicated keys are created. Your long custom key pattern is a good start. Allowing a suffix “a, b, c” to be added, as suggested by @Siedlerchr, will help make it work.
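
Outside JabRef, a quick check for repeated keys in a merged file could look like the sketch below. It assumes plain BibTeX syntax and uses a simple regular expression rather than a full parser; `find_duplicate_keys` and the sample entries are made up for illustration.

```python
import re
from collections import Counter

# Matches the start of an entry such as "@article{abc2021," and captures type and key.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")
# These blocks are not bibliography entries and carry no citation key.
NON_ENTRY_TYPES = {"comment", "string", "preamble"}


def find_duplicate_keys(bib_text):
    """Return {key: count} for every citation key that occurs more than once."""
    keys = [
        m.group(2)
        for m in ENTRY_START.finditer(bib_text)
        if m.group(1).lower() not in NON_ENTRY_TYPES
    ]
    return {key: n for key, n in Counter(keys).items() if n > 1}


sample = """
@article{abc2021, title = {First paper}}
@book{abc2021, title = {A different work}}
@article{xyz2020, title = {Unrelated}}
"""
print(find_duplicate_keys(sample))  # {'abc2021': 2}
```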

I think @Siedlerchr and @mlep have already mentioned everything important. Generally I would always just keep a single JabRef database with unique citation keys, which avoids the issues you mentioned. As for the actual keys: if the longish bibkey is no problem for you, your solution sounds like a good idea. I use a slightly different approach, which in rare cases can result in non-unique keys (though fortunately JabRef tells me about this, so I can easily rectify the situation if it occurs). I use the default [auth][year] key pattern and then add _someTitletext if it is an author that appears multiple times in my database, where someTitletext represents a word from the title (usually one that characterizes the whole paper).

This is my master JabRef database, which is continuously being updated. When I write a paper, I use this database, but once the paper is written/accepted I create a copy of this database and store it together with the draft. This way I have a “frozen” version of the database: should I make any major changes to my master database in the future, they would not affect the respective draft, since the draft can easily be linked to the frozen version. The frozen version is never to be touched again and is stored together with the draft.
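
The freezing step can also be scripted. The following is a minimal sketch, assuming the master database is a single .bib file; `freeze_bib`, the file naming scheme, and the paths in the usage line are hypothetical choices, not anything JabRef prescribes.

```python
import shutil
from datetime import date
from pathlib import Path


def freeze_bib(master_bib, draft_dir, stamp=None):
    """Copy the master .bib file next to a draft under a dated name and return the new path."""
    master_bib = Path(master_bib)
    draft_dir = Path(draft_dir)
    stamp = stamp or date.today().isoformat()
    frozen = draft_dir / f"{master_bib.stem}-frozen-{stamp}{master_bib.suffix}"
    # Never overwrite an existing frozen copy: it is meant to stay untouched.
    if frozen.exists():
        raise FileExistsError(f"{frozen} already exists; refusing to overwrite")
    draft_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(master_bib, frozen)  # copy2 also preserves file timestamps
    return frozen
```

For example, `freeze_bib("master.bib", "papers/2023-draft")` would create a dated copy inside that draft's folder, and the draft's .tex file can then point at the copy instead of the master file.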

Thank you for the different suggestions.