Smart log while merging duplicated entries or better ways of citing two bibtex files?

(Qi) #1

I would like to merge two BibTeX files that overlap heavily and contain many duplicated entries. The two database files belong to different chapters of my dissertation; what is the best way to merge them? All chapter content is finished, so I just need to make sure that the References section for the whole dissertation does not end up with duplicate entries.

In my mind, I have the following workarounds/approaches:

  1. Merge the two files in JabRef, always keeping the BibTeX key from file A. After merging, JabRef would give me a log file showing where BibTeX keys were changed, so that I can update my chapter citations accordingly. For this approach, I am not sure whether I can easily find such a log file to track the changes in JabRef.

  2. Use the two files separately for the two chapters, or cite both BibTeX files for the whole dissertation, so that I don’t need to change any citations in my writing. In the References section at the end of my dissertation, all duplicated entries should then be merged automatically, that is, there should be no duplicated information in the References section. Is this possible in general? I know this approach might not be related to JabRef’s functions, but I would appreciate a hint from anyone who knows BibTeX better than I do.



For BibTeX, two entries with different keys are completely different entries (even if their contents are identical). So your 2nd approach would involve looking through the PDF file yourself for the duplicated entries. The package showkeys may help you identify the entry keys.

My point of view: since you have many duplicates, I would go for approach 1, i.e. merge the files to get a clean BibTeX file first. Then compile, look at the undefined citations reported in the log file, and, finally, update the keys in the TeX files.
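
The "look at the log file" step can be partly automated. Below is a minimal sketch in Python that collects the keys LaTeX reports as undefined. It assumes the usual warning wording ("Citation `key' on page ... undefined"), which can differ between citation packages; the function name `undefined_citations` and the sample log text are made up for illustration. TeX also wraps long log lines, so very long keys may be split and missed.

```python
import re

# Assumed warning wording, e.g.
#   LaTeX Warning: Citation `key' on page 3 undefined on input line 42.
#   Package natbib Warning: Citation `key' on page 7 undefined on input line 90.
# The opening quote may be a backtick or a straight quote.
CITATION_WARNING = re.compile(r"Citation [`']([^']+)' on page")


def undefined_citations(log_text):
    """Return the sorted, de-duplicated keys reported as undefined."""
    return sorted(set(CITATION_WARNING.findall(log_text)))


# Hypothetical log excerpt, only for demonstration.
sample_log = (
    "LaTeX Warning: Citation `smith2010' on page 3 undefined on input line 42.\n"
    "Package natbib Warning: Citation `lee2015' on page 7 undefined on input line 90.\n"
    "LaTeX Warning: Citation `smith2010' on page 9 undefined on input line 120.\n"
)

for key in undefined_citations(sample_log):
    print(key)
```

In practice you would read the real `.log` file produced by your last compile and pass its contents to `undefined_citations`; the resulting list is the set of keys that still need updating in the TeX files.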

(Christoph) #3

JabRef actually has a duplicate checker and an entry merger, available in the tool menu or when you select two entries.
It’s the same functionality that is used to update an entry with information from, e.g., a DOI.

So you could simply import the second library into your first one and then run the duplicate checker (it checks a lot of fields, not just the key).
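
To illustrate what "checking fields, not just the key" can mean, here is a rough Python sketch of a content-based comparison. This is not JabRef's actual algorithm; it simply assumes that a DOI identifies a work and otherwise falls back to a normalized title plus the year. The function name `fingerprint` and the sample entries are hypothetical.

```python
import re


def fingerprint(entry):
    """Build a comparison key from an entry given as a dict of fields.

    Assumption: entries with the same DOI are the same work; without a DOI,
    compare the lowercased title (braces and punctuation removed) and year.
    """
    doi = entry.get("doi", "").strip().lower()
    if doi:
        return ("doi", doi)
    title = entry.get("title", "").lower().replace("{", "").replace("}", "")
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    return ("title", title, entry.get("year", "").strip())


# Hypothetical entries: a and b differ in key and capitalization only.
a = {"key": "Smith2010", "title": "{Deep} Learning: A Review", "year": "2010"}
b = {"key": "smith_dl", "title": "Deep learning: a review", "year": "2010"}
c = {"key": "Lee2015", "title": "Another Paper", "year": "2015"}

print(fingerprint(a) == fingerprint(b))  # same work, different keys
print(fingerprint(a) == fingerprint(c))  # different works
```

One limitation of this sketch: if only one of two copies carries a DOI, they will not be recognized as duplicates, so a real checker needs to compare more fields.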

However, I would suggest keeping a backup in case something goes wrong after merging.

(Qi) #4

I am thinking of writing a bash script (or something similar) to automate this process:

  1. Go through the entries of BibTeX file A, take the BibTeX info of each entry, and check whether there is a duplicate of it in BibTeX file B.

  2. If there is a duplicate, take the bib-key of the entry in A as string c, and search tex file F to replace c with the bib-key d of the corresponding entry in B. If there is no duplicate, add the entry to B and generate a new bib-key d for it, then likewise replace c with d in tex file F.

  3. Loop over all entries in A until all replacements in tex file F are done.

For steps 1 and 2, is there any way to have a script ask JabRef to check for duplicates, add a new entry, and spit out the correct strings for me? I feel this could serve many people’s needs, and could even be developed into a useful feature for JabRef :)
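
The three steps above could be sketched as a standalone Python script that does not call JabRef at all. Everything here is an assumption for illustration: the tiny parser only handles entries written with one `name = {value},` field per line, the duplicate test (DOI, else normalized title plus year) is a stand-in for a real duplicate checker, and all function names are made up.

```python
import re

ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD = re.compile(r"(\w+)\s*=\s*[{\"](.*?)[}\"]\s*,?\s*$", re.MULTILINE)
# \cite, \citep, \textcite, ... with optional [..] arguments.
CITE = re.compile(r"(\\[A-Za-z]*cite[A-Za-z]*\*?(?:\[[^\]]*\])*\{)([^}]*)(\})")


def parse_bib(text):
    """Minimal parser; assumes one 'name = {value},' field per line."""
    starts = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        fields = {k.lower(): v for k, v in FIELD.findall(text[m.end():end])}
        fields["citekey"] = m.group(2)
        fields["raw"] = text[m.start():end].strip()
        entries.append(fields)
    return entries


def fingerprint(entry):
    """Assumed duplicate test: DOI if present, else normalized title + year."""
    doi = entry.get("doi", "").strip().lower()
    if doi:
        return ("doi", doi)
    title = entry.get("title", "").lower().replace("{", "").replace("}", "")
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    return ("title", title, entry.get("year", "").strip())


def build_key_map(entries_a, entries_b):
    """Steps 1-3: map every key in A to its key in B; collect entries to add."""
    known = {fingerprint(e): e["citekey"] for e in entries_b}
    used = {e["citekey"] for e in entries_b}
    key_map, to_append = {}, []
    for e in entries_a:
        fp = fingerprint(e)
        if fp not in known:
            # No duplicate: keep A's key unless it clashes with a key in B.
            new_key, n = e["citekey"], 1
            while new_key in used:
                n += 1
                new_key = f"{e['citekey']}_{n}"
            used.add(new_key)
            known[fp] = new_key
            header = lambda m, k=new_key: f"@{m.group(1)}{{{k},"
            to_append.append(ENTRY.sub(header, e["raw"], count=1))
        key_map[e["citekey"]] = known[fp]
    return key_map, to_append


def rewrite_citations(tex, key_map):
    """Replace old keys inside cite commands, dropping repeated keys."""
    def fix(m):
        keys = [key_map.get(k.strip(), k.strip()) for k in m.group(2).split(",")]
        return m.group(1) + ",".join(dict.fromkeys(keys)) + m.group(3)
    return CITE.sub(fix, tex)


# Hypothetical sample data.
bib_a = "@article{Smith2010,\n  title = {Deep Learning: A Review},\n  year = {2010},\n}\n" \
        "@book{Lee2015,\n  title = {Another Book},\n  year = {2015},\n}\n"
bib_b = "@article{smith_dl,\n  title = {Deep learning: a review},\n  year = {2010},\n}\n"
tex_f = r"See \cite{Smith2010, Lee2015} and \citep[p.~3]{Smith2010}."

key_map, to_append = build_key_map(parse_bib(bib_a), parse_bib(bib_b))
print(key_map)                           # the "log" of old key -> new key
print(rewrite_citations(tex_f, key_map))
```

The printed `key_map` doubles as the change log asked for in the first post, and `to_append` holds the entries from A that would have to be appended to B. Run this on copies of the files, since the parser is deliberately simplistic.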