Disambiguating author names

douglasrizzo · December 29, 2018, 7:01am

Hi. I came here because I have been looking for a way to disambiguate author names in my database. For example, suppose I have the following names in my database:

John Smith
J. Smith
John R. Smith
J. Roosevelt Smith

I should be able to select, from some kind of list, which names belong to the same person, and my disambiguation tool should then replace all occurrences of those names in my bib file, keeping only the most complete version of the name (e.g. John Roosevelt Smith). Some instances of J. Smith might refer to a Janet Smith, for example, so I should also be able to select in which entries the name is changed and in which ones it is kept.

I find this important because bib files found in the wild rarely get author names right, be it because of accents, abbreviations or missing middle names. There are styles which depend on an author name being written in the exact same way in every entry in order to apply some formatting rule, so I believe this feature/problem has relevance.

I naively tried creating a Python script to solve this issue automatically, but now I know this is impossible to solve without human intervention. In it, I tried using a combination of the initials of a person’s name and their last name (without accents) to detect multiple different names which might belong to the same individual.
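To illustrate the kind of heuristic described above, here is a minimal sketch (not the original script). It assumes names are written in "First [Middle] Last" order with a single-word last name, and it keys on the first initial only, since middle names are often missing. The function names `strip_accents`, `name_key` and `group_candidates` are made up for this example.

```python
import unicodedata
from collections import defaultdict


def strip_accents(text):
    """Remove combining marks, e.g. 'José' becomes 'Jose'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_key(name):
    """Coarse key: (first initial, accent-free lowercase last name).

    Assumption: 'First [Middle ...] Last' order, single-word last name.
    """
    parts = strip_accents(name).replace(".", " ").split()
    if not parts:
        return ("", "")
    first_initial = parts[0][0].lower() if len(parts) > 1 else ""
    return (first_initial, parts[-1].lower())


def group_candidates(names):
    """Group names sharing a key; keep only groups with more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    names = [
        "John Smith",
        "J. Smith",
        "John R. Smith",
        "J. Roosevelt Smith",
        "Janet Smith",
    ]
    for key, members in group_candidates(names).items():
        print(key, members)
```

Note that "Janet Smith" lands in the same group as the John Smith variants. The key can only propose candidates; a person still has to decide which names really belong together.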

I believe a feature like this would be interesting in JabRef. It already takes care to normalize name fields during cleanups, and it also has a powerful feature to fetch author names from CrossRef via an entry’s DOI, which could serve as an additional source of information when deciding on an author’s full name.

I also found another post related to finding author information online which may be helpful in tackling this.

Right now, I am busy with my own research, but I’ve developed Java applications and libraries before, so I might be able to help in the future.

Siedlerchr · January 3, 2019, 11:07am

Hi and Happy New Year to you!

Thanks for your interest in helping to improve JabRef!
Check out How to contribute and how to set up a local workspace.
