MEDLINE cleanup ops: ProQuest

This is another example of MEDLINE data (continuing from the discussion about importing PubMed records) demonstrating why the data need to be moved and transformed before they are standardized in a BibTeX library. (I am collecting notes, not making a feature request.)

MEDLINE data in RIS format from ProQuest look like this:

AN  - 2547535916; 34192973
KW  - bone remodeling
KW  - locking compression plate
KW  - mechanobiology of bone healing
KW  - stress shielding
KW  - Index Medicus
KW  - Biomechanical Phenomena
KW  - Analysis of Variance
KW  - Bone Screws
KW  - Bone Plates
KW  - Tibia -- surgery
KW  - Fracture Fixation, Internal -- methods
UR  - … …
  • In ProQuest exports, the MEDLINE PMID maps to AN in RIS and is excluded entirely from BibTeX.
    • The PMID is identifiable by position (second), length, and co-occurrence with DB - MEDLINE®.
  • MEDLINE OT, MH, and possibly other fields map to KW in ProQuest.
    • Lowercase and sentence case signify keywords originating as OT in MEDLINE.
    • Title-case signifies keywords originating as MH in MEDLINE.
    • Two dashes signify a subheading (comparable to MH - Tibia/surgery in PubMed). Unlike PubMed, ProQuest keeps heading-subheading combinations separate.
    • MEDLINE major topics are missing from the RIS data. The PubMed record might include:
      • MH - *Tibia/surgery or
      • MH - Tibia/*surgery
  • ProQuest records don’t include an indication of their source, other than what is apparent from the URL.
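
The case conventions above can be turned into a rough classifier. This is only a sketch of the heuristic as I described it, not anything ProQuest documents: the function name `classify_keyword` and the `MINOR_WORDS` list are my own, and a single capitalized word with no subheading is left as "unknown" because title case and sentence case look the same there.

```python
import re

# Short words that stay lowercase inside title-case headings
# (e.g. "Analysis of Variance"). This list is an assumption.
MINOR_WORDS = {"a", "an", "and", "for", "in", "of", "on", "or", "the", "to", "with"}


def classify_keyword(kw):
    """Guess the MEDLINE origin of a ProQuest KW value: 'MH', 'OT', or 'unknown'."""
    heading, sep, _subheading = kw.partition(" -- ")
    if sep:
        # Two dashes mark a heading -- subheading pair, so treat it as MH.
        return "MH"
    words = [
        w for w in re.findall(r"[A-Za-z][\w'-]*", heading)
        if w.lower() not in MINOR_WORDS
    ]
    if not words:
        return "unknown"
    if len(words) == 1:
        # One capitalized word could be title case or sentence case.
        return "unknown" if words[0][0].isupper() else "OT"
    # Title case: every significant word is capitalized.
    return "MH" if all(w[0].isupper() for w in words) else "OT"
```

Values such as Index Medicus are title-case too, so this heuristic labels them MH; whether that is correct needs checking against the PubMed record.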

Complete MEDLINE data are available from PubMed, but there are still reasons to use other platforms, such as more sophisticated query language and the ability to search multiple databases at once.

Personally, I would make at least two changes to the RIS records before importing, to standardize MEDLINE data across sources.

  1. Move the PMID data from AN to PMID (and include it in the import)
  2. Format headings and subheadings to match regardless of the source (e.g., replace -- in subheadings with / or vice versa)
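
Here is a minimal sketch of those two changes applied to one RIS record before import. Several things are assumptions on my part: the function name `normalize_record`, the non-standard `PMID` tag (the importer would have to be told to read it), the rule that a PMID is the second AN value and has at most eight digits, and the choice to rewrite ` -- ` as `/` rather than the other way around. Records without a DB line containing MEDLINE keep their AN field untouched.

```python
import re

# RIS field line: two-character tag, two spaces, hyphen, space, value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - (.*)$")


def normalize_record(lines, pmid_tag="PMID"):
    """Return a copy of one RIS record with the PMID split out of AN
    and KW subheadings written as heading/subheading."""
    fields = [TAG_LINE.match(line) for line in lines]
    # Only trust the AN position rule when the record says it is MEDLINE.
    is_medline = any(
        m and m.group(1) == "DB" and "MEDLINE" in m.group(2) for m in fields
    )
    out = []
    for line, m in zip(lines, fields):
        if m is None:
            out.append(line)  # continuation or blank line: keep as-is
            continue
        tag, value = m.groups()
        if tag == "AN" and is_medline:
            ids = [part.strip() for part in value.split(";")]
            # Assumption: second value, digits only, at most 8 long.
            if len(ids) == 2 and re.fullmatch(r"\d{1,8}", ids[1]):
                out.append(f"AN  - {ids[0]}")
                out.append(f"{pmid_tag}  - {ids[1]}")
                continue
        if tag == "KW":
            line = f"KW  - {value.replace(' -- ', '/')}"
        out.append(line)
    return out
```

With the sample above plus a `DB  - MEDLINE®` line, `AN  - 2547535916; 34192973` becomes `AN  - 2547535916` followed by `PMID  - 34192973`, and `KW  - Tibia -- surgery` becomes `KW  - Tibia/surgery`.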