Variable filename format pattern (and file directory pattern)

I would like the filename format pattern to produce different results for entrytype article or entrytype book.
In the citation key generator JabRef contains the option to deviate from the standard depending on the entrytype. In Filename format pattern and File directory pattern there is only one pattern definition.

The export filter describes a conditional option, but it only tests whether a field has content or not; it does not allow defining a different format depending on the contents of a field.
I would like to have a conditional option that checks the contents of one field (entryType) to decide what filename format to use.

Any Ideas?

Hi,

the only idea I have would be to use the regex modifier to test for an entry type…
But no idea if this works for your case.
Otherwise it’s not possible. The filename and file directory patterns are already complex enough.

Those were my thoughts as well, and that leads to the next question: how to get fields (with modifiers and regex) into the replacement part of a regex?

For the file directory, we store most document types in a folder named after the entry type. However, for copies of individual publications the volume is too large to simply store everything in one folder named "Article" or "Journal". For these entries we make separate folders named:
./Journal_[first letter of journal title]/[journal title]/[year]/
So I need a complex regex based on the entry type:
[EntryType:regex("search","replace")]
with "Article|Journal" as "search" and "Journal_[journal:latex_to_unicode:regex('[{}]',''):truncate1]/[journal:latex_to_unicode:regex('[{}]','')]" as "replace".
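To make the goal concrete, here is a rough Python sketch of the directory logic described above. It is not JabRef code: the function name desired_directory and its arguments are made up for illustration, and it only mimics what the pattern is supposed to produce.

```python
import re

def desired_directory(entry_type: str, journal: str, year: str) -> str:
    """Hypothetical helper: the directory layout described in the post."""
    if entry_type.lower() in ("article", "journal"):
        # Strip LaTeX protection braces, as regex('[{}]','') is meant to do.
        title = re.sub(r"[{}]", "", journal)
        # First letter of the journal title, then the title, then the year.
        return f"Journal_{title[:1]}/{title}/{year}"
    # All other entry types share one folder named after the entry type.
    return entry_type

print(desired_directory("Article", "{Molecular} Aspects of Medicine", "1999"))
# Journal_M/Molecular Aspects of Medicine/1999
print(desired_directory("Book", "", "2006"))
# Book
```

The difficulty discussed in this thread is that the pattern syntax offers no such if/else on the entry type.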

I tried with quotes around the entire expression: fails
I tried with quotes around the literals and the JabRef fields unquoted in between: fails

Does anybody have an idea how to get a mixture of literals and JabRef fields as the replacement value in a regex and, to complicate things, with further regex functions inside this replacement term?

Even if we assume your regex is otherwise correct (I have not tested it), I think it cannot work because there are special characters that are not escaped with a backslash. Have a look at this part of the JabRef docs: searching for strings with a special character.

Escaping does not solve the issue. If I use [EntryType:regex("Article|Journal","[journal]")]/[YEAR]
the file is stored in a directory that is literally named [journal], not in a directory named after the value of the journal field.

It seems that the JabRef regex modifier cannot use fields as placeholders within the replacement term.
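The observed behaviour is what a plain regex replace would give. Here is a small Python sketch, assuming the modifier simply runs a search-and-replace on the field value and never re-parses the replacement text as a pattern (an assumption, not something checked against the JabRef source):

```python
import re

entry_type = "Article"

# The replacement is an ordinary string; nothing expands "[journal]" afterwards.
directory = re.sub(r"Article|Journal", "[journal]", entry_type) + "/1999"

print(directory)  # [journal]/1999
```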

I found a solution. It works, but it is not elegant and not how this should be solved, because it actually abuses the citationkey field.

In the library I have created different citation key patterns per entry type:

  • Article:
    [journal:unprotect_terms:regex("[{}]",""):truncate1]/[journal:unprotect_terms:regex("[{}]",""):regex(" ","_")]/[YEAR]/___[journal:abbr]_[YEAR].comma._[volume][number:regex(".+", "\.po\.$0\.pc\.")].comma._[firstpage].hyphen.[lastpage]

  • Book:
    ___[booktitle:unprotect_terms:regex("[{}]",""):regex(" ","_")][booksubtitle:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+","_.hyphen._$0")][edition:regex(".+",".comma._$0_ed.")].comma._[YEAR]

  • inBook:
    ___[booktitle:unprotect_terms:regex("[{}]",""):regex(" ","_")][booksubtitle:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+","_.hyphen._$0")][edition:regex(".+",".comma._$0_ed.")][chapter:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+",".comma._ch._$0")].comma._[YEAR].comma._[firstpage].hyphen.[lastpage]

These citation keys are used to define the filename format and file directory using the patterns below:

  • Filename format pattern:
    [citationkey:regex("^(.*)___(.*)$","$2"):regex(".comma.",","):regex(".hyphen.","-"):regex(".po.","\("):regex(".pc.","\)"):regex("_"," ")]

  • File directory pattern:
    [EntryType:regex("InBook","Book"):regex("Article","Journal")]/[citationkey:regex("^(.*)___(.*)$","$1")]

The citation keys consist of two parts: the directory part and the filename part, separated by three underscores.
This stores each entry's attached files in a directory named after the entry type, with subdirectories for articles based on the journal title and publication year, and no subdirectories for books (for the entry types Book and inBook the citation key starts with the three underscores).
Since the citationkey field apparently does not allow spaces and several other symbols, some placeholders are introduced in the citationkey, which are replaced with the actually preferred symbols in the filename and directory format patterns.
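The two patterns can be mimicked outside JabRef to check what they produce. The Python sketch below assumes each regex modifier is a plain left-to-right search-and-replace; the function names and the shortened sample key are made up for illustration. Python writes \1 and \2 where the JabRef patterns use $1 and $2.

```python
import re

SPLIT = r"^(.*)___(.*)$"  # directory part, three underscores, filename part

def file_directory(entry_type: str, key: str) -> str:
    # [EntryType:regex("InBook","Book"):regex("Article","Journal")]
    folder = re.sub("Article", "Journal", re.sub("InBook", "Book", entry_type))
    # [citationkey:regex("^(.*)___(.*)$","$1")]
    return folder + "/" + re.sub(SPLIT, r"\1", key)

def file_name(key: str) -> str:
    # [citationkey:regex("^(.*)___(.*)$","$2"):regex(".comma.",","):...]
    name = re.sub(SPLIT, r"\2", key)
    for search, replacement in [(".comma.", ","), (".hyphen.", "-"),
                                (".po.", "("), (".pc.", ")"), ("_", " ")]:
        name = re.sub(search, replacement, name)
    return name

# Shortened, made-up Article key in the same style as the patterns above.
key = "M/Molecular_Aspects_of_Medicine/1999/___MAoM_1999.comma._1.hyphen.137"
print(file_directory("Article", key))  # Journal/M/Molecular_Aspects_of_Medicine/1999/
print(file_name(key))                  # MAoM 1999, 1-137
```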

I have not found any other field in JabRef where I can use different definitions per entrytype and that can be cited in the file and directory format patterns.

Anybody?

Respect. You are doing quite advanced stuff. Would need to do quite a bit of trial and error and some digging just to follow you.

Could you give a simple example with screenshot of

  • Biblatex source tab
  • relevant preferences (e.g. citationkey pattern)
  • screenshot of resulting file name and resulting file directory

?

Do you put all patterns for article, book, inbook etc. behind each other so that it is one very long pattern? Edit: Never mind. There are different sections for each entry type. Got it!

Just some examples that hopefully clarify how it works:

  • Article example
    [journal:unprotect_terms:regex("[{}]",""):truncate1]/[journal:unprotect_terms:regex("[{}]",""):regex(" ","_")]/[YEAR]/___[journal:abbr]_[YEAR].comma._[volume][number:regex(".+", "\.po\.$0\.pc\.")].comma._[firstpage].hyphen.[lastpage]

@Article{M/Molecular_Aspects_of_Medicine/1999/___MAoM_1999.comma._20.po.12.pc…comma._1.hyphen.137,
author = {Julia A Hasler and Ronald Estabrook and Michael Murray and Irina Pikuleva and Michael Waterman and Jorge Capdevila and Vijakumar Holla and Christian Helvig and John R Falck and Geoffrey Farrell and Laurence S Kaminsky and Simon D Spivack and Eric Boitier and Philippe Beaune},
title = {Human cytochromes P450},
date = {1999-02},
issn = {0098-2997},
number = {1-2},
pages = {1–137},
volume = {20},
doi = {10.1016/s0098-2997(99)00005-9},
file = {:Journal/M/Molecular_Aspects_of_Medicine/1999/MAoM 1999, 20(12), 1-137.pdf:PDF},
journaltitle = {Molecular Aspects of Medicine},
keywords = {Cytochrome P450, Xenobiotics, Drug metabolism, Genetic polymorphisms, Steroid hormone synthesis, Fatty acid epoxidation, Eicosaniod metabolism, Disease, Chemical carcinogenesis, Cancer, Autoantibodies},
publisher = {Elsevier {BV}},
url = {https://www.sciencedirect.com/science/article/pii/S0098299799000059},
}

  • Book example
    ___[booktitle:unprotect_terms:regex("[{}]",""):regex(" ","_")][booksubtitle:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+","_.hyphen._$0")][edition:regex(".+",".comma._$0_ed.")].comma._[YEAR]

@Book{___Drug_Metabolism_.hyphen._Current_Concepts.comma._2006,
booktitle = {Drug Metabolism},
editor = {Ionescu, Corina and Caira, Mino R.},
publisher = {SPRINGER NATURE},
booksubtitle = {Current Concepts},
isbn = {1402041411},
title = {Drug Metabolism: Current Concepts},
date = {2006-02},
doi = {10.1007/1-4020-4142-X},
ean = {9781402041419},
file = {:Book/Drug Metabolism - Current Concepts, 2006.pdf:PDF},
pagetotal = {422},
url = {https://www.ebook.de/de/product/5305610/drug_metabolism_current_concepts.html},
}

  • inBook example
    ___[booktitle:unprotect_terms:regex("[{}]",""):regex(" ","_")][booksubtitle:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+","_.hyphen._$0")][edition:regex(".+",".comma._$0_ed.")][chapter:unprotect_terms:regex("[{}]",""):regex(" ","_"):regex(".+",".comma._ch._$0")].comma._[YEAR].comma._[firstpage].hyphen.[lastpage]

@InBook{___The_Organic_Chemistry_of_Drug_Design_and_Drug_Action.comma._ch._7_Drug_Metabolism.comma._1992.comma._277.hyphen.351,
author = {Richard B. Silverman},
booktitle = {The Organic Chemistry of Drug Design and Drug Action},
chapter = {7: Drug Metabolism},
editor = {Richard B. Silverman},
pages = {277–351},
publisher = {Academic Press},
year = {1992},
address = {San Diego},
isbn = {978-0-12-643730-0},
title = {CHAPTER 7: - Drug Metabolism},
doi = {10.1016/B978-0-08-057123-2.50011-9},
file = {:Book/The Organic Chemistry of Drug Design and Drug Action, ch. 7 Drug Metabolism, 1992, 277-351.pdf:PDF},
url = {https://www.sciencedirect.com/science/article/pii/B9780080571232500119},
}

and for conversion to directory structure and filename:

  • directory
    [EntryType:regex("InBook","Book"):regex("Article","Journal")]/[citationkey:regex("^(.*)___(.*)$","$1")]
  • filename
    [citationkey:regex("^(.*)___(.*)$","$2"):regex(".comma.",","):regex(".hyphen.","-"):regex(".po.","\("):regex(".pc.","\)"):regex("_"," ")]
    As you can see, the path in the file field is derived from the entry type (with Article replaced by Journal and InBook by Book) and the citationkey.

For those who are less experienced with regular expressions, the term [citationkey:regex("^(.*)___(.*)$","$1")] works as follows:
It searches the citationkey field for anything that matches the definition: .* (any number of characters), followed by ___ (three underscores), followed by .* (any number of characters).
By default, regex stores the matching part of the term in a buffer that can be referenced in the replacement term using $0. If you want to reuse only a part of the match, you have to define subparts using parentheses. In my example ^(.*)___(.*)$ I have defined two subparts: (.*) before the three underscores and (.*) after the three underscores. I can reference these in the replacement term with $1 and $2 respectively. I use $1 in the file directory pattern and $2 in the filename pattern.
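For anyone who wants to try the groups outside JabRef, here is a small Python sketch with made-up keys. Python calls the groups group(0), group(1) and group(2) where the replacement term uses $0, $1 and $2; the helper name split_key is hypothetical.

```python
import re

SPLIT = r"^(.*)___(.*)$"

def split_key(key: str):
    """Return (whole match, group 1, group 2), or None if the key has no '___'."""
    m = re.match(SPLIT, key)
    return (m.group(0), m.group(1), m.group(2)) if m else None

print(split_key("A/Some_Journal/2020/___SJ_2020"))
# ('A/Some_Journal/2020/___SJ_2020', 'A/Some_Journal/2020/', 'SJ_2020')

print(split_key("___Some_Book_2006"))
# ('___Some_Book_2006', '', 'Some_Book_2006')   group 1 is empty

print(split_key("NoSeparatorHere"))
# None

# Without a match, a regex replace leaves the text untouched.
print(re.sub(SPLIT, r"\1", "NoSeparatorHere"))
# NoSeparatorHere
```

The last case suggests that a key without the three underscores would end up, unchanged, in both the directory and the filename, assuming JabRef's replace behaves like the one above.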

The problem that arises from using the citationkey is that the citation key does not allow every character you might like to use. I have not checked the manual to find out which characters are forbidden in the citationkey; I just used trial and error. To circumvent this issue I used a trick from the pre-SQL, 7-bit ASCII text-file databases, where many characters were not available (7-bit ASCII) and quite a few were reserved for storing instructions to the interpreters of the files. In those systems we used descriptive replacements between two dots for any symbol that was not available or reserved.
So, the citation key contains .comma. and the filename pattern replaces it with a comma (,); similarly for .hyphen., ...
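One caveat, assuming standard regex semantics: in a search term like ".po." the dots are not literal dots but match any character, so the placeholder search can also hit ordinary words. A small Python sketch with a made-up title fragment:

```python
import re

text = "Annual_Report"

# Unescaped dots: "epor" matches (any char, "po", any char) and gets replaced.
loose = re.sub(".po.", "(", text)

# Escaped dots: only the literal placeholder ".po." is replaced.
strict = re.sub(r"\.po\.", "(", text)
placeholder = re.sub(r"\.po\.", "(", "20.po.12")

print(loose)        # Annual_R(t
print(strict)       # Annual_Report
print(placeholder)  # 20(12
```

Escaping the dots in the search terms should avoid such accidental hits. How the backslashes have to be written inside a JabRef pattern is not checked here.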

@ggieling JabRef has a built-in list of characters that are not allowed, but you can adjust them: see "Customize the citation key generator" in the JabRef documentation.

For generating the actual file names, JabRef has a list of illegal characters that will automatically be replaced by an underscore.

Interesting,

I have left the "Remove the following characters" setting in Citation Key patterns at its default (default value: -`ʹ:!;?^+). This explains the removal of the hyphen, but not of the comma and the space. I checked your code extract, and if the list of ILLEGAL_CHARS is a list of ASCII codes, the space (ASCII 32) and the comma (ASCII 44) are both not listed.
Is there perhaps an additional list of characters that are not allowed?
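A quick way to double-check that reasoning is a few lines of Python that test the three characters against the default removal string quoted above (copied as posted; this says nothing about any other list JabRef may apply):

```python
default_removed = "-`ʹ:!;?^+"  # default value as quoted in the post

# Character -> (ASCII code, whether it is in the default removal list)
report = {ch: (ord(ch), ch in default_removed) for ch in ["-", ",", " "]}

for ch, (code, removed) in report.items():
    print(repr(ch), code, removed)
# '-' 45 True
# ',' 44 False
# ' ' 32 False
```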

Just for completeness, the "Replace" setting in the citation key patterns is also in its default, empty state.