File linking functionality update?

Hi, folks! I am very much looking forward to see a new JabRef release with automated PDF fetchers. As of the commit master-8730d02, some of the functions do not work as I guess they are expected to. Since some of the situations are difficult to reproduce and the code base changes rapidly, I am reluctant to file bug reports but would first like to discuss the planned functionality as you see it.

In the mentioned commit, the “Automatically set file links for this entry” does not find files if the “Options|Preferences|File|User regular expression” is set to anything different from [bibtexkey] (e.g. it fails for **/[year][auth].*\.[extension]); the “Rename file” does not work if the “Options|Preferences|Import|File name format pattern” contains underscores (e.g. \year\auth_\firstpage), and new files get downloaded to the main file directory, not to the directory specified in the “File directory pattern” (the further “Rename” works as expected, but the dialog confusingly shows the old dir name, not the new one).

I would expect that “Automatically find fie links for this entry” searches my disk and finds all files matching the External file link search regexp; and that “Get fulltext” fetches and places a new file into a directory constructed from the “File directory pattern”, using the name derived from the “Filename format pattern”. I would also expect the “Move file” pop-up menu command to use the “File directory pattern”, and the “Rename” command to use the “Filename format pattern”.

It would also be beneficial, IMHO, to use the same file pattern syntax in BibTex key generator default pattern, in the File search regexp pattern, in the File name format pattern and in the File directory pattern.

This would allow to implement, e.g., the following tree of a bibliographic database automagically:

saulius@varanas papers/ $ tree bibliography.bib by-author/A | head
bibliography.bib
by-author/A
├── Aakeröy
│   └── 2010_Aakeröy_22.pdf
├── Abadi
│   └── 1991_Abadi_237.pdf
├── Abadjieva
│   ├── 1993_Abadjieva_35.pdf
│   └── 2003_Abadjieva_319.pdf
├── Abad-Zapatero

Here file names have the “[year][auth][firstpage]” pattern, and directories follow the “by-author/[substr(auth,1,1)]/[auth]/” pattern. This works very well for us since at least 2008. Other tree structures can be constructed if necessary.

While trying to implement such functionality in my private branch, I noticed that the JabRef uses two different syntaxes (\year vs. [year]) for the above-mentioned patterns, and that the interpretation of the patterns is scattered in at least three different places in the code:

./src/main/java/org/jabref/logic/layout/LayoutHelper.java:
private void doBracketedField(final int field)

./src/main/java/org/jabref/logic/bibtexkeypattern/BibtexKeyPatternUtil.java:
public static String makeLabel(BibEntry entry, String value, Character keywordDelimiter, BibDatabase database)

./src/main/java/org/jabref/logic/util/io/RegExpBasedFileFinder.java:
public static String expandBrackets(String bracketString, BibEntry entry, BibDatabase database, …

It seems to me that it would be easier for end users to use just one syntax in all the Preference fields, and to have processing of this syntax in just one place. This should also make support and documentation of JabRef easier. I would thus propose to have one helper class (called, for instance, BracketedExpressionExpander), that would take BibEntry instance as an argument to its constructor, and would have a public method generate strings (or another appropriate objects, such as Paths) for the file name patterns; thus for the pattern “**/[year][auth][firstpage].*\.[extension]”, a requested PDF file and an entry:

@Article{Berman2008,
Title = {The {Protein Data Bank}: a historical perspective},
Author = {Berman, Helen M.},
Journal = {Acta Crystallographica Section A},
Year = {2008},
Pages = {88–95},
Volume = {64}
}

it would generate a name search regexp “2008_Berman_88.*\.pdf”.

I would the suggest to use this class in all cases where expansion of bracketed expressions is need, namely for BibTexKey generation, for file search regexp generation, and for download file name and directory name generation.

I would not change the use of LayoutHelper, since, as I understand, it is used for generation of citations according to the journals’ requirements, and changing it might break existing citation style files, which is to bee avoided at all costs.

I would be happy to implement such class, with test coverage :), to integrate it into the file finder and downloader codes, and to provide a pull request for this change. Since it grafts quite deep in to the current code base, however, and since it will be more than a few lines of code, I would like to hear you opinion about the proposed change, and about what would be the most suitable implementation for JabRef so that it can later be integrated into the master branch.

Sincerely yours,
Saulius

3 Likes

Hello Saulius,

thank you very much for your detailed report and your willing to contribute!
Your changes sound reasonable to me and I agree with you that the format pattern stuff is hard to understand (even I sometimes struggle with it :wink: ). (I’ve implemented the file directory pattern stuff a while ago, so in case of questions do not hesitate to ask for help or contact us/me).

We are looking forward to see a Pull Request from you!

To make your and our life easier, please have a look at our contribution Guide:

Regarding the format details: We have currently this help entry which documents all kind of format :
http://help.jabref.org/en/CustomExports#the-layout-file-format

And the other aspect you noticed is for the BibTexKey Generation dialog:
http://help.jabref.org/en/BibtexKeyPatterns

We are happy that you would like to contribute!
Best Regards
Christoph

1 Like

Dear Christoph,

Thank you for your kind reply and for the links! I’ll read the contribution guidelines and follow them. I’ll start my work in the bracketed-expr-handler on my fork (https://github.com/sauliusg/jabref), and discuss with you when some code is ready.

You have also mentioned in Cleanup and file links than file renaming functionality was recently updated. Is it already in the master branch? I guess I need not to get in conflict with that change.

Regards,
Saulius

Hi,

All my changes are in the master branch. If you fork the current version, you are on the safe side.
And I really recommend creating s Wip PR, so after each push you see the results from the continuous integration server which runs all tests

Hi, Christoph,
hi everybody!

I have implemented functionality discussed in my “File linking functionality update?” note. All gradle tests pass, and the functionality works in my test installation. The recent ‘master’ (2017-09-25) is merged into the feature branch holding the suggested changes. Should I now file an issue before making a pull request, or may I add a pull request at once?

Regards,
Saulius

Hi Saulius!

That’s excellent news :smiley:

Just go ahead and create a pull request. There’s no mandatory requirement to have an issue as well. It would be best if you could link to this thread in your pull request, so that it’s easier to understand what it does.

Our policy is that two developers need to review and sign-off the PR before it can be merged and they might have some requests regarding code style and so on.

Best regards
Jörg

Dear Joerg,

thanks for the answer!

That’s excellent news :smiley:

Just go ahead and create a pull request. There’s no mandatory
requirement to have an issue as well. It would be best if you could link
to this thread in your pull request, so that it’s easier to understand
what it does.

Great! I’ll do that.

Our policy is that two developers need to review and sign-off the PR
before it can be merged and they might have some requests regarding code
style and so on.

Good policy :slight_smile:

Regards,
Saulius