Project: LaTeX Integration

I am interested in the LaTeX integration project for GSoC this year and would like to know more about it. Will the user import a LaTeX file as a new library, with JabRef then analyzing the file for all bibliographic references and creating corresponding entries in the library?
The expected outcome also mentions tracing references in auxiliary files. I am not clear on what exactly these auxiliary files are or how JabRef would interact with them. Could you please elaborate on that part?

Hi Yash,
one could think of many possibilities to improve the integration of JabRef with the current LaTeX project an author is working on. For example, some users have one big LaTeX bibliography they use for several paper projects. Wouldn’t it be cool to see how often and which paper one has cited in all these articles?
To enable such use cases, one would need to parse the LaTeX files and further auxiliary files (.aux, .bbl, etc.) to get the required information.
Another idea would be to incorporate further tools that improve the quality of the LaTeX code. For example, textools could be integrated into JabRef.

Cheers,
Linus


Hi Linus, thanks for writing back. I read about aux and bbl files and understood that these files contain references for the LaTeX and BibTeX files.

However, I am still not clear on how the user will interact with JabRef. Will the user import a single LaTeX file into the current library which contains a number of bibliographical entries, and based on these entries the user is presented with a statistic?
Or would the user import an entire project containing a number of LaTeX and BibTeX files, with JabRef parsing the entire project and presenting the statistics?

P.S.: I have not worked on LaTeX projects before, so I am not sure how such projects are usually structured.

Suppose we have a simple LaTeX document:

\documentclass{article}

\begin{document}

Random citation \cite{DUMMY:1} embedded in text.\newline
Random citation \cite{DUMMY:2} embedded in text.\newline
Random citation \cite{DUMMY:1} embedded in text again.\newline
Random citation \cite{DUMMY:2} embedded in text again.\newline

\newpage

\bibliography{test} 
\bibliographystyle{ieeetr}

\end{document}

and a BibTeX document:

@BOOK{DUMMY:1,
  AUTHOR="John Doe",
  TITLE="The Book without Title",
  PUBLISHER="Dummy Publisher",
  YEAR="2100",
}

@BOOK{DUMMY:2,
  AUTHOR="Kane Doe",
  TITLE="The Book without Title 2",
  PUBLISHER="Dummy Publisher 2",
  YEAR="2101",
}

The aux and bbl files generated are:
aux:

\relax 
\citation{DUMMY:1}
\citation{DUMMY:2}
\citation{DUMMY:1}
\citation{DUMMY:2}
\bibdata{test}
\bibcite{DUMMY:1}{1}
\bibcite{DUMMY:2}{2}
\bibstyle{ieeetr}

bbl:

\begin{thebibliography}{1}

\bibitem{DUMMY:1}
J.~Doe, {\em The Book without Title}.
\newblock Dummy Publisher, 2100.

\bibitem{DUMMY:2}
K.~Doe, {\em The Book without Title 2}.
\newblock Dummy Publisher 2, 2101.

\end{thebibliography}

I am assuming the LaTeX, bbl and aux files are provided by the user.
Some of my initial observations:

  • The number of times a citation is used in a LaTeX file would be the number of times that particular citation occurs in the aux file.

  • To find the positions of those citations in the LaTeX file, a regular expression could search for patterns of the form \cite*{bibtexKey}.
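
The first observation can be sketched in a few lines. This is only an illustration in Python (JabRef itself is written in Java), and it assumes the classic BibTeX workflow shown above, where every \cite writes one \citation line into the aux file. The name count_aux_citations is made up for this example.

```python
import re
from collections import Counter

# Matches one \citation{...} entry as written into the .aux file.
# \bibcite{...} lines are not matched by this pattern.
AUX_CITATION = re.compile(r"\\citation\{([^}]*)\}")


def count_aux_citations(aux_text: str) -> Counter:
    """Count how often each key is cited according to the aux file content."""
    counts = Counter()
    for match in AUX_CITATION.finditer(aux_text):
        # One entry may list several comma-separated keys.
        for key in match.group(1).split(","):
            key = key.strip()
            if key:
                counts[key] += 1
    return counts


# The aux file from the example above.
SAMPLE_AUX = r"""\relax
\citation{DUMMY:1}
\citation{DUMMY:2}
\citation{DUMMY:1}
\citation{DUMMY:2}
\bibdata{test}
\bibcite{DUMMY:1}{1}
\bibcite{DUMMY:2}{2}
\bibstyle{ieeetr}
"""

counts = count_aux_citations(SAMPLE_AUX)
for key, n in sorted(counts.items()):
    print(key, n)
```

For the sample aux file this reports two uses each for DUMMY:1 and DUMMY:2, matching the four \cite commands in the LaTeX document.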

Edit: I just noticed that JabRef already has an aux import tool, which extracts from a bibliography database the entries referenced in an aux file.

I guess that, from the user's perspective, the easiest approach is to specify, once, a folder containing all the LaTeX files one is interested in. JabRef should then walk through each LaTeX file and generate a corresponding list of cited bib entries.

One could use the aux file for this, but as you noticed, it does not provide any context (i.e. the surrounding text). Moreover, aux files are only generated and updated when one runs the LaTeX compiler. Thus, I would prefer to work directly on the LaTeX files.
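
To make the "walk the folder and keep the context" idea concrete, here is a rough Python sketch (not JabRef code; JabRef is written in Java). The names Citation and find_citations are hypothetical. The regular expression is an assumption that covers \cite and common variants such as \citep or \parencite with up to two optional arguments; it does not skip commented-out lines or citations that span several lines.

```python
import re
import tempfile
from pathlib import Path
from typing import NamedTuple

# Assumption: \cite, \cite*, \citep, \citet, \parencite, ... with up to two
# optional [..] arguments, followed by {key1, key2, ...} on the same line.
CITE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?(?:\[[^\]]*\]){0,2}\{([^}]*)\}")


class Citation(NamedTuple):
    key: str
    path: Path
    line_number: int  # 1-based
    context: str      # the source line that contains the citation


def find_citations(folder: Path) -> list[Citation]:
    """Scan every .tex file below folder and collect citations with context."""
    found = []
    for path in sorted(folder.rglob("*.tex")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in CITE.finditer(line):
                for key in match.group(1).split(","):
                    key = key.strip()
                    if key and key != "*":  # ignore \nocite{*}
                        found.append(Citation(key, path, line_number, line.strip()))
    return found


# Small demonstration on a temporary folder with one LaTeX file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "paper.tex").write_text(
        "Intro.\n"
        "Random citation \\cite{DUMMY:1} embedded in text.\n"
        "See also \\citep[p.~3]{DUMMY:2, DUMMY:1}.\n",
        encoding="utf-8",
    )
    citations = find_citations(Path(tmp))

for c in citations:
    print(f"{c.key}  {c.path.name}:{c.line_number}  {c.context}")
```

Unlike the aux file, this works without running the LaTeX compiler, and each hit carries the surrounding line so it can be shown to the user.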


A typical user probably uses 3 types of software:

  • JabRef :wink:
  • A LaTeX distribution (one among several)
  • A LaTeX editor (several are already known to JabRef)

As a user, when I read “JabRef and LaTeX integration”, I think about what JabRef can extract from LaTeX files (as discussed above), but also about what the LaTeX editor could ask JabRef for. For example, the LaTeX editor could ask JabRef:

  • to go to entry “Dummy:1” in the entry editor.
  • to open the PDF of entry “Dummy:1”.
  • to go to the field year of entry “Dummy:1” (because the log file contains a warning about a missing year in this entry)

Can JabRef currently do such things? Through the CLI, maybe?


Thank you for clarifying, @tobiasdiez.

Here is my initial idea on how the tool will be used:

  • The user selects the directory containing the LaTeX files. JabRef parses all LaTeX files and creates a set of citations. Each item in the set contains the citation key, a list of the files in which the key is used, and, for each file, a list of the line numbers where the citation appears.

    For parsing, I am thinking of reading each file line by line and searching for the \cite*{(.*)} pattern; once a key is found, it is added to the set of citation keys.

  • The interface that I have in mind looks like this:

    The citation keys pane shows all citations obtained from parsing the LaTeX files. When the user clicks on a citation key, the files pane is updated with the files that use it, and the content of each LaTeX file can be viewed and edited.

    Also, when a LaTeX file is viewed, a find-next option is available (similar to the one in web browsers) to jump to the lines where the citation key is used.
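
The data structure and the find-next behaviour described above could look roughly like this. It is a Python sketch under my own assumptions (JabRef is written in Java); build_index and find_next are hypothetical names. One caveat about the pattern: written literally as a regular expression, \cite*{(.*)} would need the backslash and braces escaped, and the greedy (.*) would swallow everything up to the last closing brace on the line, so the sketch uses [^}]* for the key list instead.

```python
import re
from collections import defaultdict

# Simplest form only: \cite{...} or \cite*{...}; [^}]* stops at the first "}".
CITE = re.compile(r"\\cite\*?\{([^}]*)\}")


def build_index(files: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map citation key -> file name -> 1-based line numbers where it is cited."""
    index = defaultdict(lambda: defaultdict(list))
    for name, text in files.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in CITE.finditer(line):
                for key in match.group(1).split(","):
                    key = key.strip()
                    if key:
                        index[key][name].append(line_number)
    return {key: dict(per_file) for key, per_file in index.items()}


def find_next(lines: list[int], current_line: int) -> int:
    """Return the next cited line after current_line, wrapping to the first."""
    for line in lines:
        if line > current_line:
            return line
    return lines[0]


# File contents are passed in memory here; a real tool would read them from disk.
files = {
    "intro.tex": "Text \\cite{DUMMY:1}.\nMore text.\nAgain \\cite{DUMMY:1} and \\cite{DUMMY:2}.\n",
    "method.tex": "See \\cite{DUMMY:2}.\n",
}
index = build_index(files)
print(index["DUMMY:1"])  # files and lines for the selected key
print(find_next(index["DUMMY:1"]["intro.tex"], 1))  # jump from line 1 to the next hit
```

Clicking a key in the citation keys pane would then correspond to a lookup in the index, and the find-next button to a call to find_next with the current cursor line.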

@mlep Thanks for the feedback :).
I haven’t really used the CLI of JabRef, but a quick glance over the CLI documentation tells me that JabRef does not provide that functionality yet.
It sounds like nice functionality to have. However, is it not common to have JabRef open when editing a LaTeX project? Of course, one would still have to search for the right entry, but the global search could come in handy then.

Indeed, it is quite common to have JabRef open and to use its global search.
That is exactly the point: it is common because of the lack of integration! :wink:


biblatex styles have a nice option to create back references: in the references section of the generated PDF, you can click on a reference, e.g. "Doe John, Book without Title, 2010. Dummy Publisher.", and jump to the passage where it is cited,
e.g. "Sample citation (Doe, 2019)".

I am planning on something similar.
Once a citation is selected (after parsing), the LaTeX files containing that citation are shown. You can then click on any one of those files and jump to the line containing the selected citation. You can also go to the next instance of that citation in the same file by clicking, e.g., a find-next button.


Here is a mock UI I put together:

Am I going in the right direction with the project?
