How I use reference management software

March 23, 2021 - grad-school my-phd-grind mysterious-grad-school

At some point, I realized that I just couldn't manage all the papers I had read as PDFs on my laptop or keep printed versions of them. To help, I started using reference management software and an e-ink reader.

Journey with Reference Management Software

I first started using Papers3 on Mac, which a coworker introduced to me a few years ago. Papers3 was great until ReadCube stopped supporting it, discontinued development, and decided to build a web-based next-generation version instead. Papers3 has its pros and cons, but it worked reasonably well for me; the big issues I had were mostly the lack of improvements and the weird web version.

Pros

Papers3 can automatically fetch bibtex information from just a PDF. It was not very accurate, but I think it was good enough.
Papers3 supports highlighting and making notes in PDFs.
Papers3 also supports syncing by saving a physical copy of the library (this works with Apple's Time Machine or Dropbox).

Cons

Papers3 only works on Mac, which means I can't use it on my Linux laptop (a Linux version is claimed to )
Papers3 became more and more buggy with each macOS update. I sometimes had to restore from Time Machine to get Papers3 working again.

Eventually I switched to something more open-source and cross-platform.

Mendeley vs Zotero

There are several popular options that work on multiple platforms and are more open-source (see a list of apps and their comparison). The first thing I tried was Mendeley, but I found it overkill for what I wanted.

My Requirements

The ref management app needs to support both Mac and Linux (sorry, I don't care about Windows and decided to stop using it years ago).
The app should support syncing across different devices, ideally not through Dropbox or Google Drive.
The app should have some nice ways to keep track of all kinds of bibtex entries.
And the app should have some ways to make notes or edit PDFs.

Of these requirements, I can live with an imperfect fourth one (notes and PDF editing), but the third one (keeping track of bibtex entries) has to work. The example ACM and USENIX publications below show how many different fields a paper's bibtex entry can contain, and how much the two formats differ.

The ACM format of a bibtex entry:

@article{10.1145/3208104,
	author = {Mace, Jonathan and Roelke, Ryan and Fonseca, Rodrigo},
	title = {Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems},
	year = {2018},
	issue_date = {December 2018},
	publisher = {Association for Computing Machinery},
	address = {New York, NY, USA},
	volume = {35},
	number = {4},
	issn = {0734-2071},
	url = {https://doi.org/10.1145/3208104},
	doi = {10.1145/3208104},
	abstract = {Monitoring and troubleshooting distributed systems is notoriously difficult; potential problems are complex, varied, and unpredictable. The monitoring and diagnosis tools commonly used today—logs, counters, and metrics—have two important limitations: what gets recorded is defined a priori, and the information is recorded in a component- or machine-centric way, making it extremely hard to correlate events that cross these boundaries. This article presents Pivot Tracing, a monitoring framework for distributed systems that addresses both limitations by combining dynamic instrumentation with a novel relational operator: the happened-before join. Pivot Tracing gives users, at runtime, the ability to define arbitrary metrics at one point of the system, while being able to select, filter, and group by events meaningful at other parts of the system, even when crossing component or machine boundaries. We have implemented a prototype of Pivot Tracing for Java-based systems and evaluate it on a heterogeneous Hadoop cluster comprising HDFS, HBase, MapReduce, and YARN. We show that Pivot Tracing can effectively identify a diverse range of root causes such as software bugs, misconfiguration, and limping hardware. We show that Pivot Tracing is dynamic, extensible, and enables cross-tier analysis between inter-operating applications, with low execution overhead.},
	journal = {ACM Trans. Comput. Syst.},
	month = dec,
	articleno = {11},
	numpages = {28},
	keywords = {end-to-end tracing, Distributed systems monitoring}
}

The USENIX format of a bibtex entry:

@inproceedings {199352,
	author = {Aurojit Panda and Sangjin Han and Keon Jang and Melvin Walls and Sylvia Ratnasamy and Scott Shenker},
	title = {NetBricks: Taking the V out of {NFV}},
	booktitle = {12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16)},
	year = {2016},
	isbn = {978-1-931971-33-1},
	address = {Savannah, GA},
	pages = {203--216},
	url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/panda},
	publisher = {{USENIX} Association},
	month = nov,
}

As I found out later, Zotero (paired with the Better Bibtex plugin and the browser extension) meets all of my requirements.

Tips and Tricks of Using Zotero

Browser Extension

The Zotero browser extension is one of those fantastic things you should start using immediately. Once you land on a publication page on USENIX or ACM, the extension can extract the whole bibtex entry along with the paper's PDF. I don't think I need to explain more why this is fantastic :)

Better Bibtex Plugin

The Better Bibtex plugin is another useful tool that lets you customize the bibtex cite key however you want. As we saw above, the keys generated by ACM and USENIX (10.1145/3208104 and 199352) are basically impossible to remember. For more, you can read the Better Bibtex FAQ.
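To make the idea of a memorable cite key concrete, here is a minimal Python sketch. It is not Better Bibtex's actual algorithm or its pattern syntax; the function name make_cite_key, the stop-word list, and the "first author's last name + year + first significant title word" scheme are all my own assumptions for illustration.

```python
import re

# Hypothetical stop-word list; a real tool would likely use a longer one.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "to", "with"}


def first_author_last_name(author_field: str) -> str:
    """Return the lowercased last name of the first author in a bibtex author field."""
    first = author_field.split(" and ")[0].strip()
    if not first:
        return ""
    if "," in first:
        # "Last, First" form, as in the ACM entry.
        last = first.split(",")[0]
    else:
        # "First Last" form, as in the USENIX entry.
        last = first.split()[-1]
    return re.sub(r"[^A-Za-z]", "", last).lower()


def first_title_word(title: str) -> str:
    """Return the first title word that is not a stop word, lowercased."""
    for word in re.findall(r"[A-Za-z]+", title):
        if word.lower() not in STOP_WORDS:
            return word.lower()
    return ""


def make_cite_key(author_field: str, year: str, title: str) -> str:
    """Build an illustrative key such as 'mace2018pivot' (assumed scheme)."""
    return f"{first_author_last_name(author_field)}{year}{first_title_word(title)}"


# The two entries shown earlier in this post.
print(make_cite_key(
    "Mace, Jonathan and Roelke, Ryan and Fonseca, Rodrigo",
    "2018",
    "Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems",
))  # mace2018pivot
print(make_cite_key(
    "Aurojit Panda and Sangjin Han and Keon Jang",
    "2016",
    "NetBricks: Taking the V out of {NFV}",
))  # panda2016netbricks
```

Something like mace2018pivot is much easier to type in a \cite{} than 10.1145/3208104. In practice I let the plugin generate the keys; the sketch only shows the kind of result you get.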

Lastly, I found that the extracted title is not always capitalized properly, which can be fixed by tuning one of Zotero's hidden preferences.

Magic option

A couple of related issues: