The files whose names begin with an underscore were different in some way.

I don't recall how.

Perhaps pdfminer failed to parse the highlights from their abstracts and we had to handle them entirely manually?
