The Altmetric team

Improved logic to track ISBNs in policy documents

AUTHOR: The Altmetric team

1. What's new?

We improved our logic for detecting ISBNs in policy documents, so that we extract as many valid ISBNs as possible while minimizing the risk of incorrect book mentions.

2. Value and user impact

As part of our continuous effort to monitor and improve the quality of our data, we found that, historically, we have been tracking a number of invalid book mentions from the policy documents in our system.

Some policy documents, especially those containing several tables, may include long numbers entered as free text. In some cases, these long numbers happen to match real ISBNs that belong to published books.
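To illustrate why a free-text number can pass for an ISBN, here is a minimal sketch of the standard ISBN-13 check-digit rule. It is not Altmetric's code: the function name `passes_isbn13_check` and the sample value are assumptions made for this example only.

```python
def passes_isbn13_check(number: str) -> bool:
    """Return True if a 13-digit string satisfies the ISBN-13 check-digit rule."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        return False
    # ISBN-13 codes start with the 978 or 979 prefix.
    if not number.startswith(("978", "979")):
        return False
    # Weights alternate 1, 3, 1, 3, ... and the weighted sum must be divisible by 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(number))
    return total % 10 == 0


# Hypothetical value standing in for a long number found in a table cell.
table_cell = "9780306406157"
print(passes_isbn13_check(table_cell))  # True

# For any fixed first 12 digits, exactly one of the ten possible last digits passes.
matches = sum(passes_isbn13_check("978030640615" + str(d)) for d in range(10))
print(matches)  # 1
```

Because one in ten 13-digit numbers with a valid prefix passes this check, validating the check digit alone cannot tell a table figure apart from a genuine book reference.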

This is why, in the Altmetric Explorer, we noticed mentions of books that are completely unrelated to the policy documents we tracked them from.

To minimize the risk of logging incorrect book mentions, we improved our logic for detecting ISBNs in policy documents: our system now extracts ISBN-10 and ISBN-13 codes from policy documents only when they are prefixed with the letters ISBN. This lets us keep extracting valid ISBNs while ignoring free-text numbers that may lead to incorrect mentions.
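The prefix rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the production logic: the pattern `ISBN_PREFIXED`, the function `extract_prefixed_isbns`, and the choice to accept only hyphens as separators are all hypothetical simplifications.

```python
import re

# Hypothetical pattern: the letters "ISBN", an optional "-10"/"-13" label,
# an optional colon, then digits and hyphens ending in a digit or X.
ISBN_PREFIXED = re.compile(
    r"\bISBN(?:-1[03])?:?\s*([0-9][0-9\-]{8,15}[0-9Xx])(?![0-9])",
    re.IGNORECASE,
)


def extract_prefixed_isbns(text: str) -> list[str]:
    """Return ISBN-10/ISBN-13 candidates that directly follow an 'ISBN' prefix."""
    found = []
    for match in ISBN_PREFIXED.finditer(text):
        candidate = match.group(1).replace("-", "").upper()
        # Keep 13 plain digits, or 10 characters where only the last may be X.
        if len(candidate) == 13 and candidate.isdigit():
            found.append(candidate)
        elif len(candidate) == 10 and candidate[:9].isdigit():
            found.append(candidate)
    return found


sample = (
    "See ISBN 978-0-306-40615-7 and ISBN-10: 0-8044-2957-X. "
    "Table total: 9780306406157 0306406152"
)
print(extract_prefixed_isbns(sample))  # ['9780306406157', '080442957X']
```

In this sketch the two numbers after "Table total:" are skipped because nothing marks them as ISBNs, even though the first one would pass a check-digit test.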

We have now reprocessed all mentions for all our existing policy sources to remove the incorrect book mentions that the new rule prevents. As a result of this data cleanup, users monitoring policy documents may notice a drop in the overall number of policy documents we track mentions from (Altmetric products no longer show documents that only had incorrect book mentions), or in the number of mentions from specific policy sources.

  • Before reprocessing: 4,402,575 mentions (from 287,501 individual posts)

  • After reprocessing (2 October 2024): 4,431,584 mentions (from 282,048 individual posts)

Valid ISBNs that appear as free text (i.e. not prefixed with the letters ISBN) will no longer be picked up, but we believe the risk of missing mentions this way to be minimal. We will be happy to investigate potential instances of missed mentions case by case. Please contact the Altmetric Support team with any questions about policy mentions in the Altmetric Explorer.
