DATE:: October 3, 2022
AUTHOR:: The Altmetric team

Explorer For Institutions (EFI)

OAI-PMH support for DSpace, Esploro, DiVA and Haplo

Many of the institutional repositories our customers use provide metadata formats we can support, enabling our customers to track authors, research outputs and departments in the Explorer.

However, some repositories provide this data in a format we are unable to support, or can only partially import, so that customers are unable to display the information in the way that is most useful to them.

So far, customers using a repository our OAI functionality could not fully support have had three alternatives.

- Importing data about their authors and research outputs via an OAI feed, without creating a department hierarchy in the Explorer
- Uploading a CSV file with all their author, output and department data to the Explorer, and updating it manually on a regular basis
- Using the Altmetric Explorer with no data integration at all, and therefore without the ability to track their own authors, outputs and departments

These can involve manual work on the organisations’ side, or prevent customers from tracking data they would find valuable to see in the Explorer.

We have now built additional support for a number of widely used repositories: DSpace, Esploro, DiVA, and Haplo.

We expect this work will make our OAI connector functionality more flexible and robust for more organisations, whether they’re existing Altmetric customers not utilising our integrations to their full potential, or whether they are evaluating opportunities to implement Altmetric in the future.

1. What's new?

The sections below summarise the improvements we are now able to offer for each of these repositories.

1.1. DSpace

Customers can now have their communities and collections hierarchy displayed in the Departments view of the Explorer. They will need to enable the hierarchy API endpoint at their end first.

1.2. Esploro

Customers can have their Department hierarchy displayed in the Explorer. They will need to set up publishing profiles at their end first, and use the oai_dc metadata prefix.

1.3. DiVA

The Explorer integration now supports attributes passed on by DiVA to distinguish research outputs and authors for specific universities (as DiVA is a repository shared by multiple institutions), and affiliation information to replicate department hierarchies from research output data.

Customers using DiVA can choose their preferred combination of configuration options, including:
- Importing departments via a JSON / CSV file, instead of an OAI feed
- Translating department names
- Department remapping (e.g. to consolidate old and new department names in the Explorer)
- Filtering out specific paper types (e.g. student theses)

1.4. Haplo

Customers can configure their Explorer integration so that they only import authors who are currently affiliated with their institutions.

2. Technical details

2.1. DSpace

We have now added API support for Department imports from DSpace to our standard OAI connector.

Customers using DSpace will need to ensure that the api/hierarchy endpoint is enabled, as we will build their hierarchical structure using this data, and not from ListSets. If they are unable to enable this endpoint, we can offer a departmentless import as an alternative.

A key differentiator is that DSpace supports both collections and communities. For the purposes of our EFI integration, both of these entities are treated as departments.
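As a rough illustration of treating communities and collections alike, the sketch below walks a nested hierarchy and emits one department path per entity. The payload shape and field names (`name`, `community`, `collection`) are assumptions made for this example, not a description of our importer or a guaranteed DSpace response format.

```python
from __future__ import annotations

from typing import Iterator

# Hypothetical sample shaped like a nested community/collection tree.
# The field names are assumptions; check your own instance's response.
SAMPLE_HIERARCHY = {
    "name": "Example University",
    "community": [
        {
            "name": "Faculty of Science",
            "community": [
                {"name": "School of Biology", "collection": [{"name": "Theses"}]}
            ],
            "collection": [{"name": "Datasets"}],
        }
    ],
}


def department_paths(node: dict, parents: tuple[str, ...] = ()) -> Iterator[tuple[str, ...]]:
    """Yield one path per community or collection, treating both as departments."""
    for key in ("community", "collection"):
        for child in node.get(key, []):
            path = parents + (child["name"],)
            yield path
            # Collections in this sample have no children, so recursion ends there.
            yield from department_paths(child, path)


paths = list(department_paths(SAMPLE_HIERARCHY))
for path in paths:
    print(" > ".join(path))
```

Each path can then be stored as a department with its parent chain, regardless of whether it came from a community or a collection.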

2.2. Esploro

Esploro requires that a setSpec is always provided when harvesting records from a customer’s repository.

This means that, in order to sync research outputs and replicate their departmental structure within the Explorer, customers will need to create a publishing profile for each department. If they are unable to do this, we can offer a departmentless import as an alternative. Because of the selective harvesting requirement, even a departmentless instance still needs a single publishing profile.
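To make the selective harvesting requirement concrete, here is a minimal sketch of an OAI-PMH ListRecords request that always carries a set parameter. The base URL is hypothetical, and the helper name `list_records_url` is ours for illustration; only the `verb`, `metadataPrefix` and `set` parameters come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Hypothetical base URL; each repository exposes its own OAI-PMH endpoint.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url: str, set_spec: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a selective-harvesting ListRecords request for one set."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    )
    return f"{base_url}?{query}"


url = list_records_url(BASE_URL, "Faculty of Science:School of Biology")
print(url)
```

One request of this kind would be needed per publishing profile, which is why a departmentless import still depends on at least one profile existing.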

The Esploro repository also exposes a number of other profiles via ListSets, for example BrowZine and Unpaywall. To keep these out of the Explorer, customers will need to prefix the setName of the publishing profiles they do want to import (e.g. with Department = or a similar marker that distinguishes them from sets that should not be imported). This can be seen in the example below:

To have a department for the Faculty of Science with a dependent department School of Biology, the customer will need a publishing profile for each. This would result in the ListSets records listed below. Because of the filtering applied using Department = , the BrowZine set would not be synced in this case.

<set>
  <setSpec>Faculty of Science</setSpec>
  <setName>Department = Faculty of Science</setName>
</set>
<set>
  <setSpec>Faculty of Science:School of Biology</setSpec>
  <setName>Department = School of Biology</setName>
</set>
<set>
  <setSpec>BrowZine</setSpec>
  <setName>BrowZine</setName>
</set>

This would then result in the following structure within the Explorer: Faculty of Science as a top-level department, with School of Biology nested beneath it.
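The filtering and nesting described in this section can be sketched in a few lines. This is an illustration under stated assumptions rather than our importer's code: the sets are wrapped in a bare root element without OAI-PMH namespaces, the parent is derived from the colon-separated setSpec, and stripping the prefix from the display name is our assumption.

```python
import xml.etree.ElementTree as ET

# The three sets from the example above, wrapped in a root element so the
# snippet parses on its own. A real ListSets response also carries OAI-PMH
# namespaces, which this sketch leaves out.
LIST_SETS = """
<ListSets>
  <set><setSpec>Faculty of Science</setSpec><setName>Department = Faculty of Science</setName></set>
  <set><setSpec>Faculty of Science:School of Biology</setSpec><setName>Department = School of Biology</setName></set>
  <set><setSpec>BrowZine</setSpec><setName>BrowZine</setName></set>
</ListSets>
"""

PREFIX = "Department = "  # the agreed marker; any distinctive prefix would do


def departments(xml_text: str, prefix: str = PREFIX) -> dict:
    """Return {setSpec: (display name, parent setSpec or None)} for marked sets."""
    result = {}
    for node in ET.fromstring(xml_text.strip()).iter("set"):
        name = node.findtext("setName", default="")
        spec = node.findtext("setSpec", default="")
        if not name.startswith(prefix):
            continue  # skip unmarked sets such as BrowZine or Unpaywall
        # Everything before the last colon in setSpec is taken as the parent.
        parent = spec.rpartition(":")[0] or None
        result[spec] = (name[len(prefix):], parent)
    return result


tree = departments(LIST_SETS)
for spec, (name, parent) in tree.items():
    print(f"{name} (parent: {parent})")
```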

2.3. DiVA

We have added support for the swepub_mods metadata prefix, commonly used by DiVA customers. As DiVA is a shared repository used by around 50 Swedish institutions, swepub_mods is designed to support multiple institutions (as opposed to the standard oai_dc metadata format, which supports one customer per repository). It includes:

- Attributes to distinguish papers and authors for specific universities
- Affiliation information to replicate department hierarchies from research output data

The Explorer now has the capability to integrate with DiVA and include these types of data, so that it’s possible for each individual institution to focus the import on their own author, research output and department information.

We have also added support for a number of features designed to enhance flexibility and automation:

- Importing departments via a JSON / CSV file, instead of an OAI feed, to create department structures that differ from the hierarchies recorded in DiVA (a guide to this feature is available on the Altmetric Support portal).
- Translating department names (e.g. from Swedish to English, to remove language barriers for researchers who do not speak Swedish). This requires a separate CSV file import.
- Department remapping, which lets customers link department names that are no longer in use to the up-to-date names they want to see in the Explorer. Regularly editing department names on their records so that only current departments are passed on to the Explorer would involve a heavy admin workload at their end; this feature automates the process. This requires a separate CSV file import.
- Filtering out paper types the institution does not want to see in the Explorer. A number of institutions told us they need to filter out student theses in particular. Currently only studentThesis can be filtered out, and we can investigate adding more paper types on request.

Customers can choose to implement as many of these features as they need when setting up their integration between DiVA and the Explorer.
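To show how these options combine, the sketch below applies a remapping file, a translation file and a paper-type filter to a couple of records. Everything here is hypothetical: the CSV column names, the department names and the simplified dictionary records (real swepub_mods records are XML) are assumptions for illustration only.

```python
import csv
import io

# Hypothetical CSV contents and column names, for illustration only.
REMAP_CSV = "old_name,current_name\nInstitutionen för biologi,Institutionen för biovetenskap\n"
TRANSLATION_CSV = "swedish,english\nInstitutionen för biovetenskap,Department of Biosciences\n"

# Hypothetical, simplified records; real swepub_mods records are XML.
RECORDS = [
    {"title": "Paper A", "type": "article", "department": "Institutionen för biologi"},
    {"title": "Thesis B", "type": "studentThesis", "department": "Institutionen för biovetenskap"},
]


def load_mapping(csv_text: str, key: str, value: str) -> dict:
    """Read a two-column CSV into a lookup dictionary."""
    return {row[key]: row[value] for row in csv.DictReader(io.StringIO(csv_text))}


def prepare(records, remap, translate, excluded_types=("studentThesis",)):
    """Drop excluded paper types, then remap and translate department names."""
    for record in records:
        if record["type"] in excluded_types:
            continue
        name = remap.get(record["department"], record["department"])
        yield {**record, "department": translate.get(name, name)}


remap = load_mapping(REMAP_CSV, "old_name", "current_name")
translate = load_mapping(TRANSLATION_CSV, "swedish", "english")
result = list(prepare(RECORDS, remap, translate))
print(result)
```

Remapping runs before translation in this sketch so that a single translation entry covers both the old and the current department name.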

2.4. Haplo

We have added support for the oai_datacite metadata prefix, commonly used by Haplo customers.

Compared with the standard oai_dc metadata format, oai_datacite exposes an additional affiliation XML tag. This tag enables customers to configure their Explorer integration so that they only import authors who are currently affiliated with their institutions.

The current version of the Haplo importer only supports integrations that use the oai_datacite metadata prefix. While Haplo supports multiple other metadata formats, these are not currently supported by the Altmetric Explorer.
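The affiliation-based filtering described above can be sketched as follows. The record is a hand-written, hypothetical fragment using DataCite's creator, creatorName and affiliation elements; the kernel-4 namespace, the author names and the exact-match rule are assumptions for this example, and a real oai_datacite response wraps the resource in further OAI-PMH elements.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; a given repository may expose a different kernel version.
NS = {"d": "http://datacite.org/schema/kernel-4"}

# Hypothetical record with made-up authors and institutions.
RECORD = """
<resource xmlns="http://datacite.org/schema/kernel-4">
  <creators>
    <creator>
      <creatorName>Andersson, Eva</creatorName>
      <affiliation>Example University</affiliation>
    </creator>
    <creator>
      <creatorName>Smith, John</creatorName>
      <affiliation>Another Institute</affiliation>
    </creator>
  </creators>
</resource>
"""


def affiliated_authors(xml_text: str, institution: str) -> list:
    """Return names of creators whose affiliation matches the institution exactly."""
    root = ET.fromstring(xml_text.strip())
    names = []
    for creator in root.iterfind(".//d:creator", NS):
        affiliations = {
            a.text.strip() for a in creator.findall("d:affiliation", NS) if a.text
        }
        if institution in affiliations:
            names.append(creator.findtext("d:creatorName", default="", namespaces=NS))
    return names


authors = affiliated_authors(RECORD, "Example University")
print(authors)
```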

3. Implementation process

Details of the available options can be found in the Implementation Guide (password required).

Our external documentation about DiVA features is available on the Altmetric Support portal.

To begin an implementation, customers can contact us at [email protected]. Our Support team will arrange a full data review and confirm we can support their requirements before planning to set up an integration.