GetFTR: dataflows and user privacy

Data privacy is a topic that, rightly so, is top-of-mind for many researchers and organisations that are active in the space of scholarly communications. GetFTR is committed to safeguarding the privacy of its users, and we take this principle to heart as we continue to deliver on our mission to create streamlined access to high-quality publications for global researchers, no matter their location.

In a blog post towards the end of last year, we wrote about how GetFTR preserves user privacy as it expands access options.

In today’s post, two of GetFTR’s leadership team – Dianne Benham and Hylke Koers – discuss the topic further and go into more detail on the inner workings of GetFTR. We’ve decided to talk not only about general principles but also to include a fair bit of technical detail. We believe that it is important to be clear and transparent about what kind of data is needed for the GetFTR service to work, and how that data flows between the central GetFTR system, services that integrate with GetFTR, and publishers.


Recap: What is GetFTR?

GetFTR is a service that streamlines access to trusted, published research from discovery tools and collaboration networks, a mission that we believe is more important and urgent than ever. With GetFTR, users from such services can quickly tell which published content is accessible to them (which could be because it is Open Access, or because their institution has subscribed to the content), and follow links to rapidly access published research on participating publisher or aggregator websites. This reduces unnecessary steps, paywalls and frustration.

For an overview of GetFTR and the problems it solves, we recommend watching the video “An introduction to GetFTR”.

At its core, GetFTR is a service that answers the question “Can someone from this institution access these publications?” In more technical terms, such a question is referred to as an entitlement check, and GetFTR offers the capability to perform entitlement checks for a list of publications (identified by DOIs) and a specific institution. This enables GetFTR integrators, such as discovery tools and collaboration networks, to signal to their users which content they are entitled to access.

GetFTR only requires article identifiers (DOIs) and information about the user’s institutional affiliation for entitlement checks. It does not require or capture any other information about the user.

The illustration below shows how these entitlement requests are handled:

You can see more about how GetFTR works in our animated walkthrough.

Step 1: Integrator Entitlement Requests

The process starts when a researcher begins their discovery journey on a service that integrates with GetFTR, and that service wants to show the user which content they are entitled to access. To achieve this, the integrator service queries the GetFTR system with a combination of DOIs and the user’s affiliation.

The user’s affiliation can be encoded by a number of organisational identifiers which are collected by the integrator and passed to GetFTR (and, subsequently, on to the publisher – see below) for entitlement checking purposes. One such standard identifier is the SAML EntityID corresponding to a user’s institution; this information will have been provided to the integrator directly by the user, for example in their profile (see figure below).

Institution selection – illustration of how a user may enter their affiliation on the website of a service integrating with GetFTR. This enables the integrator to query the GetFTR system to see if the user is entitled, via their institution, to a specific piece of content

Another commonly used identifier is the user’s IP address. Support for IP-based authentication was added to GetFTR after the initial launch upon request from users and in consultation with pilot participants and the GetFTR advisory board.

This means that integrators can also share users’ IP addresses with GetFTR, although this is optional. Integrators that choose to share users’ IP addresses with GetFTR and participating publishers have to notify users via their privacy policy before doing so. GetFTR and publishers who receive IP address information from GetFTR are only allowed to use it for the purpose of checking entitlements.
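To make the shape of an entitlement request concrete, here is a minimal Python sketch of what an integrator might assemble. It is an illustration only, not the GetFTR API: the class name, field names, and the `ip_sharing_disclosed` flag are all assumptions made for this example. The point it shows is that the request carries DOIs and an institutional identifier, and that an IP address is only included when the integrator's privacy policy covers sharing it.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class EntitlementRequest:
    """Hypothetical request shape: DOIs plus affiliation, nothing else."""
    dois: tuple[str, ...]
    entity_id: Optional[str] = None   # SAML EntityID of the user's institution
    ip_address: Optional[str] = None  # optional identifier


def build_request(dois, entity_id=None, ip_address=None,
                  ip_sharing_disclosed=False):
    """Assemble a request; drop the IP unless sharing it was disclosed to users."""
    if not dois:
        raise ValueError("at least one DOI is required")
    if ip_address is not None and not ip_sharing_disclosed:
        ip_address = None  # the privacy policy does not cover sharing it
    if entity_id is None and ip_address is None:
        raise ValueError("an institutional identifier is required")
    return EntitlementRequest(tuple(dois), entity_id, ip_address)


# Example values are placeholders, not real identifiers.
request = build_request(
    ["10.1000/example.1", "10.1000/example.2"],
    entity_id="https://idp.example.edu/shibboleth",
)
payload = asdict(request)  # contains only dois, entity_id and ip_address
```

Keeping the request type this narrow means there is simply no field in which a name, email address, or account identifier could travel.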

Note: Most Discovery Services implement deferred authentication, which means the user authenticates after they have clicked on a GetFTR link to access content on a publisher or aggregator site (unless they are already authenticated). Some Discovery Services choose to handle authentication themselves. Once the user is authenticated, GetFTR links provide one-click access to content across all participating publisher and aggregator platforms.

Step 2: GetFTR Processing

When GetFTR receives an entitlement request, it checks the entitlement status with the relevant publisher in real-time. To enable this, GetFTR uses Crossref metadata to determine which publisher is associated with a particular piece of content (identified by DOI), and then routes the entitlement request to the correct publisher. Only the relevant publisher will receive the entitlement request; this data is not shared with other publishers or organisations.
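The routing step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the lookup table below stands in for Crossref metadata, the prefixes and publisher names are made up, and matching on the DOI prefix is a shortcut chosen for this example rather than a description of how GetFTR resolves publishers.

```python
from collections import defaultdict

# Stand-in for a Crossref metadata lookup (made-up prefixes and publishers).
PUBLISHER_BY_PREFIX = {
    "10.1000": "publisher-a",
    "10.2000": "publisher-b",
}


def route_dois(dois):
    """Group DOIs by publisher so each publisher receives only its own DOIs."""
    routed = defaultdict(list)
    unknown = []
    for doi in dois:
        prefix = doi.split("/", 1)[0]
        publisher = PUBLISHER_BY_PREFIX.get(prefix)
        if publisher is None:
            unknown.append(doi)  # no participating publisher found
        else:
            routed[publisher].append(doi)
    return dict(routed), unknown


routed, unknown = route_dois(["10.1000/a", "10.2000/b", "10.1000/c"])
```

Grouping before dispatch is what keeps a request for one publisher's content from being visible to any other publisher.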

Step 3: Publisher Entitlement Responses

For every document, the publisher returns a corresponding “entitlement resource” to GetFTR, which is a piece of information that establishes the level of entitlement (e.g. yes, no); access type (e.g. open, free, paid); document type (e.g. version of record or alternative version); content type (e.g. html, pdf) and a link to the actual resource if appropriate. GetFTR, in turn, sends this entitlement resource on to the integrator where it can be used to signal to the user if they will have access to the content.
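As a rough picture of what such an entitlement resource could look like in code, the Python sketch below models the fields listed above. The class, enum, and key names, as well as the "yes"/"no" string encoding, are assumptions made for illustration; the actual wire format is not described in this post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessType(Enum):
    OPEN = "open"
    FREE = "free"
    PAID = "paid"


class DocumentType(Enum):
    VERSION_OF_RECORD = "version_of_record"
    ALTERNATIVE_VERSION = "alternative_version"


@dataclass(frozen=True)
class EntitlementResource:
    """Hypothetical model of one publisher response for one DOI."""
    doi: str
    entitled: bool
    access_type: AccessType
    document_type: DocumentType
    content_type: str            # e.g. "html" or "pdf"
    link: Optional[str] = None   # present only if appropriate


def parse_resource(raw: dict) -> EntitlementResource:
    """Convert a raw response dict (assumed key names) into a typed resource."""
    return EntitlementResource(
        doi=raw["doi"],
        entitled=raw["entitled"] == "yes",
        access_type=AccessType(raw["access_type"]),
        document_type=DocumentType(raw["document_type"]),
        content_type=raw["content_type"],
        link=raw.get("link"),
    )
```

Note that nothing in this structure describes the user: it describes the document and the institution's level of access to it.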

Step 4: Integrator Displays Entitlement

Having received the entitlement resource, the integrator now knows which content items the user has access to, and uses that information to display the appropriate GetFTR indicators. In addition, if the user is entitled to access the content, the publisher sends a ‘smart link’ that allows the user to bypass repetitive authentication steps.
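On the integrator side, turning responses into indicators could look something like the Python sketch below. The label texts, dictionary keys, and function name are invented for this example and do not reflect GetFTR's actual indicator wording or any integrator's implementation.

```python
def indicators_for_results(result_dois, resources):
    """Map each DOI on a results page to an indicator label, or None for no indicator."""
    labels = {}
    for doi in result_dois:
        resource = resources.get(doi)
        if resource is None or not resource["entitled"]:
            labels[doi] = None  # no response or not entitled: show nothing
        elif resource["access_type"] == "open":
            labels[doi] = "Open access"
        else:
            labels[doi] = "Access via your institution"
    return labels


# Placeholder data for illustration.
labels = indicators_for_results(
    ["10.1000/a", "10.1000/b", "10.1000/c"],
    {
        "10.1000/a": {"entitled": True, "access_type": "open"},
        "10.1000/b": {"entitled": True, "access_type": "paid"},
    },
)
```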

What does GetFTR not do?

GetFTR does not store information about its users, about which links they click on the integrator’s site, or about which articles they view on a publisher’s platform.

GetFTR also does not control or mediate access in any way. Once a user clicks on a link from an integrator, they are directed to the publisher’s platform where they still need to be authenticated and authorized. For users with SAML-based access (federated access), GetFTR does streamline the user journey by providing ‘smart links’ (which are in essence so-called WAYFless URLs) to the integrator. Such a smart link contains the user’s institutional affiliation, but no other user data, thereby effectively passing that information along from an integrator service to a publisher. The publisher, in turn, can use this information to direct the user to their institution’s login page or directly to the content if they are already authenticated.
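To illustrate what a WAYFless URL carries, here is a small Python sketch. It follows one common convention, used by Shibboleth Service Providers, in which the institution's entityID and a target URL are passed as query parameters to the publisher's login endpoint. Whether GetFTR smart links take exactly this form is an assumption, and the host names and paths are placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_smart_link(login_endpoint: str, entity_id: str, target_url: str) -> str:
    """Build a WAYFless-style link carrying only the institution and the target."""
    query = urlencode({"entityID": entity_id, "target": target_url})
    return f"{login_endpoint}?{query}"


link = build_smart_link(
    "https://publisher.example.org/Shibboleth.sso/Login",   # placeholder endpoint
    "https://idp.example.edu/shibboleth",                   # institution's EntityID
    "https://publisher.example.org/doi/10.1000/example.1",  # content to open
)

# The only parameters in the link are the institution and the destination.
params = parse_qs(urlsplit(link).query)
```

Because the institution is already named in the link, the publisher can skip the "Where Are You From?" institution picker and send the user straight to the right login page.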

As always, we welcome further discussion. If you have thoughts or comments, or would like to know more, please contact Dianne Benham (GetFTR Product) at Dianne@getfulltextresearch.com.