Article 3 – Text and Data Mining

“Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” – UK Government

A good text and data mining exception

Is mandatory

Only if all EU Member States are obliged to implement the same TDM exception can you be sure that the same rules apply when sharing TDM-based knowledge across borders (even via email), collaborating with researchers from other countries, or simply putting content online.

States that, if you have lawful access, the right to read should be the right to mine

The threshold to determine whether mining can be conducted should simply be: does one have lawful access to the content one wants to mine, either because the content was acquired directly or through a licensing agreement, or because it is in the public domain or publicly available. Who reads the content and for what purpose should not be relevant. The right to read should be the right to mine.

Includes safeguards against abusive technical measures

Allowing rightholders to implement vaguely defined technical measures to ensure the security and integrity of their servers and networks creates a major loophole for abuse. Whilst such measures should be allowed, they should be clearly defined and framed in order to be proportionate, efficient, and non-discriminatory (applying in the same manner to the publisher’s mining services as to externally-run algorithms).

Does not allow contractual or technical overrides

There is no point in adopting an exception for TDM if publishers can just override it in their licensing terms or through the use of technical protection measures (also known as DRM for ‘digital rights management’). The adopted legislation must explicitly state that neither contracts nor technical measures can override the exception.

A bad text and data mining exception

Is voluntary

Leaving the implementation of a TDM exception to the goodwill of each EU Member State would result in a patchwork of different rules across the EU. That patchwork would create legal uncertainty when sharing or collaborating across borders and when posting content online.

Is limited in terms of beneficiaries and/or purposes

Limiting the benefit of the exception to research organisations acting for scientific purposes is extremely unsatisfactory. It affects the possibilities for citizen science to take place, as well as for commercial entities to conduct mining, including spin-offs from universities or mining activities conducted in the framework of a public-private partnership. It also creates legal uncertainty with regard to what is deemed a research organisation and what ‘scientific purposes’ entail.

Does not allow content back-up

Requiring that mined content be destroyed immediately after the facts and data have been extracted is counter-intuitive from a research or investigative perspective. Being able to show which sources were used is what builds up the trustworthiness of the knowledge acquired from mining, provides evidence for one’s findings in case of dispute, and allows colleagues to mine the same data in order to see if similar results emerge.

Can be overridden by contracts or technical measures

There is no point in adopting an exception for TDM if publishers can just override it in their licensing terms or through the use of technical protection measures (also known as DRM for ‘digital rights management’). The adopted legislation must explicitly state that neither contracts nor technical measures can override the exception.