by holzijue, Pixabay licence (https://bit.ly/3BarIQN)

Working group for open biodiversity databases

Head: Florian Heigl, University of Natural Resources and Life Sciences, Vienna

During the annual platform meeting of the Citizen Science Network Austria on 31.01.2018, the participating partners decided to establish a working group on open biodiversity databases.
The following points have encouraged us to deal with this topic:

  1. An ethical dilemma arises when data that has been collected together with the population is stored in a locked database.
  2. In the future, certain funding programs will often require that the data collected in the course of the project be made available as open data.
  3. The EU GDPR requires us to reconsider and update our handling of personal data in the near future.
  4. Consideration of the topic on the part of the project operators shows the progressive thinking in the field of citizen science.
  5. If we identify the challenges/problems together, it is easier to find arguments for/against opening the databases in citizen science projects.
  6. Technical developments in the field of infrastructure offer new opportunities to publish research data (e.g. https://www.gbif.org/).

Just to emphasize: the WG should not be a missionary for the opening of biodiversity databases. We want to show objectively which problems/challenges will arise if the databases are opened and which ways there might be to protect vested interests or sensitive data on protected goods and still openly provide data.

The following goals are to be achieved in the working group:

  1. Development of a questionnaire to help assess the feasibility/sensitivity of opening specific citizen science biodiversity databases (tested in existing and theoretical projects).
  2. An implementation report or case study from an Austrian citizen science project that opened its biodiversity database.

If you are interested in this topic and would like to join the working group, then please contact Florian Heigl at any time.

Catalogue of questions for project managers:

The catalogue of questions for project managers version 1.0 can be downloaded for free in German and English.

Report on implementation and experience

In the Roadkill project of the University of Natural Resources and Life Sciences, Vienna, 912 citizen scientists reported 17,163 roadkills from 2014 to 2020. This Austrian citizen science project was selected to open up its biodiversity database and to document the hurdles that had to be overcome.

The first step was to identify which repository, i.e. public database, would be most suitable for the collected roadkill data. We decided to publish the highest quality data on GBIF. GBIF - the Global Biodiversity Information Facility - is an international network and data infrastructure funded by the world's governments that aims to provide open access to data on all species of life on Earth to anyone, anywhere. We wanted to publish the quality level 2 data on Zenodo. Zenodo is an open-discipline repository, based at CERN and funded by the European Commission. 

In contrast to GBIF, Zenodo does not have any specifications regarding the properties or formats of the data. Publishing data via Zenodo is therefore very simple and straightforward. GBIF currently only allows organisations to publish data, and only data that meets the strict biodiversity data standards accepted by GBIF.

The first hurdle was to find an organisation that was willing to publish the data from the Roadkill project. We finally found this in the Biology Centre of the Upper Austrian State Museum, which runs the database ZOBODAT, whose data are also fed into GBIF. Besides the Biology Centre, there are many other organisations in Austria that publish data in GBIF. Another way would be to register your own organisation on GBIF and host the data yourself.

The second hurdle was to bring the collected data into a data standard required by GBIF. This took a lot of time and could be avoided by adopting the appropriate standard for data collection at the start of the project.
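To illustrate what such a conversion can involve, here is a minimal sketch in Python. It assumes that the target standard is Darwin Core, the biodiversity data standard that GBIF accepts; the text above does not name the standard used in the Roadkill project. The project field names on the left side of the mapping (report_id, species, observed_on, lat, lon) and the example report are hypothetical and do not reflect the actual Roadkill data model. Only the Darwin Core term names on the right side are taken from the standard.

```python
import csv
import io

# Hypothetical project field names (left) mapped to Darwin Core terms (right).
FIELD_TO_DWC = {
    "report_id": "occurrenceID",
    "species": "scientificName",
    "observed_on": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}

# Darwin Core terms that get the same value for every record in this sketch
# (assumption: all reports are human observations with WGS84 coordinates).
CONSTANT_TERMS = {
    "basisOfRecord": "HumanObservation",
    "geodeticDatum": "WGS84",
}

DWC_COLUMNS = list(FIELD_TO_DWC.values()) + list(CONSTANT_TERMS.keys())


def to_darwin_core(report: dict) -> dict:
    """Rename project fields to Darwin Core terms and add the constant terms."""
    record = {dwc: report[field] for field, dwc in FIELD_TO_DWC.items()}
    record.update(CONSTANT_TERMS)
    return record


def write_occurrence_csv(reports: list) -> str:
    """Return the converted reports as CSV text with Darwin Core column names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_darwin_core(r) for r in reports)
    return buffer.getvalue()


# Hypothetical example report, for illustration only.
example_reports = [
    {
        "report_id": "rk-0001",
        "species": "Erinaceus europaeus",
        "observed_on": "2019-05-12",
        "lat": "48.2365",
        "lon": "16.3371",
    }
]

occurrence_csv = write_occurrence_csv(example_reports)
print(occurrence_csv)
```

If the project had recorded its data under the Darwin Core term names from the start, the mapping table would be unnecessary, which is the point made above about choosing the standard early.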

Another important step was to describe exactly how the published data was collected and checked for accuracy, so that researchers and other interested parties can understand where the data comes from and use it for their own research or conservation actions. We have published this description in the form of a data paper in the international scientific journal Scientific Data. Such a publication is optional and does not have to be done via a peer-reviewed journal. One can also add such a description in abbreviated form directly in GBIF.

The experiences described above show that publishing biodiversity data from citizen science projects can be challenging if the data were not collected according to the specifications of the respective repository. If possible, the repository in which the data is to be published should be determined at the start of the project in order to simplify the publication process. It remains to be seen which advantages will result from publication. However, we are convinced that publication will lead to the time invested by citizen scientists in data collection being appreciated even more, as the data can now be used not only for the Roadkill project but also for other research projects, thus creating added value.