VidHarm: A dataset for detection of harmful content in video

Johan Edstedt¹, Amanda Berg¹, Michael Felsberg¹, Johan Karlsson², Francisca Benavente², Anette Novak², Gustav Grund Pihlgren³
¹Computer Vision Laboratory, Linköping University
²Statens Medieråd, Stockholm
³EISLAB Machine Learning, Luleå University of Technology
Some examples from the dataset

Introduction

VidHarm is a professionally annotated dataset for the detection of harmful content in video. We have annotated 3589 video clips taken from a variety of film trailers. In contrast to previous approaches, which mostly rely on metadata from long sequences, we use the raw video and focus on short clips. Annotations follow the Swedish system for harmful content classification. For more information on the system used, see here. We provide additional details of the dataset in our paper, which is available on arXiv.

Download Instructions

Instructions for downloading the annotations, clips, full trailers, and a pre-processed version of the dataset are available in our GitHub repository.

BibTeX


@article{edstedt2021harmful,
  title={VidHarm: A Clip Based Dataset for Harmful Content Detection},
  author={Edstedt, Johan and Berg, Amanda and Felsberg, Michael and Karlsson, Johan and Benavente, Francisca and Novak, Anette and Pihlgren, Gustav Grund},
  journal={arXiv preprint arXiv:2106.08323},
  year={2021}
}