VidHarm: A dataset for detection of harmful content in video

Johan Edstedt¹, Amanda Berg¹, Michael Felsberg¹, Johan Karlsson², Francisca Benavente², Anette Novak², Gustav Grund Pihlgren³

¹Computer Vision Laboratory, Linköping University

²Statens Medieråd, Stockholm

³EISLAB Machine Learning, Luleå University of Technology


Some examples from the dataset

Introduction

VidHarm is a professionally annotated dataset for detecting harmful content in video. We have annotated 3589 video clips from a variety of film trailers. In contrast to previous approaches, which mostly use metadata from long sequences, we use the raw video and focus on short clips. The annotations follow the Swedish system for harmful content classification. For more information on the system used, see here. We provide additional details of the dataset in our paper, which is available on arXiv.

Download Instructions

We provide instructions for downloading the annotations, clips, full trailers, and a pre-processed version of the dataset in our GitHub repository.

BibTeX


@article{edstedt2021harmful,
  title={VidHarm: A Clip Based Dataset for Harmful Content Detection},
  author={Edstedt, Johan and Berg, Amanda and Felsberg, Michael and Karlsson, Johan and Benavente, Francisca and Novak, Anette and Pihlgren, Gustav Grund},
  journal={arXiv preprint arXiv:2106.08323},
  year={2021}
}