Abstract: |
Advertisements, especially in online social media, are often based on visual and/or textual persuasive messages, frequently showing women as subjects. Some of these advertisements create a biased portrays of women, finally resulting as sexist and in some cases misogynist. In this paper we give a first insight in the field of automatic detection of sexist multimedia contents, by proposing both a unimodal and a multimodal approach. In the unimodal approach we propose binary classifiers based on different visual features to automatically detect sexist visual content. In the multimodal approach both visual and textual features are considered. We created a manually labeled database of sexist and non sexist advertisements, composed of two main datasets: a first one containing 423 advertisements with images that have been considered sexist (or non sexist) with respect to their visual content, and a second dataset comprising 192 advertisements labeled as sexist and non sexist according to visual and/or textual cues. We adopted the first dataset to train a visual classifier. Finally we proved that a multimodal approach that considers the trained visual classifier and a textual one permits good classification performance on the second dataset, reaching 87% of recall and 75% of accuracy, which are significantly higher than the performance obtained by each of the corresponding unimodal approaches. |