Classifying Community QA Questions That Contain an Image

Kenta Tamaki (Waseda University), Riku Togashi, Sumio Fujita, Sosuke Kato (Waseda University), Hideyuki Maeda, Tetsuya Sakai (Waseda University)

The 10th Forum on Data Engineering and Information Management (16th Annual Meeting of the Database Society of Japan, DEIM 2018), March 2018


Image Processing, Information Retrieval

We consider the problem of automatically assigning a category to a question posted to a Community Question Answering (CQA) site when the question contains not only text but also an image. For example, a CQA user may post a photograph of a dress and ask the community “Is this appropriate for a wedding?”, for which the appropriate category might be “Manners, Ceremonial occasions.” We tackle this problem using Convolutional Neural Networks (CNNs), with Multimodal Compact Bilinear (MCB) pooling to combine the image and text networks. Our experiments with real data from a major CQA site and crowdsourced gold-standard categories show that our method, which combines MCB with a simple sum and element-wise product approach, statistically significantly outperforms a text-only baseline at alpha = 0.10 (p-value: 0.091; effect size: 0.046).
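
As a rough illustration of the MCB pooling step mentioned in the abstract, the following is a minimal NumPy sketch of compact bilinear pooling in its commonly described form: each feature vector is projected with a Count Sketch, and the two sketches are combined by circular convolution computed via FFT. The function names (`make_sketch_params`, `count_sketch`, `mcb_pool`), the vector sizes, and the output dimension are hypothetical choices for illustration. This is not the authors' implementation, and it does not cover how the paper combines MCB with the sum and element-wise product approach.

```python
import numpy as np


def make_sketch_params(input_dim, output_dim, rng):
    """Draw the fixed random hash indices and signs used by one Count Sketch."""
    h = rng.integers(0, output_dim, size=input_dim)        # target bucket per input index
    s = rng.choice(np.array([-1.0, 1.0]), size=input_dim)  # random sign per input index
    return h, s


def count_sketch(x, h, s, output_dim):
    """Project x into output_dim buckets: y[h[i]] += s[i] * x[i]."""
    y = np.zeros(output_dim)
    np.add.at(y, h, s * x)  # unbuffered add, so colliding indices accumulate
    return y


def mcb_pool(image_vec, text_vec, params_img, params_txt, output_dim):
    """Fuse two feature vectors by circular convolution of their Count Sketches.

    The convolution is computed in the frequency domain as an element-wise
    product, which approximates the outer product of the two inputs in a
    compact output_dim-dimensional space.
    """
    sk_img = count_sketch(image_vec, *params_img, output_dim)
    sk_txt = count_sketch(text_vec, *params_txt, output_dim)
    spectrum = np.fft.rfft(sk_img) * np.fft.rfft(sk_txt)
    return np.fft.irfft(spectrum, n=output_dim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    out_dim = 16                       # hypothetical; real systems use far larger values
    img_feat = rng.normal(size=5)      # stand-in for CNN image features
    txt_feat = rng.normal(size=7)      # stand-in for text network features
    p_img = make_sketch_params(img_feat.size, out_dim, rng)
    p_txt = make_sketch_params(txt_feat.size, out_dim, rng)
    fused = mcb_pool(img_feat, txt_feat, p_img, p_txt, out_dim)
    print(fused.shape)
```

In a classifier, the fused vector would typically be passed to further layers that predict the question category. The hash indices and signs are drawn once and kept fixed, so the same projection is applied to every question.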
