SimilCatch : Enhanced social spammers detection on Twitter using Markov Random Fields

Nour El-Mawass; Paul Honeine; Laurent Vercouter

doi:10.1016/j.ipm.2020.102317

Journal article, Information Processing and Management, Year: 2020


Nour El-Mawass

Role: Author

Apprentissage team

Multi-agent, Interaction, Décision team

Paul Honeine

Role: Author
PersonId : 171096
IdHAL : paul-honeine
ORCID : 0000-0002-3042-183X
IdRef : 13564609X

Apprentissage team

Laurent Vercouter

Role: Author
PersonId : 18708
IdHAL : laurent-vercouter
IdRef : 060396024

Multi-agent, Interaction, Décision team

Abstract

Social spam detection has traditionally been modeled as a supervised classification problem. Despite the initial success of this approach, later analysis of the proposed systems and detection features has shown that, as with email spam, the dynamic and adversarial nature of social spam makes the performance achieved by supervised systems hard to maintain. In this paper, we investigate whether the output of previously proposed supervised classification systems can serve as a tool for spammer discovery. The hypothesis is that these systems still detect spammers reliably even when their recall is far from perfect. We therefore propose to use the output of these classifiers as prior beliefs in a probabilistic graphical model framework, which allows beliefs to be propagated to similar social accounts. Because basing similarity on a who-connects-to-whom network has been empirically critiqued in recent literature, we propose an alternative definition based on a bipartite user-content interaction graph. For evaluation, we build a Markov Random Field on a graph of similar users and compute prior beliefs using a selection of state-of-the-art classifiers. We apply Loopy Belief Propagation to obtain posterior predictions on users. The proposed system is evaluated on a recent Twitter dataset that we collected and manually labeled. Classification results show a significant increase in recall while precision is maintained. This validates that formulating the detection problem within an undirected graphical model framework makes it possible to restore the degraded performance of previously proposed statistical classifiers and to effectively mitigate the effect of spam evolution.
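The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the "at least `min_shared` common content items" similarity rule, and the `homophily` edge potential are assumptions chosen for illustration, and the paper's actual similarity definition and potentials may differ. The sketch projects a bipartite user-content graph onto users, treats classifier outputs as node priors of a pairwise binary Markov Random Field, and runs sum-product loopy belief propagation.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def similarity_edges(interactions, min_shared=1):
    """Link users who interacted with at least `min_shared` common items.

    interactions: iterable of (user, content_item) pairs (bipartite graph).
    The threshold rule is an illustrative assumption.
    """
    users_by_item = defaultdict(set)
    for user, item in interactions:
        users_by_item[item].add(user)
    shared = defaultdict(int)
    for users in users_by_item.values():
        for u, v in combinations(sorted(users), 2):
            shared[(u, v)] += 1
    return [pair for pair, count in shared.items() if count >= min_shared]


def loopy_bp(priors, edges, homophily=0.9, n_iters=50, tol=1e-6):
    """Sum-product loopy belief propagation on a pairwise binary MRF.

    priors: dict user -> P(spammer) from a supervised classifier;
            every user appearing in `edges` must have a prior.
    edges:  undirected (u, v) pairs between similar users.
    Returns dict user -> belief that the user is a spammer.
    """
    # Node potentials: index 0 = legitimate, index 1 = spammer.
    phi = {u: np.array([1.0 - p, p]) for u, p in priors.items()}
    # Edge potential favouring equal labels on similar users (assumed value).
    psi = np.array([[homophily, 1.0 - homophily],
                    [1.0 - homophily, homophily]])
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    # messages[(u, v)] is the message from u to v, initialised to uniform.
    messages = {(u, v): np.full(2, 0.5) for u in neighbors for v in neighbors[u]}
    for _ in range(n_iters):
        new_messages = {}
        for (u, v) in messages:
            # Prior of u times all incoming messages except the one from v.
            incoming = phi[u].copy()
            for w in neighbors[u]:
                if w != v:
                    incoming = incoming * messages[(w, u)]
            msg = psi.T @ incoming  # marginalise over the states of u
            new_messages[(u, v)] = msg / msg.sum()
        delta = max(
            (np.abs(new_messages[k] - messages[k]).max() for k in messages),
            default=0.0,
        )
        messages = new_messages
        if delta < tol:
            break
    beliefs = {}
    for u in phi:
        b = phi[u].copy()
        for w in neighbors[u]:
            b = b * messages[(w, u)]
        beliefs[u] = float(b[1] / b.sum())
    return beliefs


# Toy usage: "a" and "b" are flagged by the classifier, "c" and "d" are not.
toy_interactions = [("a", "url1"), ("b", "url1"), ("c", "url1"),
                    ("c", "url2"), ("d", "url2")]
toy_priors = {"a": 0.95, "b": 0.9, "c": 0.5, "d": 0.5}
toy_beliefs = loopy_bp(toy_priors, similarity_edges(toy_interactions))
```

In this toy graph, users with an uninformative prior who share content with confidently flagged accounts end up with a belief above 0.5, which is the mechanism by which propagation can raise recall. Loopy belief propagation is approximate on graphs with cycles and is not guaranteed to converge, hence the iteration cap.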

Keywords

Social spam detection; Online social networks; Twitter; Supervised learning; Markov random field; Cybersecurity

Domains

Machine Learning [stat.ML]; Signal and Image Processing [eess.SP]; Statistics [math.ST]; Neural Networks [cs.NE]; Machine Learning [cs.LG]; Computers and Society [cs.CY]; Computer Vision and Pattern Recognition [cs.CV]; Artificial Intelligence [cs.AI]

Main file

20.mrf.pdf (1.46 MB)

Origin: Files produced by the author(s)

Contributor: Paul Honeine

https://normandie-univ.hal.science/hal-03088293

Submitted on: Saturday, December 26, 2020, 00:04:53

Last modified on: Monday, April 22, 2024, 15:39:57

Long-term archiving on: Monday, March 29, 2021, 16:41:11

Dates and versions

hal-03088293 , version 1 (26-12-2020)

Identifiers

HAL Id : hal-03088293 , version 1
DOI : 10.1016/j.ipm.2020.102317

Cite

Nour El-Mawass, Paul Honeine, Laurent Vercouter. SimilCatch : Enhanced social spammers detection on Twitter using Markov Random Fields. Information Processing and Management, 2020, 57 (6), pp.102317. ⟨10.1016/j.ipm.2020.102317⟩. ⟨hal-03088293⟩


Collections

INSA-ROUEN LITIS COMUE-NORMANDIE TDS-MACS UNIROUEN UNILEHAVRE INSA-GROUPE

