Search results

1 – 1 of 1

Open Access

Article

Publication date: 14 December 2021

Clustering as feature selection method in spam classification: uncovering sick-leave sellers

This paper aims to propose a novel way of using textual clustering as a feature selection method. It is applied to identify the most important keywords in the profile…

HTML

PDF (1.2 MB)

Downloads

776

Abstract

Purpose

This paper aims to propose a novel way of using textual clustering as a feature selection method. It is applied to identify the most important keywords in the profile classification. The method is demonstrated through the problem of sick-leave promoters on Twitter.

Design/methodology/approach

Four machine learning classifiers were used on a total of 35,578 tweets posted on Twitter. The data were manually labeled into two categories: promoter and nonpromoter. Classification performance was compared when the proposed clustering feature selection approach and the standard feature selection were applied.

Findings

Radom forest achieved the highest accuracy of 95.91% higher than similar work compared. Furthermore, using clustering as a feature selection method improved the Sensitivity of the model from 73.83% to 98.79%. Sensitivity (recall) is the most important measure of classifier performance when detecting promoters’ accounts that have spam-like behavior.

Research limitations/implications

The method applied is novel, more testing is needed in other datasets before generalizing its results.

Practical implications

The model applied can be used by Saudi authorities to report on the accounts that sell sick-leaves online.

Originality/value

The research is proposing a new way textual clustering can be used in feature selection.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print

Type: Research Article

DOI:

ISSN: 2634-1964

Keywords

Access

Year

All dates (1)

Content type

Earlycite article (1)

1 – 1 of 1

Search results

Clustering as feature selection method in spam classification: uncovering sick-leave sellers

Abstract

Purpose

Design/methodology/approach

Findings

Research limitations/implications

Practical implications

Originality/value

Details

Keywords

Access

Year

Content type

Something didn’t work…

All feedback is valuable

Platform update page

Questions & More Information

Clustering as feature selection method in spam classification: uncovering sick-leave sellers

Abstract

Purpose

Design/methodology/approach

Findings

Research limitations/implications

Practical implications

Originality/value

Details

Keywords

Access

Year

Content type

We’re listening — tell us what you think

Something didn’t work…

All feedback is valuable

Join us on our journey

Platform update page

Questions & More Information