Spam email detection is an active research area, and the most effective detection methods are based on deep learning. Given the widespread use of pre-trained word vectors in deep neural networks, this paper studies the impact of pre-trained word vector models on a Text-CNN-based spam classification model, and uses token-granularity matching to optimize how the Word2Vec pre-trained word vector model represents spam emails as vectors.
A comparison of the accuracy and time complexity of spam classification with and without token-granularity matching shows that Word2Vec pre-trained word vectors combined with token-granularity processing can improve the performance of the Text-CNN model on spam email classification.
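
The abstract does not spell out how token-granularity matching works, so the following is only a minimal sketch under one plausible reading: when a token from an email is missing from the pre-trained vocabulary, it is retried at progressively finer granularities before being treated as unknown. The function names `match_token` and `embed_token`, the fallback order, the averaging step, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import re


def match_token(token, vocab):
    """Return the vocabulary entries used to represent `token`.

    Assumed fallback order (hypothetical, not from the paper):
    exact form, lowercased form, lowercased form with surrounding
    punctuation stripped, then alphabetic/numeric sub-pieces.
    Returns an empty list if nothing matches.
    """
    lowered = token.lower()
    candidates = [token, lowered, lowered.strip(".,!?:;\"'()[]")]
    for cand in candidates:
        if cand in vocab:
            return [cand]
    # Finest granularity: split into runs of letters or digits.
    pieces = re.findall(r"[a-z]+|[0-9]+", lowered)
    return [p for p in pieces if p in vocab]


def embed_token(token, vectors):
    """Average the vectors of the matched entries; None if unmatched."""
    matched = match_token(token, vectors)
    if not matched:
        return None
    dim = len(vectors[matched[0]])
    return [sum(vectors[m][i] for m in matched) / len(matched)
            for i in range(dim)]


# Toy 2-dimensional stand-ins for pre-trained vectors (invented values).
toy_vectors = {"free": [1.0, 0.0], "money": [0.0, 1.0], "click": [0.5, 0.5]}

print(match_token("FREE!!!", toy_vectors))
print(match_token("free-money", toy_vectors))
print(embed_token("free-money", toy_vectors))
```

In a real pipeline the `vectors` mapping would be replaced by a lookup into the loaded Word2Vec model, and the resulting per-token vectors would form the input matrix of the Text-CNN. Tokens that still return `None` would need a separate out-of-vocabulary policy, such as a zero or randomly initialized vector.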