Abstract:
As the number of social media users rises, the volume of information created and circulated grows daily. People share their ideas and opinions on these platforms, and microblogging sites such as Facebook and Twitter are the favoured medium for debating any important event because information is shared immediately. This immediacy also allows rumours and inaccurate information to spread quickly, making people uneasy. It is therefore essential to evaluate and confirm the veracity of such information. Because of the complexity of the text, automated detection of rumours in their early phases is challenging. This research employs various NLP techniques to extract information from tweets and then applies machine learning models to determine whether that information is a rumour. Classification is performed with three classifiers, SVC (Support Vector Classifier), Gradient Boosting, and Naive Bayes, on five different events from the PHEME dataset. Used on their own, these classifiers have several drawbacks: limited handling of imbalanced data, difficulty capturing complex linguistic patterns, lack of interpretability, difficulty handling large feature spaces, and insensitivity to word order and context. To overcome these drawbacks, a stacking approach is used in which the outputs of the combined classifiers are ensembled with an LSTM (Long Short-Term Memory) network. The performance of the models is analyzed, and the experimental findings show that the ensemble model outperforms the other classifiers, achieving an accuracy of 93.59%.