Abstract:
Non-topical text classification is widely used in modern applications.
One of the challenges in this task is the presence of biases and distribution
shifts in the training text datasets, the most significant of which is the
topical shift. To address this issue, we apply and compare several methods:
Adversarial Domain Adaptation (ADA), Energy-based ADA, BERT with a
contrastive loss function, and ADA with a contrastive loss function.
In this paper, we first modify the contrastive loss function to reduce the
influence of topical shifts and show that the use of adversarial methods
improves the accuracy and robustness of classifiers for the task of
determining the gender of the author of a text. We also apply LLaMA-3B and
show that large language models attain lower accuracy in the few-shot setting
and require more prediction time than pre-trained models based on smaller
architectures.
Key words and phrases: adversarial methods, contrastive loss, gender
classification, text classification, non-topical classification, BERT, domain adaptation.