Abstract:
This paper examines the application of weak supervision to automating the processing of tax claims in the banking sector. Interest in automating financial processes with machine learning and artificial intelligence has grown significantly in recent years, driven by the desire to improve efficiency, accuracy, and customer service. Previous research on financial process automation has often relied on traditional machine learning approaches that require large amounts of well-annotated data. In banking, however, and especially in specialized tasks such as processing claims from tax authorities, data annotation is difficult because it requires highly skilled professionals and raises privacy concerns. Our work aims to fill this research gap by applying weak supervision, a technique that allows models to be trained on imprecisely, inconsistently, or incompletely labeled data. This is particularly relevant for the banking sector, where data become outdated quickly and access to them is often limited by regulatory constraints. To implement weak supervision, we used the Snorkel framework to create a training dataset with labeling functions developed together with Bank Point experts. This allowed us to significantly reduce the dependence on time-consuming manual labeling and to make use of large volumes of unlabeled documents. The results show that weak supervision can significantly improve the efficiency of tax claim processing by producing models that classify and interpret different types of tax documents with high accuracy. In addition, weak supervision accommodates the need to keep up with constantly changing data and legislation, which makes it well suited to the dynamically changing environment of the financial sector.
Using weak supervision to automate responses to tax claims not only improves the quality of data processing but also reduces the workload of specialists, improving the overall efficiency of financial operations. These results may inform future applications of machine learning in the financial sector, where innovative approaches are needed in data-constrained environments.
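
To illustrate the idea of labeling functions mentioned above, the following is a minimal sketch in plain Python. It does not use the Snorkel API and is not the pipeline built in this work: the class names, keywords, and function names are hypothetical, and the votes are combined by a simple majority rule, whereas Snorkel typically fits a label model over the outputs of the labeling functions.

```python
from collections import Counter

# Hypothetical label set; the actual claim categories are not given in the text.
ABSTAIN = -1
ACCOUNT_INFO_REQUEST = 0
DOCUMENT_REQUEST = 1


# Each labeling function encodes one expert heuristic and may abstain.
def lf_account_statement(text):
    return ACCOUNT_INFO_REQUEST if "account statement" in text.lower() else ABSTAIN


def lf_balance(text):
    return ACCOUNT_INFO_REQUEST if "balance" in text.lower() else ABSTAIN


def lf_documents(text):
    lowered = text.lower()
    if "submit documents" in lowered or "provide copies" in lowered:
        return DOCUMENT_REQUEST
    return ABSTAIN


LABELING_FUNCTIONS = [lf_account_statement, lf_balance, lf_documents]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]


def build_training_set(texts):
    """Keep only the documents that received a non-abstain weak label."""
    labeled = []
    for text in texts:
        label = weak_label(text)
        if label != ABSTAIN:
            labeled.append((text, label))
    return labeled
```

Under these assumptions, documents on which the heuristics agree receive a training label without manual annotation, while documents with no vote or conflicting votes are left out of the training set.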