Abstract:
The development and use of workflow-based applications (distributed applied software packages) are among the key challenges in preparing and carrying out large-scale scientific experiments in distributed environments with heterogeneous computing resources. The resources of such an environment can include clusters of personal computers, supercomputers, and private or public cloud platforms, which differ in their computational characteristics. Moreover, the composition and characteristics of the resources change dynamically. Computation planning and resource allocation are therefore important problems in these environments. In this regard, we propose new algorithms for computation planning that take into account redundancy and uncertainty in distributed applied software packages. Unlike other algorithms with a similar purpose, the proposed algorithms use estimates of workflow execution makespan obtained during the continuous integration, delivery, and deployment of applied software. The proposed algorithms construct redundant problem-solving schemes that can be adapted to the dynamic characteristics of computational resources and that improve the reliability of distributed computing. The algorithms are based on a theory of conceptual modeling of computational processes. We demonstrate the construction of problem-solving schemes on model examples. In addition, we show the utility of redundancy for increasing the reliability of distributed computing in comparison with some traditional meta-schedulers.