Abstract:
This paper describes a modified parallel Batcher sort algorithm for big data processing. The main novelty of implemented sort algorithm is to integrate effective parallel batcher sort and Active Storage concept. We use Active Storage based on Lustre File System and TSim C++ template library for parallelization. This paper presents experimental testing results for scientific processing real seismic data. Presented results indicate that described algorithm can reach linear acceleration on sorting big data sets (More then 100 Gb). (in Russian)
Key words and phrases:Parallel sort, Batcher sort, Big data processing, Active Storage, Distributed data processing.