Abstract:
The problem of dispatching to heterogeneous servers, operating independently in parallel, is considered. Each server has a single processor and a dedicated FIFO (first in, first out) queue of infinite capacity. Homogeneous jobs (without preceding constraints) arrive one by one to the dispatcher which immediately makes a routing decision. Both jobs interarrival times and their sizes are assumed to be independent and identically distributed random variables with general distributions. Upon making a decision, full information about the current system's state, including the arriving job size, is available to the dispatcher. The problem is to minimize the long-run system's mean response time. A new sample-path-based policy gradient algorithm is proposed which allows one to construct such a policy. Its main ingredients are the dynamically changing discretization of the continuous state space and individual policy gradient algorithms acting in each cell. Simple numerical examples are given which demonstrate that the new algorithm can outperform best known solutions and is applicable in quite general cases.
Keywords:heterogeneous parallel queues, Markov chains with continuous state space, sojourn time optimization.