A large-scale service system with packing constraints: Minimizing the number of occupied servers
01 June 2013
The model is motivated by the problem of efficient assignment of virtual machines to physical host machines in a network cloud, so that the number of occupied hosts is minimized. There are multiple input ows of different type customers, with a customer mean service time depending on its type. There is infinite number of servers. A server packing configuration is the vector k = fkig, where ki is the number of type i customers the server "contains". Packing constraints must be observed, namely there is a fixed nite set of configurations k that are allowed. Service times of different customers are independent; after a service completion, each customer leaves its server and the system. Each new arriving customer is placed for service immediately; it can be placed into a server already serving other customers (as long as packing constraints are not violated), or into an idle server. It was shown in [12] that a simple real-time algorithm, called Greedy, is asymptotically optimal in the sense of minimizing P k X1+ff k , ff > 0, in stationary regime, as the input ow rates grow to infinity. In particular, when parameter ff is small, Greedy algorithm approximately solves the problem of minimizing the number of occupied hosts P k Xk. In this paper we introduce the algorithm called Greedy with sublinear Safety Stocks (GSS), and show that it asymptotically solves the exact problem of minimizing P k Xk. The GSS algorithm is a simple as Greedy, and it uses no more system state information than Greedy does.