• Adaptive proportional fair parameterization based LTE scheduling using continuous actor-critic reinforcement learning

      Comşa, Ioan-Sorin; Zhang, Sijing; Aydin, Mehmet Emin; Chen, Jianping; Kuonen, Pierre; Wagen, Jean-Frédéric; University of Bedfordshire; University of Applied Sciences of Western Switzerland (Institute of Electrical and Electronics Engineers Inc., 2015-02-12)
      Maintaining a desired trade-off between system throughput maximization and user fairness satisfaction remains a largely unsolved problem. In LTE systems, different trade-off levels can be obtained through a proper parameterization of the Generalized Proportional Fair (GPF) scheduling rule. Our approach finds the parameterization policy that maximizes system throughput under the fairness constraints imposed by the scheduler state. The proposed method adapts and refines the policy at each Transmission Time Interval (TTI) by using a Multi-Layer Perceptron Neural Network (MLPNN) as a non-linear function approximator between the continuous scheduler state and the optimal GPF parameter(s). The MLPNN is trained with Continuous Actor-Critic Learning Automata Reinforcement Learning (CACLA RL). The double GPF parameterization problem is addressed by using CACLA RL with two continuous actions (CACLA-2). Five reinforcement learning algorithms serving as single-parameter adaptation techniques are compared against the proposed method. Simulation results indicate that CACLA-2 performs considerably better than any of the candidates that adjust only one scheduling parameter, such as CACLA-1. In particular, CACLA-2 outperforms CACLA-1 by reducing the percentage of TTIs in which the system is considered unfair. By attenuating fluctuations in the obtained policy, CACLA-2 achieves an enhanced throughput gain when severe changes occur in the scheduling environment, while at the same time maintaining the fairness optimality condition.
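
      To make the doubly parameterized rule concrete, the following is a minimal Python sketch of the standard GPF priority metric that a policy such as CACLA-2 would tune, assuming inst_rate is the instantaneous achievable rate and avg_thr the past averaged throughput of each user; the function names and NumPy implementation are illustrative, not the authors' code.

          import numpy as np

          def gpf_metric(inst_rate, avg_thr, alpha, beta):
              # GPF priority per user: r_k**alpha / R_k**beta. The exponents
              # alpha and beta steer the throughput/fairness trade-off and
              # would be supplied each TTI by the learned policy.
              return inst_rate ** alpha / np.maximum(avg_thr, 1e-9) ** beta

          def schedule_resource_block(inst_rates, avg_thrs, alpha, beta):
              # Assign one resource block to the user with the highest metric.
              return int(np.argmax(gpf_metric(inst_rates, avg_thrs, alpha, beta)))

      Setting alpha = beta = 1 recovers the classic proportional fair rule; adapting both exponents per TTI is what the two continuous actions of CACLA-2 correspond to.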
    • Scheduling policies based on dynamic throughput and fairness tradeoff control in LTE-A networks

      Comşa, Ioan-Sorin; Aydin, Mehmet Emin; Zhang, Sijing; Kuonen, Pierre; Wagen, Jean-Frédéric; Lu, Yao; University of Bedfordshire; University of Applied Sciences of Western Switzerland (IEEE Computer Society, 2014-10-16)
      In LTE-A cellular networks there is a fundamental trade-off between the cell throughput and the fairness levels achieved for preselected users that share the same amount of resources in one transmission time interval (TTI). A static parameterization of the Generalized Proportional Fair (GPF) scheduling rule cannot maintain a satisfactory level of fairness at each TTI when a highly dynamic radio environment is considered. The novelty of this paper lies in finding the optimal policy of GPF parameters that respects the fairness criterion. For sustainability reasons, a multi-layer perceptron neural network (MLPNN) is used to map, at each TTI, the continuous and multidimensional scheduler state into the desired GPF parameter. The MLPNN non-linear function is trained TTI-by-TTI based on the interaction between the LTE scheduler and the proposed intelligent controller. This interaction is modeled using the reinforcement learning (RL) principle, in which the LTE scheduler behavior is modeled as a Markov Decision Process (MDP). The continuous actor-critic learning automata (CACLA) RL algorithm is proposed to select, at each TTI, the continuous and optimal GPF parameter for the given MDP problem. The results indicate that CACLA improves the convergence speed to the optimal fairness condition compared with other existing methods, while at the same time minimizing the number of TTIs in which the scheduler is declared unfair.
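
      As a reference point for the learning rule named above, here is a hedged sketch of one CACLA interaction step, following van Hasselt and Wiering's formulation: the critic always learns from the temporal-difference target, while the actor is pulled toward the executed action only when the TD error is positive. The actor/critic objects and their predict/train interface are assumptions standing in for the paper's MLPNNs.

          import numpy as np

          def select_action(actor, state, sigma=0.1):
              # Gaussian exploration around the actor's continuous GPF parameter.
              return actor.predict(state) + np.random.normal(0.0, sigma)

          def cacla_update(actor, critic, state, action, reward, next_state, gamma=0.95):
              # One-step TD target and error for the critic's value estimate.
              target = reward + gamma * critic.predict(next_state)
              td_error = target - critic.predict(state)
              critic.train(state, target)   # the critic always learns
              if td_error > 0:
                  # CACLA's distinguishing rule: move the actor toward the
                  # executed action only when it outperformed the expectation.
                  actor.train(state, action)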
    • Towards 5G: a reinforcement learning-based scheduling solution for data traffic management

      Comşa, Ioan-Sorin; Zhang, Sijing; Aydin, Mehmet Emin; Kuonen, Pierre; Lu, Yao; Trestian, Ramona; Ghinea, Gheorghiţă; Brunel University; University of Bedfordshire; University of the West of England; et al. (IEEE, 2018-08-06)
      Dominated by delay-sensitive and massive data applications, radio resource management in 5G access networks is expected to satisfy very stringent delay and packet loss requirements. In this context, the packet scheduler plays a central role by allocating user data packets in the frequency domain at each predefined time interval. Standard scheduling rules are known to be limited in satisfying higher Quality of Service (QoS) demands when facing unpredictable network conditions and dynamic traffic circumstances. This paper proposes an innovative scheduling framework able to select different scheduling rules according to the instantaneous scheduler state, in order to minimize packet delays and packet drop rates for applications with strict QoS requirements. To deal with real-time scheduling, Reinforcement Learning (RL) principles are used to map the scheduling rules to scheduler states and to learn when to apply each. Additionally, neural networks are used as function approximators to cope with the complexity of RL and the very large representation of the scheduler state space. Simulation results demonstrate that the proposed framework outperforms conventional scheduling strategies in terms of delay and packet drop rate requirements.
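
      The abstract describes a discrete choice among scheduling rules driven by RL with a neural approximator. A minimal sketch of that pattern is given below, assuming a small candidate rule set and a network q_net approximating Q(state, rule); the rule names, reward shaping, and network interface are illustrative assumptions, not the paper's implementation.

          import numpy as np

          RULES = ["PF", "M-LWDF", "EXP/PF"]   # illustrative candidate rules

          def pick_rule(q_net, state, epsilon=0.1):
              # Epsilon-greedy selection over the discrete set of scheduling rules.
              if np.random.rand() < epsilon:
                  return np.random.randint(len(RULES))
              return int(np.argmax(q_net.predict(state)))

          def q_update(q_net, state, rule_idx, reward, next_state, gamma=0.95):
              # One-step Q-learning target; the reward would penalize packet
              # delay violations and drops, in line with the stated objective.
              q_values = q_net.predict(state)
              q_values[rule_idx] = reward + gamma * np.max(q_net.predict(next_state))
              q_net.train(state, q_values)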