Self-Learning Visual Servoing of a Robot Manipulator Using Explanation-Based Fuzzy Neural Networks and Q-Learning
This thesis shows that adding Explanation-Based Fuzzy Neural Networks (EBFNN) to Q-learning improves the learning process of a self-learning visual servoing system for a robot manipulator. Two new self-learning visual servoing systems for robot manipulators are proposed, based on the following methodologies:
• Self-learning visual servoing of a robot manipulator using a Q-learning algorithm and fuzzy neural networks.
• Self-learning visual servoing of a robot manipulator using EBFNN and a Q-learning algorithm.
Neither methodology requires robot or camera models, or calibration. Both systems apply Q-learning, a reinforcement learning method, to find the optimal policy. The robot uses this policy to reach a predetermined object that has been randomly placed in the environment.
In the first system, the Q-learning algorithm is implemented using fuzzy neural networks that estimate the Q-evaluation function for each robot action. The system learns the optimal policy, which selects at each time step the basic action that maximizes the cumulative reward. Simulation results demonstrate that the system effectively learns the highly non-linear mapping between the continuous workspace and the optimal action policy.
In the second system, an analytical learning component is added to the inductive learning. This system has two main properties, on-line training and lifelong learning, which are implemented by the Q-learning algorithm and the EBFNN, respectively. It is demonstrated that the number of training samples, and therefore the training time needed to reach a given robot positioning accuracy, can be reduced by combining the EBFNN with the Q-learning algorithm. Background knowledge about the robot and its environment is transferred to the robot agent during the learning process through a set of previously trained neural networks.
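The Q-learning step that both systems rely on can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis estimates the Q-evaluation function with fuzzy neural networks over a continuous workspace, whereas this example swaps in a plain lookup table over a hypothetical one-dimensional world with two basic actions (step left, step right). The function names, the reward of 1 on reaching the target, and all parameter values are assumptions chosen for illustration.

```python
import random


def q_learning_1d(n_states=7, goal=3, episodes=500, alpha=0.5,
                  gamma=0.9, epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning on a toy 1-D world (hypothetical stand-in
    for the continuous visual servoing workspace)."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0: left, action 1: right
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = rng.randrange(n_states)       # random start, like a randomly placed object
        for _ in range(max_steps):
            if s == goal:
                break
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s_next = min(max(s + moves[a], 0), n_states - 1)
            reached = s_next == goal
            r = 1.0 if reached else 0.0   # assumed reward: 1 on reaching the target
            # Q-learning target: r + gamma * max_a' Q(s', a'); no bootstrap at the goal.
            target = r if reached else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [0 if row[0] > row[1] else 1 for row in q]


if __name__ == "__main__":
    q_table = q_learning_1d()
    print(greedy_policy(q_table))
```

In this toy setting the greedy policy moves right from states to the left of the target and left from states to its right. In the systems described above, a function approximator would take the place of the table so that the policy generalizes over continuous image features.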
The on-line learning and real-time performance of the two systems are compared, and simulation results show that the EBFNN improves the learning process and performance of the self-learning visual servoing system. The t-test and the Wilcoxon-Mann-Whitney U test are used to assess the statistical significance of the results.
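As a rough illustration of the two test statistics named above, the following sketch computes a t statistic and a Mann-Whitney U statistic for two independent samples using only the standard library. It rests on assumptions: the abstract does not say which t-test variant was used, so Welch's unequal-variance form is chosen here, and the sample values are placeholders, not results from the thesis. Turning either statistic into a p-value needs the corresponding reference distribution, which is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def mann_whitney_u(a, b):
    """U statistic for sample a: number of pairs (x, y) with x > y, ties counted as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


if __name__ == "__main__":
    # Placeholder samples, e.g. training steps needed by each system (not thesis data).
    system_with_ebfnn = [1.0, 2.0, 3.0]
    system_without_ebfnn = [4.0, 5.0, 6.0]
    print(welch_t(system_with_ebfnn, system_without_ebfnn))
    print(mann_whitney_u(system_with_ebfnn, system_without_ebfnn))
```

The two U statistics obtained by swapping the samples always sum to the product of the sample sizes, which is a quick sanity check on the pair counting.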