Artificial Neural Networks for Quadrotor Control

In the summer of 2016 I had the privilege of working in the Dynamic Systems Lab at the University of Toronto Institute for Aerospace Studies. I worked in a project group of four undergraduate students, supervised by two graduate students and Professor Angela Schoellig.
Throughout the summer we worked on finding possible applications of deep learning. The work spanned a series of projects, which by the end of the 16 weeks resulted in a paper submitted to the IEEE International Conference on Robotics and Automation. (Updated 2017-01-22: The paper was accepted.)
This post introduces the background, methodology, procedure, and experimental results of the project.

Background

With the development of robotics technology, unmanned vehicles can perform tasks in increasingly complex systems. One of the most important tasks is trajectory following (path tracking). Currently, such tasks are handled by controllers, whose working diagram can be illustrated as follows.
Controller Diagram
A traditional controller usually takes in the difference between the target state and the current state, and outputs an action to the environment. As an analogy, a controller is a driver who reacts to the environment.
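
For concreteness, here is a minimal sketch of a proportional-derivative (PD) law of the kind the diagram describes; the state layout and the gains are illustrative assumptions, not taken from the original project:

```python
import numpy as np

def pd_controller(target, state, kp=2.0, kd=0.5):
    """Classical PD law: react to the difference between target and current state.

    `state` is assumed to be [x, y, vx, vy]; the output is a commanded
    acceleration. The gains kp and kd are illustrative.
    """
    pos, vel = state[:2], state[2:]
    error = np.asarray(target) - pos       # difference of target and current state
    return kp * error - kd * vel           # action sent to the environment
```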

However, traditional controllers have inherent limitations, including:

  • Limited precision.
  • Steady-state error.
  • Limited predictability.

Other researchers have been using machine learning to come up with better controllers.

  • Niklas et al. use reinforcement learning to learn a closed-loop, end-to-end controller from pixel information (pixels to torques).
    Pixel-to-torque Diagram
  • Ali et al. use a ReLU feedforward neural network to learn a controller model that outperforms traditional models.
    Neural Network Controller Diagram
  • Mahler et al. use Gaussian process regression to manipulate the controller's input data on cable-driven surgical robots.
    GPR Diagram

Artificial Neural Network
Of all the machine learning techniques, the artificial neural network is of particular interest to us. It is a family of powerful learning algorithms that loosely models the layered structure of neurons in the human brain. A vectorized input, $\vec{x}$, is passed into the neural network, on which a set of operations is performed layer by layer:

$$\vec{x}_{i+1} = f(W_i \vec{x}_i + \vec{b}_i)$$
where $i$ represents the current layer number, and $f$ is the activation function (e.g., ReLU, softmax). During training, we adjust the $W$ and $\vec{b}$ values to minimize a loss function $J$, so that:

$$W^*, \vec{b}^* = \arg\min_{W,\, \vec{b}} J(W, \vec{b})$$
Simple as it looks, an artificial neural network can model complex mappings and find abstract connections between data sets. For example, in our case it can learn how to tweak the reference signal to compensate for the controller's internal error. As another example, in natural language processing, neural networks can be used to map words to vectors, where semantic similarities can be efficiently estimated by the cosine distances between the vectors.
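
As a minimal NumPy sketch of the layer operation above (the layer sizes and initialization here are illustrative, not the architecture we actually used):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    """Compute x_{i+1} = f(W_i x_i + b_i) layer by layer."""
    *hidden, last = list(zip(weights, biases))
    for W, b in hidden:
        x = relu(W @ x + b)                 # ReLU activation on hidden layers
    W, b = last
    return W @ x + b                        # linear output layer for regression

# Illustrative network: 4 inputs -> two hidden layers of 32 units -> 2 outputs.
rng = np.random.default_rng(0)
sizes = [4, 32, 32, 2]
weights = [rng.normal(scale=0.1, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
print(forward(rng.normal(size=4), weights, biases))
```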

Methodology

We propose a different method: a fully connected, ReLU-activated neural network manipulates the reference trajectories passed into the classical controller, which compensates for the error while maintaining the generalizability of the classical controller. We then applied the combined controller to ground and quadrotor environments respectively and tested its accuracy.
Our method's diagram
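
As a sketch of how the pieces fit together (the `network`, `controller`, and `env` objects here are hypothetical stand-ins, not our actual implementation):

```python
def track(reference, network, controller, env):
    """Track a reference trajectory with a learned reference modification.

    The neural network only adjusts the reference fed to the classical
    controller; the controller itself is left unchanged.
    """
    state = env.reset()
    actual = []
    for desired in reference:
        adjusted = network(desired)            # learned reference correction
        action = controller(adjusted, state)   # unchanged classical feedback law
        state = env.step(action)
        actual.append(state)
    return actual
```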

Procedure

We first surveyed deep learning algorithms and their applications to transfer learning. Among the available toolboxes, TensorFlow stood out and became our choice. Among the possible transfer learning algorithms, we chose to propose a new method for controlling unmanned vehicles to precisely follow arbitrary trajectories in a given complex system, and to demonstrate generalizability (the experience learned in a system can be generalized to all possible tasks) rather than transferability (the experience learned in a system can be transferred to a different system).
We then narrowed down to using feed-forward neural networks to perform supervised learning on the reference trajectories of unmanned vehicles, the research group's platform of interest.
We experimented with various neural network architectures in different environments. I implemented a 2D point environment with acceleration as its control input, and applied neural network control to a 3D ROS quadrotor simulator environment.
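
A minimal version of such a 2D point environment, assuming double-integrator dynamics and a forward-Euler step (these details are simplifying assumptions, not the exact implementation):

```python
import numpy as np

class Point2D:
    """Minimal 2D point-mass environment whose action is an acceleration."""

    def __init__(self, dt=0.01):
        self.dt = dt
        self.pos = np.zeros(2)
        self.vel = np.zeros(2)

    def reset(self):
        self.pos = np.zeros(2)
        self.vel = np.zeros(2)
        return np.concatenate([self.pos, self.vel])

    def step(self, accel):
        # Forward-Euler double-integrator update.
        self.vel = self.vel + np.asarray(accel) * self.dt
        self.pos = self.pos + self.vel * self.dt
        return np.concatenate([self.pos, self.vel])
```
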
We also experimented with data generation methods for training the neural network. I experimented with using Iterative Learning Control to generate desired-versus-actual trajectory sets, with the desired trajectories themselves generated by parameterizing arcs on circles in the 2D point environment.
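
For instance, desired trajectories parameterized as arcs on circles could be generated along these lines (the radius and speed ranges are illustrative assumptions):

```python
import numpy as np

def arc_trajectory(radius, angular_speed, duration, dt=0.01, phase=0.0):
    """Sample an arc on a circle of the given radius as a sequence of 2D waypoints."""
    t = np.arange(0.0, duration, dt)
    theta = phase + angular_speed * t
    return radius * np.stack([np.cos(theta), np.sin(theta)], axis=1)

# A batch of desired trajectories with randomized parameters.
rng = np.random.default_rng(42)
desired = [arc_trajectory(radius=rng.uniform(0.5, 2.0),
                          angular_speed=rng.uniform(0.2, 1.0),
                          duration=5.0,
                          phase=rng.uniform(0.0, 2 * np.pi))
           for _ in range(100)]
```
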
Later we experimented with methods to improve precision and demonstrate generalizability. More specifically, we gathered hand-drawn trajectories from a web application, preprocessed them, fed them into the neural network, and recorded the improvements in tracking precision. After each quadrotor flight, the results were automatically passed back to the web app. I implemented the web application (DSLPath) with a LAMP architecture.
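
Hand-drawn strokes arrive with uneven point spacing, so a typical preprocessing step is resampling to uniform arc length before feeding the trajectory to the network; the sketch below is one plausible way to do it, not necessarily DSLPath's exact pipeline:

```python
import numpy as np

def resample_stroke(points, n):
    """Resample a hand-drawn 2D stroke to n points uniformly spaced in arc length."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative arc length
    target = np.linspace(0.0, s[-1], n)
    x = np.interp(target, s, points[:, 0])
    y = np.interp(target, s, points[:, 1])
    return np.stack([x, y], axis=1)
```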

Results

Currently, the feedforward neural network yields up to an 80% improvement in the simulator environment and 40% in the quadrotor environment on arbitrary testing trajectories.
Two examples of results

Acknowledgements

Thanks to ESROP for its support. It was an unforgettable experience to contribute to a real research project in a lab environment with a group of excellent undergraduate students (Qiyang Li, Jingxing Qian, Xuchan Bao) and graduate students (Chris, Mohamed, etc.), under the supervision of Professor Schoellig. This project could never have been done without everyone's collaborative work.
A long-exposure picture showing quadrotor path following. Photo credit to Jingxing.

Gains from Experience

  • Basic knowledge of artificial neural networks and machine learning toolboxes (TensorFlow)
  • Experience implementing a collaborative software project using the Robot Operating System (ROS)
  • Experience proposing new solutions to problems