Estimating Unmodeled Physical Phenomena by Combining Deep Learning and Physics Based Models with Application on an Acrobot


Student number: 01502133

Supervisor: Prof. dr. ir. Guillaume Crevecoeur

Counsellors: Ir. Tom Staessens, Ir. Wannes De Groote

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Electromechanical Engineering

Academic year 2019-2020


Permission of Usage

The author gives permission to make this master’s dissertation available for consultation and to copy parts of this master’s dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master’s dissertation.


Preface

I would like to thank my supervisors, ir. Tom Staessens, ir. Wannes De Groote and prof. dr. ir. Guillaume Crevecoeur. Without their good advice and enthusiasm this dissertation would not have been possible. They taught me many valuable skills.

Furthermore I wish to thank Tony Boone. Without his excellent technical skills the acrobot would have remained a technical drawing.

My friends certainly deserve to be mentioned as well. Having a laugh with them, be it in real life or online, helped me to keep up the spirit to finish the dissertation.

Last but not least, special thanks to my family. They encouraged me in the moments I needed it the most. My parents gave me this beautiful opportunity to study, for which I am very grateful. I would like to thank my sister and brother-in-law for proofreading this dissertation.


Estimating Unmodeled Physical Phenomena

by Combining Deep Learning and Physics Based

Models with Application on an Acrobot

by Simon De Meester

Master’s dissertation submitted in order to obtain the academic degree of Master of Science in Electromechanical Engineering

Supervisor: Prof. dr. ir. Guillaume Crevecoeur

Counsellors: Ir. Tom Staessens, Ir. Wannes De Groote

Faculty of Engineering and Architecture

Ghent University

Abstract:

For many mechatronic systems the quality of control depends directly on the quality of the system’s model, especially for highly dynamic systems. For advanced techniques such as model predictive control (MPC), a good model is essential. Furthermore, there are often physical phenomena at play that remain unmodeled because they are not well understood or not known. Models derived from physics first principles do not cope well with these kinds of phenomena. Purely data driven techniques, on the other hand, have proven their ability to unravel complex systems in the past few years, due to the rise in computing power. However, they sometimes lack interpretability and generalisation capability, which is not ideal for developing a robust model of the system. This master’s dissertation investigates whether combining deep learning and physics based modeling can help to provide an accurate model for a mechatronic system that contains unmodeled physical phenomena. The subject of study is the acrobot: an underactuated double pendulum. Like most mechatronic systems with rotating joints, the acrobot suffers from a very important unmodeled physical phenomenon: friction. The first chapter contains an introduction to hybrid models. Chapter 2 is about the theory behind neural networks. In Chapter 3 a physics based acrobot model is developed, along with an overview of the existing friction models. Chapter 4 is devoted to the actual construction of the acrobot. In Chapter 5 a hybrid model that combines deep learning and physics based modeling is proposed. Finally, the results and conclusions are presented in Chapter 6.


Estimating Unmodeled Physical Phenomena by

Combining Deep Learning and Physics Based

Models with Application on an Acrobot

Simon De Meester

Supervisors: prof. dr. ir. Guillaume Crevecoeur, ir. Tom Staessens and ir. Wannes De Groote

Abstract- This article presents two methods to identify the friction process in the joints of an acrobot, a double pendulum where only the second joint is actuated. Both are hybrid methods: they combine physics based state space equations and neural networks (NNs) to estimate the friction. In the first approach friction is modeled as being statically dependent on the velocity of the joints. With the second method it is investigated whether a dynamic model is able to further improve performance. The models are validated on an experimental setup, where a vision based sensing method is used.

Keywords- Nonlinear system identification, friction, hybrid models, neural networks.

I. INTRODUCTION

Most mechatronic systems need an accurate model in order to be well understood and controlled, especially in high-tech applications such as robotics. For years the only modeling approach available was to derive the system’s equations from physics first principles. The disadvantage of this method is that such models do not cope well with unmodeled or not fully understood phenomena. An example of this is friction, a phenomenon that appears in all moving mechanisms. There exists a large variety of complex friction models [1], but people tend to use (over)simplified models as theoretical models are typically not very general. In the last few years breakthroughs in computing power boosted usage of data driven techniques in system identification. Techniques such as Gaussian processes, Hammerstein-Wiener modeling and NARMAX (nonlinear autoregressive moving average with exogenous input) have proven their worth. Another data driven technique that has gained a lot of attention is NN modeling, as it has shown a lot of potential in different domains. However, the method is not ideal to fully describe a dynamical mechatronic system: NNs do not generalize well for data that was not seen during training. In this article purely data driven and physics based methods are combined to overcome their specific disadvantages. A hybrid model that is able to deal with the friction in the acrobot joints is developed.

II. ACROBOT SYSTEM

A symbolic representation of the acrobot can be seen in Fig. 1; the angle conventions indicated there are followed throughout. Underactuation means that there are fewer actuators than degrees of freedom. In the acrobot case only the second joint is actuated.

Fig. 1: Angle convention acrobot.

The physics based part of the model is derived by evaluating the Euler-Lagrange equation.

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = Q \qquad (1)$$

L is the Lagrangian of the system, defined as the difference between kinetic (T) and potential (V) energy, L = T − V. In the case of the acrobot, the generalized coordinates (q_i) are the angles θ1 and θ2 as defined in Fig. 1. Q represents the generalized non-conservative forces acting on the system. These forces comprise the motor torque and the friction torques acting on both joints. The different parts of the real acrobot setup are simplified as elementary geometric shapes. In this way a state space function for the acrobot system is defined.
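As a minimal illustration of how (1) is evaluated, and explicitly not the acrobot model itself, consider a single pendulum. The point mass m, the length l, the gravitational acceleration g and the angle θ measured from the downward vertical are assumptions made for this example only.

```latex
T = \tfrac{1}{2} m l^2 \dot{\theta}^2, \qquad
V = -m g l \cos\theta, \qquad
L = T - V = \tfrac{1}{2} m l^2 \dot{\theta}^2 + m g l \cos\theta

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right)
  - \frac{\partial L}{\partial \theta}
  = m l^2 \ddot{\theta} + m g l \sin\theta = Q
```

For the acrobot the same procedure is applied to both generalized coordinates, which yields two coupled equations instead of one.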

III. ACROBOT LAB SETUP

The finished experimental setup is depicted in Fig. 2. The links are fabricated in aluminum. A lightweight brushless DC (BLDC) motor was chosen as actuator, as it allows torque control. The motor is located on a rotating joint and thus a slip ring system is needed to prevent the electrical wires from getting tangled. Instead of encoders to measure angles and angular velocity, a vision based system was chosen. A camera measures both angles at a sample frequency of 60 Hz. The vision algorithm calculates the angles based on the location of two colored dots that are fixed onto the acrobot. Afterwards angular velocities are obtained by discrete differentiation of the angles. This sensing method proved able to track the acrobot angles accurately. The setup has been designed with modularity and simplicity in mind, to support further research. The links are interchangeable and a system to vary the inertia of the second link is provided.
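The discrete differentiation step can be sketched as follows. This is a minimal sketch, assuming a backward difference and angles that wrap at ±π; the function names are hypothetical and the differentiation scheme actually used in the dissertation may differ.

```python
import math

SAMPLE_RATE_HZ = 60.0  # camera sample frequency stated in the text


def unwrap(angles):
    """Remove 2*pi jumps so consecutive angles differ by less than pi."""
    if not angles:
        return []
    out = [angles[0]]
    for a in angles[1:]:
        delta = a - out[-1]
        # Shift delta into [-pi, pi) to undo wrap-around.
        delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
        out.append(out[-1] + delta)
    return out


def angular_velocity(angles, sample_rate_hz=SAMPLE_RATE_HZ):
    """Backward-difference estimate of angular velocity [rad/s]."""
    dt = 1.0 / sample_rate_hz
    unwrapped = unwrap(angles)
    return [(b - a) / dt for a, b in zip(unwrapped, unwrapped[1:])]
```

Differentiation amplifies measurement noise, so in practice the angle signal is often filtered before this step.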


Fig. 2: The finished acrobot lab setup.

IV. HYBRID FRICTION MODELS

By evaluating (1), a discretized state space function which corresponds to the forward Euler integration scheme can be found.

$$\begin{bmatrix} \theta_{1,k+1} \\ \theta_{2,k+1} \end{bmatrix} = \begin{bmatrix} \theta_{1,k} \\ \theta_{2,k} \end{bmatrix} + \begin{bmatrix} \dot{\theta}_{1,k} \\ \dot{\theta}_{2,k} \end{bmatrix} \Delta t \qquad (2)$$

$$\begin{bmatrix} \dot{\theta}_{1,k+1} \\ \dot{\theta}_{2,k+1} \end{bmatrix} = \begin{bmatrix} \dot{\theta}_{1,k} \\ \dot{\theta}_{2,k} \end{bmatrix} + A^{-1} \begin{bmatrix} f_1(\theta_{1,k}, \theta_{2,k}, \dot{\theta}_{1,k}, \dot{\theta}_{2,k}) - T_{f1} \\ f_2(\theta_{1,k}, \theta_{2,k}, \dot{\theta}_{1,k}, \dot{\theta}_{2,k}) - T_{f2} + T \end{bmatrix} \Delta t \qquad (3)$$

Here the subscript $k$ is used as discrete time index and $\Delta t$ represents the time between two samples (1/60th of a second). $A$ is a 2×2 matrix found by evaluating the Lagrangian equation. $T$ is the motor torque, while $T_{f1}$ and $T_{f2}$ stand for the friction torques in the joints. In [2] a method is proposed to incorporate NNs into a state space function to estimate an unmodeled physical phenomenon in a mechatronic system. Here a similar method is proposed to estimate the friction torques by a NN. As experiment, 25 acrobot trajectories are measured. The input torque signals cover a wide range of different wave forms to reach as many acrobot states as possible. In total there are 8575 samples. The NNs are implemented in Python using the Keras functional API [3] with TensorFlow backend [4]. In this way mathematical formulas do not need to be hard coded. TensorFlow uses automatic differentiation to calculate the gradients analytically.
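One step of the state update in (2) and (3) can be sketched in plain Python as follows. The callables `mass_matrix` and `f` are placeholders for the matrix A and the functions f1 and f2, whose expressions are not given in this article; their signatures, and the assumption that A is evaluated from the current angles, are choices made for this sketch only.

```python
DT = 1.0 / 60.0  # time between two samples, as stated in the text


def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a (nested lists) and 2-vector b."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ValueError("matrix A is singular")
    x0 = (a[1][1] * b[0] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def euler_step(theta, omega, torque, friction, mass_matrix, f, dt=DT):
    """One forward Euler step of equations (2) and (3).

    theta, omega : [theta1, theta2] and [omega1, omega2] at step k
    torque       : motor torque T acting on the second joint
    friction     : [Tf1, Tf2], friction torque estimates
    mass_matrix  : callable theta -> 2x2 matrix A (placeholder)
    f            : callable (theta, omega) -> [f1, f2] (placeholder)
    """
    f1, f2 = f(theta, omega)
    rhs = [f1 - friction[0], f2 - friction[1] + torque]
    accel = solve_2x2(mass_matrix(theta), rhs)
    # Equation (2): the new angles use only the velocities at step k.
    theta_next = [theta[i] + omega[i] * dt for i in range(2)]
    # Equation (3): the new velocities depend on the friction estimates.
    omega_next = [omega[i] + accel[i] * dt for i in range(2)]
    return theta_next, omega_next
```

Solving the 2×2 system directly avoids forming the inverse of A explicitly, which gives the same result as (3) for a non-singular A.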

A. Static Friction Model

In a static friction model, the friction torque is assumed to be only dependent on the joint velocities [1]. The hybrid static model is symbolically shown in Fig. 3.

Fig. 3: Static hybrid model.

Two parallel, feedforward NNs independently estimate the two friction torques based on the angular velocities at t = k. Those estimates are then fed into the state space equation (3) to predict the angular velocities at t = k+1. The reason why only the velocities and not the angles are predicted is explained by (2): the predicted angles do not depend on the NN predictions. The estimates made by the model are indicated with the letter η. It is possible to incorporate the state space equation as a Lambda layer in Keras, as the equations are differentiable. The training process of a NN is based on backpropagation and thus a loss function has to be provided.

$$E = \frac{1}{N} \sum_{j=1}^{N} \lVert e_j \rVert_2 = \frac{1}{N} \sum_{j=1}^{N} \sqrt{(\dot{\theta}_{1j} - \dot{\eta}_{1j})^2 + (\dot{\theta}_{2j} - \dot{\eta}_{2j})^2} \qquad (4)$$

The loss [rad/s] is defined as the mean of the 2-norm of the error vector over all samples. The details on how the NN architecture and hyperparameters were chosen are not discussed here but can be consulted in the full dissertation. To test the model, 10-fold cross validation is used. When the training process is finished, the weights of the two NNs are frozen. The NN part of the model is then separated from the state space part and given a range of input velocities. In this way it is possible to visualize the learned friction curves (Fig. 4 and Fig. 5).
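The loss in (4) can be written out in a few lines of plain Python. This is a sketch for checking values by hand; the function name is hypothetical, and in the actual model the loss has to be expressed with differentiable Keras/TensorFlow operations so that it can be used for backpropagation.

```python
import math


def one_step_loss(measured, predicted):
    """Mean 2-norm of the velocity error vectors in rad/s, as in (4).

    measured, predicted : sequences of (omega1, omega2) pairs, one per sample
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for (w1, w2), (p1, p2) in zip(measured, predicted):
        total += math.hypot(w1 - p1, w2 - p2)
    return total / len(measured)
```

Note that this is the mean of the per-sample Euclidean error, not a root-mean-square error: the square root is taken per sample before averaging.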

Fig. 4: Learned friction curves link 1.

Fig. 5: Learned friction curves link 2.

In each fold (indicated in red) a slightly different friction law is learned. There is quite a lot of variance in the curves because of the limited dataset. The mean curve is indicated in black.


The curves of the second joint show some typical static friction features. For example, there is a steep slope in the curve near the origin which corresponds to the breakaway torque. For higher velocities the friction torque increases slowly, which can be seen as a viscous effect. At the highest speeds the curve becomes nearly horizontal. This behavior does not correspond with the physics of friction: one would expect the friction to increase linearly. A possible explanation is the shortage of high velocity training samples. In most of the folds some Stribeck effect was detected. The Stribeck effect is the phenomenon where the friction first decreases in the low velocity region and then increases again. Notice that the Stribeck ’dip’ is more pronounced for negative velocities.
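For comparison with the learned curves, the features named above (breakaway torque, Stribeck dip, viscous rise) appear together in the commonly used analytical Stribeck model. The sketch below uses a Gaussian Stribeck term; all parameter values are hypothetical placeholders and are not identified from the acrobot.

```python
import math


def stribeck_torque(omega, t_coulomb=0.05, t_breakaway=0.08,
                    omega_stribeck=0.5, k_viscous=0.01):
    """Friction torque [Nm] as a static function of joint velocity [rad/s]."""
    if omega == 0.0:
        return 0.0  # static model: the sticking regime is not resolved here
    sign = math.copysign(1.0, omega)
    # Extra torque near standstill that decays with speed (the Stribeck dip).
    stribeck = (t_breakaway - t_coulomb) * math.exp(-(omega / omega_stribeck) ** 2)
    # Coulomb + Stribeck terms oppose the motion; the viscous term grows linearly.
    return sign * (t_coulomb + stribeck) + k_viscous * omega
```

With these placeholder values the torque is close to the breakaway level just above zero velocity, drops towards the Coulomb level and then rises linearly, which is the linear high-speed behavior the text expects but the learned curves do not show.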

The typical steep slope around the origin is less pronounced in the curve of the first link. However, there are clear signs of a combination of Coulomb and viscous friction here as well. No Stribeck effect was detected. As expected, the friction of the first joint is much higher than that of the second. Both curves are more or less symmetrical, which supports the physical interpretability, as this is expected friction behavior.

B. Dynamic Friction Model

In the literature [1], it is stated that friction should be thought of as a dynamic phenomenon. An important dynamic aspect is frictional hysteresis, also called friction lag. In the case of lubricated friction the phenomenon can be explained by the time it takes to modify the thickness of the lubricant film. Frictional hysteresis is encountered in dry friction as well. To deal with this dynamic aspect, the architecture of Fig. 6 is proposed.

Fig. 6: Dynamic hybrid model.

Instead of two single velocities, a series of two times n velocities is given to the NN. This represents the history of the velocities. A length of n = 9 samples was used in the final implementation. To memorize past velocities a long short-term memory (LSTM) NN was used, a type of recurrent neural network (RNN). The LSTM has gained a lot of popularity recently, as it has some important advantages compared to a simple RNN [5]. As with the static model, the dynamic hybrid model is used to perform one step predictions. The dataset and loss function are the same as before. Compared to the static model, physical interpretability is less evident: due to the dynamic nature it is not possible to map all possible inputs to their outputs in a structured way. Despite this, it is interesting to investigate whether the converged model shows hysteretic behavior. To test this, a steady state sinusoidal velocity is given to the first link: θ̇1 = 5 · sin(πt). To avoid a cluttered figure, only one of the 10 cross validation models is depicted.
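Building the velocity history that is fed to the LSTM can be sketched as follows. This is a sketch under the assumption that each input window holds n samples including the current one, ordered from oldest to newest; the function name is hypothetical and the exact windowing used in the dissertation may differ.

```python
def velocity_windows(omega1, omega2, n=9):
    """Return one input window per usable time step k.

    Each window is a list of n rows [omega1, omega2], ordered from oldest
    to newest and ending at step k. The first n - 1 steps have no complete
    history and are skipped.
    """
    if len(omega1) != len(omega2):
        raise ValueError("both velocity series must have equal length")
    rows = [[a, b] for a, b in zip(omega1, omega2)]
    return [rows[k - n + 1:k + 1] for k in range(n - 1, len(rows))]
```

The resulting nested list has the shape (samples, n, 2), which matches the (batch, timesteps, features) layout that recurrent layers in Keras expect. Windows should be built per measured trajectory so that no window spans two different experiments.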

Fig. 7: Hysteretic behavior of the friction torque if link 1 is subjected to a sinusoidal velocity profile.

It is clear that the NN learned to exhibit hysteretic behavior, as there are loops in the curve. The sense of rotation is indicated with black arrows. The general shape of the curve is similar to the shape of a static friction curve.

Fig. 8: Hysteretic behavior of the friction torque if link 2 is subjected to a sinusoidal velocity profile.

The response of the second joint is investigated with a sinusoidal signal as well: θ̇2 = 10 · sin(2πt). The behavior is totally different from the first joint. There is only one, much wider hysteresis loop. It is important to note that the curve does not pass through the origin: friction at zero velocity is not equal to zero. This is an important dynamic aspect. Another observation is that the size of the hysteresis loop is proportional to the frequency of the sinusoidal signal.

C. Discussion of Results

For the static model, there is no direct validation of the learned friction curves available and thus one step prediction loss is taken as indicator of how well the friction curve is estimated. One can imagine a lot of situations and applications where it is not possible to directly measure the static friction curve of a joint. The proposed method could be used in those cases. Also for the dynamic method no direct validation is available.

A disadvantage of the method is that the prediction loss is not only caused by errors in the friction curves. Other sources of errors are:

• Errors in the state space model, e.g. a wrong moment of inertia.

• Errors caused by the simple numerical integration scheme and large step size.

Moreover, it is possible that the learned friction curves compensate for a wrongly measured parameter and thus do not represent the actual friction curves.

A comparison between the baseline model (=the state space model with no friction component included), the static hybrid model and the dynamic hybrid model is shown in Fig. 9.

Fig. 9: Boxplots with the results of 10-fold cross validation.

For each model 10-fold cross validation was performed on the same dataset. The mean loss of the baseline model is 0.61 rad/s. The hybrid static model has a mean loss of 0.44 rad/s and finally the hybrid dynamic model improves the mean loss to 0.37 rad/s. This order was respected in every fold of the cross validation. Compared to the baseline model, the hybrid methods perform considerably better. However, multi step state prediction proved infeasible: due to modeling errors, the large step size and the integration method, the predicted states diverge from the real trajectories after a few steps.
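The relative improvements implied by the mean losses reported above can be checked with a short calculation. The helper name is hypothetical; the three loss values are taken from the text.

```python
def relative_reduction(reference, value):
    """Fractional reduction of `value` with respect to `reference`."""
    return (reference - value) / reference


# Mean one step prediction losses in rad/s, as reported in the text.
BASELINE, STATIC, DYNAMIC = 0.61, 0.44, 0.37

static_vs_baseline = relative_reduction(BASELINE, STATIC)
dynamic_vs_baseline = relative_reduction(BASELINE, DYNAMIC)
dynamic_vs_static = relative_reduction(STATIC, DYNAMIC)
```

This gives a reduction of roughly 28 % for the static hybrid model and 39 % for the dynamic hybrid model relative to the baseline, and roughly 16 % for the dynamic model relative to the static one.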

V. CONCLUSION

A working acrobot setup has been constructed, focused on simplicity and modularity. Offline tracking of the angles with a camera as sensor proved possible as well. With this vision based sensing method, two hybrid methods to identify the friction in the joints are proposed: a static and a dynamic model. The dynamic model slightly outperforms the static model but does not possess a clear physical interpretability.

The proposed method may offer a solution for situations where direct measurement of the friction curves is not possible. As only a camera is needed for measuring, the method is suitable for health monitoring of systems during operation. The friction curve may change over time due to wear and environmental causes, which is why health monitoring is essential. In the classical approach the mechanism would have to be disassembled, as letting the motor rotate at steady state speed can be potentially dangerous or simply not possible. With the hybrid model it is not necessary to disassemble the mechanism: the curves can be obtained just by observing.

REFERENCES

[1] H. Olsson, K. J. Åström, C. C. De Wit, M. Gäfvert, and P. Lischinsky, "Friction models and friction compensation," Eur. J. Control, vol. 4, no. 3, pp. 176-195, 1998.

[2] W. De Groote, E. Kikken, S. Goyal, S. Van Hoecke, and G. Crevecoeur, "Hybrid derivative functions for identification of unknown loads and physical parameters with application on slider-crank mechanism," in 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM). IEEE, 2019, pp. 1049-1054.

[3] F. Chollet et al., "Keras," https://keras.io, 2015.

[4] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., "Tensorflow: A system for large-scale machine learning," in 12th Symposium on Operating Systems Design and Implementation, 2016, pp. 265-283.

[5] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.


List of Abbreviations & Symbols

Adam adaptive moment estimation

BLDC brushless direct current

DL deep learning

EMF electromotive force

FPS frames per second

HSV hue, saturation, value

LSTM long short-term memory

MLP multilayer perceptron

MPC model predictive control

NN neural network

PGNN physics-guided neural network

ReLU rectified linear unit

RGB red, green, blue

RMSE root-mean-square error

RNN recurrent neural network

SGD stochastic gradient descent

∆ dimensionless deviation (prefix)

θ1 angle first joint [rad]

θ2 angle second joint [rad]

θ̇1 angular velocity first joint [rad/s]

θ̇2 angular velocity second joint [rad/s]

f frequency [Hz]

H height [m]

J moment of inertia [kg · m²]

k discrete time index

L/l length [m]

M /m mass [kg]

R/r radius [m]

T motor torque [Nm]

Tf1 friction torque first joint [Nm]

Tf2 friction torque second joint [Nm]

t time [s]

W width [m]


List of Figures

2.1 Symbolic representation of the MLP architecture.

2.2 Common activation functions used in NNs.

2.3 Illustration of 5-fold cross validation. Training set indicated in grey while the validation set is orange.

2.4 Symbolic representation of a RNN architecture.

3.1 Angle convention acrobot.

3.2 Schematic side view of the acrobot setup.

3.3 Idealized version of a Stribeck curve.

3.4 Hysteretic behavior of friction, with velocity on the x-axis and friction force on the y-axis. The size of the hysteresis loop increases with excitation frequency.

4.1 The finished acrobot lab setup.

4.2 Working principle of slip rings.

4.3 Working principle and construction of the BLDC.

4.4 Operating Range BLDC.

4.5 Example of an image taken by the camera, with indication of the three points that are used to calculate the angles.

4.6 Median blur filter. The central pixel gets replaced with the median of the window.

4.7 Example of measurements obtained with the proposed sensing method.

4.8 Illustration of the image processing algorithm for P3.

4.9 Torque command versus measured KtI.

5.1 Static hybrid model. Based on the velocities at t = k, the NNs predict the friction torques that are fed into the state space function to predict the velocities at the next timestep.

5.2 Scatter plot of all the simulation velocities. Each dot represents a sample with its own state values.

5.3 The two artificial Stribeck curves used for simulation.

5.4 Learning curve of the static hybrid model on simulation data.

5.5 Comparison between the hybrid static model and the state space model without friction, in simulation. The boxplots represent the loss values obtained by 10-fold cross validation.

5.6 Comparison between the Stribeck curve used for simulation and the learned curve.

5.7 Barplot indicating how prediction loss changes if a parameter is estimated wrongly.

5.8 Barplot indicating how the RMSE of the learned Stribeck curve changes if a parameter is estimated wrongly.

5.9 Stribeck curve link 1 with overestimated motor mass m2.

5.10 Scatter plot of all the experimental velocities. Each dot represents a sample with its own state values.

5.11 Architecture of both NN branches of the hybrid static model.

5.12 Example of a learning curve with experimental data.

5.13 Comparison between the hybrid static model and the baseline model on experimental data. The boxplots represent loss values obtained by 10-fold cross validation.

5.14 Resulting Stribeck curves of the 10-fold cross validation, indicated in red. The mean of all 10 models is shown in black.

5.15 Examples showing that the hybrid model is not suitable for multi step prediction.

5.16 Dynamic hybrid model. Based on the velocities at t = k and the previous n steps, the LSTM NNs predict the friction torques that are fed into the state space function to predict the velocities at the next timestep.

5.17 Architecture of both NN branches of the hybrid dynamic model.

5.18 Comparison between the baseline, static hybrid and dynamic hybrid model.

5.19 Hysteresis behavior when the joints are given a steady state sinusoidal velocity excitation.

5.20 The width of the hysteresis loop increases with increasing excitation frequency.

B.1 Torque Signals

B.2 Angle Tracking Link 1


List of Tables

I Abbreviations (bold) and symbols (italic).

II Parameters supporting shaft.

III Parameters clamp slip ring.

IV Parameters clamping bush.

V Parameters link 1.

VI Parameters motor assembly.

VII Parameters link 2.

VIII Parameters extra mass.

IX Metrics velocity distribution simulation.

X Metrics velocity distribution experiments.

XI Results hyperparameter tuning. Number of neurons is always a discrete value.

XII Results hyperparameter tuning dynamic model. Number of neurons and n, length of the input series, are always discrete values.

XIII Bill of materials

1 Introduction

Mechatronic systems are omnipresent in today’s society and industry. In order to fully use their potential, good (dynamic) models are required. One of the first things that comes to mind is the control aspect. For instance, to let a robot arm follow a trajectory extremely precisely, an accurate model of the system is required [1]. Both in classical control theory [2] and in more advanced techniques, such as model predictive control (MPC) [3], an adequate plant model is necessary to avoid errors. A good model is also essential for product design, in order to be able to predict how the machine or system will react under different conditions. For example, it is useful to simulate the temperature response of the inverter in an electric vehicle under different conditions before actually producing the power pack [4]. During the last few years the use of ‘digital twins’ has gained popularity. A digital twin is a virtual, realistic simulation of a system. The digital twin needs (virtual) sensor information and a good model of the system to function properly. The procedure is particularly useful for monitoring the health of systems where access is restricted [5]. In general it is not easy to obtain a good model. A mechatronic system usually contains unmodeled and highly complex phenomena, such as actuator nonlinearities [6]. The conclusion is that there is a strong need to improve existing modeling techniques and to test new techniques in order to achieve higher model accuracy.

Dynamic systems are encountered in almost all science and engineering disciplines: heat transfer, fluid flow, electromagnetics, robotics... The oldest way to develop a dynamic model is to study the laws of physics governing the system. In most cases the dynamic model can be derived from some form of energy balance. An example is the use of Lagrangian mechanics in robotics, for instance to solve the inverse dynamics of a manipulator [7]. An alternative to the energy based Euler-Lagrange model is the Newton-Euler method, which leads to completely equivalent solutions. A disadvantage of such methods is the fact that it is hard to accurately model non-conservative forces such as friction. Another drawback of purely physics based modeling is that the model parameters must be measured very precisely, which is not always possible. As a consequence dynamic modeling is often the domain of experts [8]. Furthermore, purely analytical models do not cope well with disturbances and unmodeled phenomena. Analytical models are however very useful to understand the system’s behavior in a more qualitative way. Another possibility is to develop a purely data driven, black box model. In this approach only a set of system inputs and corresponding observations is used to create the model. The goal is to find a set of parameters that minimizes the model’s prediction error. There are a lot of different data driven methods that can be used for nonlinear system identification. One of them is Gaussian process modeling. This is a nonparametric, probabilistic approach used to identify complex systems. As shown by [9] the method can also be used to tackle dynamic problems. Another way to model nonlinear dynamics is the Hammerstein-Wiener approach. Here linear, dynamic and nonlinear, static ‘blocks’ are combined to simulate the system [10]. NARMAX (nonlinear autoregressive moving average with exogenous inputs) modeling is another method in nonlinear system identification that has proven to be successful in multiple applications [11]. Last but not least, neural networks (NNs) have gained a lot of attention in the last decade.
This is mostly due to an increase in computing power combined with groundbreaking results in different domains, such as image classification [12] and natural language processing [13]. The usage of NNs has found its way into (nonlinear) system identification as well [14]. Data driven models are very flexible and able to capture complex behavior. However, the quality of the data driven model depends on the training set it is given. Moreover for some applications, the lack of explainability and extrapolation capability might be a problem [15].

In order to eliminate the particular disadvantages of both physics and data driven models, hybrid models might offer a solution. A hybrid model is a combination of data driven and physics based aspects, aimed at improving performance. For instance in [16] a dual unscented Kalman filter is applied on a passenger vehicle model. The filter is used to identify both vehicle state and inertial parameters at the same time by combining sensor measurements and system knowledge. The inertial parameters are variable as passenger and fuel tank mass may change. Another example of a hybrid model that has to deal with unmodeled phenomena can be found in [17]. This paper handles a fully actuated autonomous underwater vehicle that has to deal with external disturbances and actuator nonlinearities. Here an adaptive NN control for the tracking problem is developed by using reinforcement learning.

[...] models and vice versa. The main goal is to find out if such hybrid models can outperform purely data driven or physics based models. The term physics-guided neural network (PGNN) is introduced in [18] to denote hybrid models where a NN is included. In PGNNs either the data driven part or the physics part can be dominant. For [18] the first case holds: the authors of the article incorporate physical guidance into the loss function of the NN in the form of an extra term. This term punishes predictions that violate known physical laws. The loss term denoting violation of physics does not depend on labeled data, which greatly improves generalisation ability. The other approach, where physics dominates, is followed in [19]. In this article a slider crank mechanism with some unknown physical parameters is studied. There is an unknown, state dependent load acting on the slider as well. The unknown load consists of a spring force combined with friction force, thus there is a static relationship between the load and the state variables (speed and displacement of the slider). A NN, with the state variables as input, feeds the predicted load to a state space function that predicts the next state. Based on how well the next state is predicted, one has information on the quality of the load prediction. In this manner the unknown load can be identified. In this case a purely data driven model would not be suitable, as the dynamics are very complex and contain a lot of parameters. The knowledge of the physics (incorporated into the state space equations) makes it easier for the data driven part to detect the part of the system that is not well understood. The use of statistical methods to estimate the parameters of robotic systems is common practice nowadays. Most of the time, linear regression is used. To achieve physical plausibility of parameter estimations, [20] proposes a framework using linear matrix inequalities.
Reference [21] proposes a hybrid method, called deep Lagrangian networks, to learn an inverse dynamics model for robot control. Where in classical approaches the parameters of the robot would be measured or estimated carefully to construct a dynamic model, here the parameters are learned by a NN. Again a physics prior is used to facilitate learning: the authors impose Lagrangian mechanics on the NN.
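The PGNN loss of the first kind described above, where an extra term punishes violations of known physical laws, can be written schematically as follows. The symbols are assumptions chosen for this sketch and are not taken from [18].

```latex
E_{\text{PGNN}} = E_{\text{data}}(y, \hat{y}) + \lambda \, E_{\text{phys}}(\hat{y}), \qquad \lambda \geq 0
```

Here y denotes the labeled targets, ŷ the NN predictions and λ a weighting factor. Because the second term depends only on the predictions, it can also be evaluated on inputs for which no labels are available, which is what allows it to improve generalisation.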

The main goal of this master’s dissertation is to investigate the potential of hybrid system identification on a mechatronic system containing unmodeled physical phenomena. The technique employed in the data driven part is deep learning (DL), as this method shows promising results. In Chapter 2 the most important aspects of NNs are discussed. As mechatronic system the acrobot is chosen: an underactuated double pendulum with complex dynamics. The unmodeled phenomenon considered is the friction in the joints of the acrobot. Although it is commonly known that friction is present in all kinds of systems, it remains very hard to provide analytical friction models. Chapter 3 deals with the physics governing the acrobot system, with special attention to friction modeling. There is no acrobot available in the lab yet, so another objective is to construct a functioning acrobot setup to conduct experiments on. Chapter 4 is devoted to the practical work done in constructing the setup. Finally in Chapter 5, the methodology followed to combine DL and physics for the acrobot example is explained.


2 Neural Networks

NN modeling is a so-called black box, data driven approach. Based on input data, a NN can learn to mimic highly nonlinear systems where more classical statistical approaches (such as linear regression) fail. In contrast to other approaches, most NNs contain a vast number of parameters that have to be learned. In order to generalize well (to avoid overfitting), it is required to have a dataset large and diverse enough to accurately represent the system’s behavior. Furthermore much computing power may be required, especially in DL, where very large networks are used. Roughly speaking, a NN can be used for either classification or regression. This master’s dissertation is focused on the regression case, as some physical phenomena will have to be estimated numerically. The next sections give a brief overview of the main aspects of NN theory and working principles. The focus is on the type of networks that will be used later on, with a distinction made between feedforward and recurrent neural networks (RNNs).

2.1 Feedforward Neural Networks

A very common architecture is the multilayer perceptron (MLP), of which a schematic example is pictured in Fig. 2.1. The number of inputs, layers, neurons and outputs is arbitrary just to illustrate the concept. The explanation of MLPs is mostly based on [22].


Fig. 2.1: Symbolic representation of the MLP architecture.

To map the input to the output, a forward calculation has to be done. A linear combination of inputs is given to the neurons in the hidden layer. A bias can be included in this linear combination. The weights used in the linear combination are the trainable parameters of the network (more on this later). After this, a so-called activation function $f$ is applied to the linear combination. This is a nonlinear function that helps to represent complex relationships, which would not be possible with just linear combinations. The result is then passed on to the next layer, until the last layer is reached. A neuron $j$ thus receives a weighted sum $x_j$ of the outputs $y_k$ of the previous layer. The weights are represented by $w_{kj}$.

$$x_j = \sum_{k \in K_j} w_{kj}\, y_k \tag{2.1}$$

$K_j$ represents the set of neurons from the $k$th layer that feed neuron $j$ in the next layer. The output of the neuron then becomes:

$$y_j = f(x_j) \tag{2.2}$$
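The forward calculation of (2.1) and (2.2) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the layer sizes, the random weights, the tanh activation and the linear (identity) output layer for regression are assumptions made here, not choices taken from the dissertation.

```python
import numpy as np

def mlp_forward(x, weights, biases, activation=np.tanh):
    """Forward pass: repeat x_j = sum_k w_kj * y_k (+ bias) and y_j = f(x_j)."""
    y = np.asarray(x, dtype=float)
    for layer, (W, b) in enumerate(zip(weights, biases)):
        pre_activation = y @ W + b                      # (2.1), with W[k, j] = w_kj
        is_last = layer == len(weights) - 1
        # Assumption: identity activation on the output layer (regression).
        y = pre_activation if is_last else activation(pre_activation)  # (2.2)
    return y

rng = np.random.default_rng(0)
sizes = [3, 4, 4, 2]  # hypothetical: 3 inputs, two hidden layers of 4 neurons, 2 outputs
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
output = mlp_forward([0.1, -0.2, 0.3], weights, biases)
```

Storing the weights of one layer as a matrix turns the sum over $k$ into a single matrix product, which is how NN libraries typically evaluate a layer.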

Suppose now that the jth layer is the output layer for a regression problem. Then an error E can be defined between the predicted output yj and the target tj: this is called the loss function.

Note that the loss function below is used just for sake of explanation. In practice a large variety of loss functions exists. They all define how precise a prediction is: the lower the loss, the better.

$$E = \frac{1}{2}\sum_{j=1}^{J}(t_j - y_j)^2 \tag{2.3}$$

The next step is to find the gradient of the loss with respect to the weights of the net, the backpropagation step. This is done by applying the chain rule of differentiation.

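$$\frac{\partial E}{\partial w_{kj}} = \frac{\partial E}{\partial y_j}\frac{\partial y_j}{\partial x_j}\frac{\partial x_j}{\partial w_{kj}} \tag{2.4}$$

An additional definition is used to treat the first two factors as a single quantity:

$$\delta_j := -\frac{\partial E}{\partial y_j}\frac{\partial y_j}{\partial x_j} \tag{2.5}$$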


The last factor in (2.4) is easily derived:

$$\frac{\partial x_j}{\partial w_{kj}} = y_k \tag{2.6}$$

Furthermore it is clear that $\partial y_j / \partial x_j$ only depends on how the activation function is defined; let’s call this term $f'$ from now on. If $j$ is an output layer, then it follows that:

$$\frac{\partial E}{\partial y_j} = -(t_j - y_j) \tag{2.7}$$

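Otherwise, if $j$ is a hidden layer, the quantity can be calculated by applying the chain rule a second time:

$$\frac{\partial E}{\partial y_j} = \sum_{i \in I_j}\frac{\partial E}{\partial y_i}\frac{\partial y_i}{\partial x_i}\frac{\partial x_i}{\partial y_j} \tag{2.8}$$

Here $I_j$ is the set of neurons in the next layer that are fed by neuron $j$, and $\partial x_i / \partial y_j = w_{ji}$ follows from (2.1). Wrapping up all the previous results reveals that the partial derivative of the loss can be calculated using only terms obtained in the forward calculation and the $\delta_i$ of the next layer:

$$\frac{\partial E}{\partial w_{kj}} = -\sum_{i \in I_j}(\delta_i w_{ji})\, f'\, y_k \tag{2.9}$$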

This information is crucial for the training procedure. It gives insight in how each weight influences the loss. Most optimization schemes are gradient based, meaning that weights are adapted in the opposite direction of the gradient of the loss with respect to the weight itself. This is the direction in which the error is expected to decrease the most.
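As a sanity check on the derivation, the backpropagated gradients can be compared with finite differences. The sketch below assumes a hypothetical network with one hidden layer, tanh activations in both layers and the loss of (2.3); none of these choices are taken from the dissertation.

```python
import numpy as np

def forward(y_in, W1, W2):
    y_hid = np.tanh(y_in @ W1)    # hidden layer outputs
    y_out = np.tanh(y_hid @ W2)   # output layer outputs
    return y_hid, y_out

def loss(y_in, t, W1, W2):
    _, y_out = forward(y_in, W1, W2)
    return 0.5 * np.sum((t - y_out) ** 2)                # loss (2.3)

def backprop(y_in, t, W1, W2):
    y_hid, y_out = forward(y_in, W1, W2)
    delta_out = (t - y_out) * (1.0 - y_out ** 2)         # (2.5) with (2.7); tanh' = 1 - tanh^2
    grad_W2 = -np.outer(y_hid, delta_out)                # (2.4) with (2.6)
    delta_hid = (W2 @ delta_out) * (1.0 - y_hid ** 2)    # sum over the next layer, times f'
    grad_W1 = -np.outer(y_in, delta_hid)                 # (2.9)
    return grad_W1, grad_W2

def numerical_gradient(func, W, eps=1e-6):
    """Central finite differences, used only to verify the analytical result."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[idx] += eps
        W_minus[idx] -= eps
        grad[idx] = (func(W_plus) - func(W_minus)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
y_in, t = rng.normal(size=3), rng.normal(size=2)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))
grad_W1, grad_W2 = backprop(y_in, t, W1, W2)
num_W1 = numerical_gradient(lambda W: loss(y_in, t, W, W2), W1)
num_W2 = numerical_gradient(lambda W: loss(y_in, t, W1, W), W2)
max_error = max(np.abs(grad_W1 - num_W1).max(), np.abs(grad_W2 - num_W2).max())
```

If the chain rule has been applied correctly, `max_error` is limited only by the finite difference approximation.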

$$\Delta w_{kj} = -\alpha\frac{\partial E}{\partial w_{kj}} \tag{2.10}$$

The parameter $\alpha$ is called the learning rate; it determines the size of the optimization steps.

The previous passages are meant to give a theoretical intuition on the backpropagation algorithm. The next part aims to give a more practical insight into the training and optimization process. As the activation function is very important for the performance of the model, some of the most commonly used activation functions and their specific (dis)advantages are discussed here. The discussion is based on [23]. The sigmoid function is an activation that is used a lot in problems where the output must represent a probability (e.g. in classification). It is particularly useful for this type of problem because the function maps the real axis to the interval (0, 1). The function is differentiable and given by:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{2.11}$$

Nowadays sigmoid is not used so frequently anymore for hidden layers. Sigmoid suffers from the vanishing gradient problem as more and more layers are stacked on each other. The gradient of the loss is calculated by applying the chain rule multiple times, so the gradients of different layers are multiplied. The derivative of the sigmoid is small (at most 0.25) and when a lot of small numbers are multiplied, the result will be extremely small. In this way learning becomes very hard. Another problem is that sigmoid is not zero centered, which makes the gradient updates move in different directions. To overcome some of these problems, the hyperbolic tangent (tanh) has been suggested. The tanh activation function has a similar S-shape as the sigmoid, but it maps inputs to the interval (-1, 1) instead of (0, 1), so it is zero centered. As the shape is similar to the sigmoid, the vanishing gradient problem is not solved. Today, one of the most popular activation functions is the rectified linear unit (ReLU), given by:

$$f(x) = \max(0, x) \tag{2.12}$$

An advantage of ReLU is that it makes computations faster. Compared to tanh or sigmoid, no complex calculations such as division or exponentials are required. ReLU also largely avoids the vanishing gradient problem, because its derivative is exactly 1 for positive inputs. A problem encountered with ReLU is that of dying neurons, because every negative input gets mapped to zero and then no gradient flows through the neuron. To overcome this problem, the ’leaky ReLU’ was introduced, with a small slope in the negative area.

Fig. 2.2: Common activation functions used in NNs: (a) sigmoid, (b) tanh, (c) ReLU, (d) leaky ReLU.
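The four activation functions and their derivatives can be written down compactly, which also makes the vanishing gradient argument concrete. In this sketch the leaky ReLU slope of 0.01 and the depth of 10 layers are arbitrary illustrative values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # maximum value 0.25, reached at x = 0

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2    # maximum value 1, reached at x = 0

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return np.where(x > 0, 1.0, 0.0)

def leaky_relu(x, slope=0.01):      # slope is an assumed, commonly used value
    return np.where(x > 0, x, slope * x)

def leaky_relu_grad(x, slope=0.01):
    return np.where(x > 0, 1.0, slope)

# Best case for the product of activation derivatives across a stack of layers:
n_layers = 10
sigmoid_bound = sigmoid_grad(0.0) ** n_layers          # 0.25**10, already below 1e-6
relu_active = float(relu_grad(1.0)) ** n_layers        # stays 1 for active neurons
```

Even in the most favourable case the sigmoid factor shrinks geometrically with depth, whereas the ReLU factor stays at 1 as long as the neurons are active.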

As explained before, training of the network weights happens with some variation on the gradient descent algorithm (2.10). A handy overview of the most frequently used algorithms is given in [24]. The algorithms can be divided into three main classes. The first one is batch gradient descent, where the gradient of the loss function is computed with respect to the model parameters for the entire dataset. As all samples have to be considered for one update, this can be very slow. A second alternative is stochastic gradient descent (SGD), where an update is performed for every single sample. This is much faster, but it introduces a lot of variance into the parameter updates and the learning curve will show a lot of fluctuations. To combine the good properties of both approaches, mini-batch gradient descent is introduced: updates are done per mini-batch (a number of samples smaller than the total population). One pass of the optimization through the entire dataset is called an ’epoch’; multiple epochs are needed for convergence. Nowadays the third option is preferred. Although a bit ambiguous, the term SGD is also used for some mini-batch optimization algorithms in literature.

An important parameter for gradient descent is the learning rate. Low learning rates lead to very slow convergence, while a learning rate that is too large causes the loss to fluctuate around the minimum or even diverge. In the past, learning rate schedulers were commonly used to prevent these problems: the initial learning rate is set rather high to take large steps and then it is gradually decreased to get smooth convergence to the minimum. Although this was an improvement, there remained some problems. First, the schedule had to be set before training and was not adaptable to the specific dataset. Second, the same learning rate applies to all parameters, even though some parameters may be updated much less frequently than others. According to [24] another big challenge for optimizers is to avoid getting stuck in suboptimal local minima. To deal with the issues mentioned, several improved optimization algorithms exist today, based on momentum or on adaptive learning rates. Some examples are SGD with momentum, Nesterov Accelerated Gradient, Adagrad, Adadelta, RMSProp and Adam.
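The three classes of gradient descent differ only in the batch size, as the following sketch illustrates. It uses a hypothetical noise-free linear regression problem instead of a NN to keep the example short; the data, learning rate and batch size are assumptions chosen for illustration.

```python
import numpy as np

def minibatch_gradient_descent(X, t, lr=0.1, batch_size=16, epochs=200, seed=0):
    """Fit t ~ X @ w. batch_size=len(X) gives batch GD, batch_size=1 gives SGD."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):                       # one epoch = one pass over the dataset
        order = rng.permutation(len(X))           # reshuffle the samples every epoch
        for start in range(0, len(X), batch_size):
            batch = order[start:start + batch_size]
            error = X[batch] @ w - t[batch]
            gradient = X[batch].T @ error / len(batch)   # gradient of the mean squared loss
            w -= lr * gradient                    # update rule (2.10)
    return w

rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=128)
X = np.column_stack([x, np.ones_like(x)])         # second column acts as the bias input
t = 2.0 * x + 1.0                                 # hypothetical target: slope 2, offset 1
w = minibatch_gradient_descent(X, t)
```

With these settings `w` approaches the slope and offset used to generate the data.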

When training a NN one must not only look at the training loss. The performance of the model on unseen data is very important: overfitting must be avoided at all costs. A fraction of the data can be left out of the training set to later validate the finished model on. The gap between training and validation loss should be as small as possible. An even better approach to test the performance of the model on unseen data is k-fold cross validation. In this method the dataset is split into k equal fractions. Then the model is trained k times: each time with a different split held out. This gives a more complete view of the model’s performance compared to just one test. For instance a boxplot of the experimental results can be plotted.

Fig. 2.3: Illustration of 5-fold cross validation. Training set indicated in grey while the validation set is orange.
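The index bookkeeping behind k-fold cross validation can be sketched as follows. The function name and the random shuffling are assumptions of this sketch; for time series data one might prefer contiguous folds instead.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (training indices, validation indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]       # k (nearly) equal fractions
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield training, validation

# 5-fold cross validation on a hypothetical dataset of 10 samples
splits = list(k_fold_splits(n_samples=10, k=5))
```

Each sample ends up in the validation set exactly once, so the k validation losses together cover the whole dataset.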


A NN has many settings to tune: number of neurons, batch size, learning rate... Those are called hyperparameters. Getting these parameters right requires a lot of experience and knowledge, but even that is sometimes not enough. One option is to perform a grid search: iterating through all the points in the parameter grid and finding the hyperparameters that lead to the best performance (for example a k-fold cross validation metric). The problem is that this task soon takes up a large amount of time with increasing grid size. Additionally some parameters may not have a large influence on performance. Grid search cannot detect this, so a lot of useless iterations are done. An alternative is to only investigate a fixed number of random hyperparameter combinations, the random search procedure. The computational burden of random search is lower and [25] showed that it might even outperform grid search under some conditions. Another promising recent method is Bayesian hyperparameter optimization [26]. First some random hyperparameter combinations are evaluated, which corresponds to random search. Then a number of Gaussian processes are fit between the observations. Finally the next point to evaluate is chosen based on the concept of maximizing the expected improvement.
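The difference between grid search and random search can be sketched with a stand-in objective. The function `validation_loss` below is purely hypothetical: in practice it would train a NN and return, for example, a k-fold cross validation metric. The search ranges are illustrative as well.

```python
import itertools
import random

def validation_loss(learning_rate, n_neurons):
    """Hypothetical stand-in for training a NN and returning its validation loss."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (n_neurons - 32) ** 2 * 1e-3

grid = {"learning_rate": [1e-3, 1e-2, 1e-1], "n_neurons": [8, 16, 32, 64]}

def grid_search(grid):
    # Evaluate every combination: the cost grows multiplicatively with each parameter.
    candidates = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
    return min(candidates, key=lambda params: validation_loss(**params))

def random_search(n_trials, seed=0):
    # Evaluate a fixed budget of random combinations, independent of the grid size.
    rng = random.Random(seed)
    candidates = [
        {"learning_rate": 10 ** rng.uniform(-3, -1), "n_neurons": rng.randint(8, 64)}
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda params: validation_loss(**params))

best_grid = grid_search(grid)              # 3 x 4 = 12 evaluations
best_random = random_search(n_trials=12)   # same budget, but 12 distinct values per parameter
```

With the same budget, random search tries many more distinct values of each individual hyperparameter, which helps when only a few of them really matter.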

There are several methods to avoid overfitting. One of those is weight regularization. Large weights are a sign that the NN is overfitted: a small variation in the input might give a totally wrong result. To avoid this, weight regularization penalizes large weights with a term in the loss function. For the example of (2.3) the loss function would become (with $\lambda$ a hyperparameter to choose):

$$E = \frac{1}{2}\sum (t - y)^2 + \frac{\lambda}{2}\lVert w \rVert \tag{2.13}$$

Here $\lVert w \rVert$ is a symbolic representation of the norm of the network’s weights. Another technique is dropout. In dropout, nodes are removed during the training process in a probabilistic manner to prevent overfitting [27]. This prevents overdependence on certain nodes.
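Both techniques are easy to sketch. Two assumptions are made here that the text does not fix: the norm is taken to be the squared L2 norm (a common choice), and dropout is implemented in its 'inverted' form, where the remaining activations are rescaled during training.

```python
import numpy as np

def regularized_loss(t, y, weights, lam):
    """Loss of (2.3) plus a weight penalty; the squared L2 norm is an assumed choice."""
    data_term = 0.5 * np.sum((t - y) ** 2)
    penalty = 0.5 * lam * sum(np.sum(W ** 2) for W in weights)
    return data_term + penalty

def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of the activations during training only."""
    if not training or rate == 0.0:
        return activations
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)   # rescale to keep the expected value

rng = np.random.default_rng(0)
hidden = np.ones(1000)                      # hypothetical hidden layer outputs
dropped = dropout(hidden, rate=0.5, rng=rng)
example_loss = regularized_loss(np.array([1.0]), np.array([0.0]), [np.array([[2.0]])], lam=0.5)
```

At inference time (`training=False`) the activations pass through unchanged, so no node is missing when the model is actually used.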

2.2 Recurrent Neural Networks

Recurrent neural networks (RNNs) are a type of NN that is particularly useful for processing time series inputs. The influence of previous inputs in time is captured with a ’hidden state’ h; this is the biggest difference with feedforward NNs. In this way the RNN can learn from the past. A symbolic representation of an RNN is shown in Fig. 2.4. For an RNN, there are three weight matrices. Note that these matrices are ’shared across time’, meaning that they are the same at every time step. A bias term can be included. The equations for the forward calculation at time step t are:

$$h_t = f(W_1 x_t + W_2 h_{t-1}) \tag{2.14}$$

$$y_t = f(W_3 h_t) \tag{2.15}$$

Here $f$ again represents the activation function. The activation function can be different for the forward and recurrent step. RNNs are trained with backpropagation through time, a modified version of regular backpropagation. As input sequences can be very long, RNNs suffer from the vanishing gradient problem as well. In 1997, the authors of [28] proposed the long short-term memory (LSTM) architecture to solve this problem. This architecture has proven to work very well and is often used nowadays. Inside an LSTM layer there are multiple ’gates’ (input, output and forget) that regulate which information is relevant and pass it to the ’cell state’. Another popular improved RNN architecture is the gated recurrent unit (GRU), which has a slightly simpler architecture compared to the LSTM [29].

Fig. 2.4: Symbolic representation of a RNN architecture.
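The recurrence of (2.14) and (2.15) amounts to a simple loop over the time steps. In the sketch below the dimensions, the random weights, the tanh hidden activation and the identity output activation are assumptions made for illustration.

```python
import numpy as np

def rnn_forward(inputs, W1, W2, W3, h0=None, f_hidden=np.tanh, f_output=lambda x: x):
    """Run a simple RNN over a sequence; the weights are shared across time."""
    h = np.zeros(W2.shape[0]) if h0 is None else h0
    outputs = []
    for x_t in inputs:
        h = f_hidden(W1 @ x_t + W2 @ h)     # (2.14): new hidden state from input and past
        outputs.append(f_output(W3 @ h))    # (2.15): output at this time step
    return np.array(outputs), h

rng = np.random.default_rng(0)
inputs = rng.normal(size=(5, 2))            # hypothetical: 5 time steps, 2 features
W1 = rng.normal(size=(3, 2))                # input to hidden
W2 = rng.normal(size=(3, 3))                # hidden to hidden (recurrent)
W3 = rng.normal(size=(1, 3))                # hidden to output
outputs, h_final = rnn_forward(inputs, W1, W2, W3)
```

Because the hidden state is carried over, the output at a given time step depends on all earlier inputs, not only on the current one.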

In this dissertation the NNs are implemented in Python using the Keras functional API [30] with TensorFlow backend [31]. In this way mathematical formulas do not need to be hard coded. TensorFlow uses automatic differentiation to calculate the gradients exactly (up to floating point precision) instead of approximating them numerically.


3 Acrobot System

3.1 Introduction

In order to develop a hybrid model, it is necessary to understand the dynamics of the acrobot. The acrobot is an underactuated robotic arm consisting of two links joined together. Underactuation means that the system has fewer actuators than degrees of freedom. Only the second joint ($\theta_2$ on Fig. 3.1) is actuated; the first joint rotates freely. The nonlinear dynamics and the underactuation make the acrobot an interesting setup to test control methods on, as these are typically very hard problems. The acrobot control problem consists of two parts. First, the two bars have to be swung from the downwards position to the upwards, unstable equilibrium point ($(\theta_1, \theta_2) = (\pi, 0)$ on Fig. 3.1). Then, when the upwards position has been reached, the two links have to be balanced.

Fig. 3.1: Angle convention of the acrobot.


Because the system is underactuated, feedback linearization is not possible for the swing up problem (proven by [32]). There are several papers describing this problem. The paper by Spong [33] on this topic proposes a technique called partial feedback linearization, but there are alternatives such as sliding mode control [34]. More recently, the acrobot has also been used to test reinforcement learning algorithms [35]. It can be concluded that the acrobot is an interesting mechanism for research and educational purposes.

3.2 State Space Model

The dynamic state space model is developed using the Euler-Lagrange formalism. The formalism is very useful for rather simple robotic systems. A similar approach as in [36] is used. The Euler-Lagrange equation has the following general form.

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = Q \tag{3.1}$$

The Lagrangian, $L = T - V$, is the difference between the kinetic ($T$) and the potential ($V$) energy of the system. In the case of the acrobot, the generalized coordinates ($q_i$) are the angles $\theta_1$ and $\theta_2$ as defined in Fig. 3.1. $Q$ represents the generalized non-conservative forces acting on the system. These forces comprise the motor torque and the friction torques acting on both joints. All parts of the acrobot that are important to the Euler-Lagrange model are depicted and labeled on Fig. 3.2. Each part is modeled differently according to its geometry. The parameters of all parts are carefully measured and weighed. Some shapes are simplified to avoid overly complex calculations, as this is out of the scope of the master’s dissertation. However, it is possible to construct more complex models by employing powerful CAD software to calculate centers of mass and moments of inertia. The design choices that led to this specific setup are discussed in the upcoming chapter.


1. Supporting shaft

The supporting shaft is modeled as a solid cylinder. The moment of inertia is given by:

$$J_{shaft} = \frac{1}{2} M_{shaft} R_{shaft}^2 \tag{3.2}$$

Mshaft 1.38 [kg]

Rshaft 0.01 [m]

Jshaft 6.91·10⁻⁵ [kg·m²]

TABLE II: Parameters supporting shaft.

2. Clamp slip ring

Because the outer diameter of the shaft and the inner diameter of the slip ring do not correspond, clamping the rotating part of the slip ring directly on the shaft is not possible. Therefore an extra nylon bushing has to be pressed onto the shaft. This part is modeled as a hollow cylinder.

$$J_{clamp} = \frac{1}{2} M_{clamp}\,(R_{clamp}^2 + r_{clamp}^2) \tag{3.3}$$

Mclamp 0.097 [kg]

Rclamp 0.019 [m]

rclamp 0.01 [m]

Jclamp 2.24·10⁻⁵ [kg·m²]

TABLE III: Parameters clamp slip ring.

3. Clamping bush first link

This component connects the supporting shaft to the first link. An estimate of its moment of inertia is calculated by using the formula for a hollow cylinder again.

Mbush 0.26 [kg]

Rbush 0.014 [m]

rbush 0.01 [m]

Jbush 3.81·10⁻⁵ [kg·m²]

TABLE IV: Parameters clamping bush.

4. First link

The first link is modeled as a rectangular plate, with the moment of inertia about its center of mass given by:

$$J_{link1} = \frac{1}{12} m_1\,(H_{link1}^2 + W_{link1}^2) \tag{3.4}$$

The distance between the center of the supporting shaft and the center of mass of link 1 is given by L1.

m1 0.135 [kg]

Hlink1 0.175 [m]

Wlink1 0.05 [m]

L1 0.0875 [m]

Jlink1 3.73·10⁻⁴ [kg·m²]

TABLE V: Parameters link 1.

The moments of inertia of the first four parts are bundled into $J_1$, as they all rotate around the same axis:

$$J_1 = J_{shaft} + J_{clamp} + J_{bush} + J_{link1} \tag{3.5}$$
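The tabulated moments of inertia can be reproduced from the masses and dimensions in Tables II to V. The sketch below does so with plain Python; the function names are chosen here for illustration, and reading (3.4) as the formula for a thin rectangular plate is an interpretation of this sketch. Small deviations from the tabulated values (around one percent for the clamping bush) remain, possibly because the tabulated dimensions are rounded.

```python
def solid_cylinder_inertia(mass, radius):
    return 0.5 * mass * radius ** 2                                   # (3.2)

def hollow_cylinder_inertia(mass, outer_radius, inner_radius):
    return 0.5 * mass * (outer_radius ** 2 + inner_radius ** 2)       # (3.3)

def rectangular_plate_inertia(mass, height, width):
    return mass * (height ** 2 + width ** 2) / 12.0                   # (3.4)

# Values in SI units (kg, m), taken from Tables II to V
J_shaft = solid_cylinder_inertia(1.38, 0.01)
J_clamp = hollow_cylinder_inertia(0.097, 0.019, 0.01)
J_bush = hollow_cylinder_inertia(0.26, 0.014, 0.01)
J_link1 = rectangular_plate_inertia(0.135, 0.175, 0.05)

J_1 = J_shaft + J_clamp + J_bush + J_link1                            # (3.5), in kg m^2
```

The sum `J_1` comes out at roughly 5.0·10⁻⁴ kg·m², dominated by the contribution of the first link.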

5. Motor assembly

The motor assembly consists of the encoder, the motor and the reduction gear. It is modeled as a point mass, so only the mass of the part and the distance from the origin (L2) are relevant.

m2 0.44 [kg]

L2 0.153 [m]

TABLE VI: Parameters motor assembly.

6. Second link

L3 represents the distance between the shaft of the servomotor and the center of mass of link 2.

m3 0.12 [kg]

Hlink2 0.185 [m]

Wlink2 0.025 [m]

L3 0.093 [m]

J2 3.39·10⁻⁴ [kg·m²]

TABLE VII: Parameters link 2.

7. Extra mass

An extra mass can be mounted at the end of the second link to be able to vary its moment of inertia. This might come in handy for future research. The extra mass is again modeled as a point mass, with L4 being the distance between the servomotor and the added mass.

m4 0.084 [kg]

L4 0.175 [m]

TABLE VIII: Parameters extra mass.

In order to get the kinetic and potential energy of the system, the Cartesian coordinates of four points need to be known as a function of the angles: the center of mass of link 1, the motor, the center of mass of link 2 and the extra mass. In this way their speeds can be calculated by differentiation.

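• Center of mass link 1

$$v_1 = L_1 \dot{\theta}_1 \tag{3.6}$$

• Motor assembly

$$v_2 = L_2 \dot{\theta}_1 \tag{3.7}$$

• Center of mass link 2

$$x_3 = L_2 \sin(\theta_1) + L_3 \sin(\theta_1 + \theta_2) \tag{3.8}$$

$$y_3 = -L_2 \cos(\theta_1) - L_3 \cos(\theta_1 + \theta_2) \tag{3.9}$$

• Extra mass

$$x_4 = L_2 \sin(\theta_1) + L_4 \sin(\theta_1 + \theta_2) \tag{3.10}$$

$$y_4 = -L_2 \cos(\theta_1) - L_4 \cos(\theta_1 + \theta_2) \tag{3.11}$$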
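As a worked example of the differentiation step, the squared speed of the center of mass of link 2 follows from (3.8) and (3.9). This derivation is added here for illustration and only uses the chain rule and the identity $\cos a \cos b + \sin a \sin b = \cos(a - b)$; the result for the extra mass is the same with $L_3$ replaced by $L_4$.

```latex
\begin{aligned}
\dot{x}_3 &= L_2 \cos(\theta_1)\,\dot{\theta}_1 + L_3 \cos(\theta_1 + \theta_2)\,(\dot{\theta}_1 + \dot{\theta}_2) \\
\dot{y}_3 &= L_2 \sin(\theta_1)\,\dot{\theta}_1 + L_3 \sin(\theta_1 + \theta_2)\,(\dot{\theta}_1 + \dot{\theta}_2) \\
\dot{x}_3^2 + \dot{y}_3^2 &= L_2^2 \dot{\theta}_1^2 + L_3^2 (\dot{\theta}_1 + \dot{\theta}_2)^2
  + 2 L_2 L_3 \cos(\theta_2)\,\dot{\theta}_1 (\dot{\theta}_1 + \dot{\theta}_2)
\end{aligned}
```

The coupling term with $\cos(\theta_2)$ is what makes the dynamics of the two links depend on each other.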

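In this way we get:

$$T = \frac{1}{2} J_1 \dot{\theta}_1^2 + \frac{1}{2} m_1 v_1^2 + \frac{1}{2} m_2 v_2^2 + \frac{1}{2} J_2 \dot{\theta}_2^2 + \frac{1}{2} m_3 (\dot{x}_3^2 + \dot{y}_3^2) + \frac{1}{2} m_4 (\dot{x}_4^2 + \dot{y}_4^2) \tag{3.12}$$

$$V = -m_1 g L_1 \cos(\theta_1) - m_2 g L_2 \cos(\theta_1) + m_3 g y_3 + m_4 g y_4 \tag{3.13}$$

The Lagrangian equation (3.1) is evaluated with a symbolic solver. As there are two generalized coordinates, two equations are obtained. After some algebraic manipulations the following form is found (with $A$ a 2x2 matrix that is a function of the state variables):

$$\begin{bmatrix} \ddot{\theta}_1 \\ \ddot{\theta}_2 \end{bmatrix} = A(\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2)^{-1} \begin{bmatrix} f_1(\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2) - T_{f1} \\ f_2(\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2) - T_{f2} + T \end{bmatrix} \tag{3.14}$$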


This form is suited for state space modeling. $T$ represents the direct torque input given by the actuator, while $T_{f1}$ and $T_{f2}$ are the friction torques.

One of the main subjects of the master’s dissertation is to identify the friction in the joints, therefore an overview of classical friction models is given in the next section.

3.3 Friction Modeling

Friction is a phenomenon that is encountered in almost all mechanical systems, from the simplest machines to high-tech applications. People were interested in this phenomenon from early on, which led to the birth of tribology: the science of friction. Numerous friction models have been developed throughout the years, from very simple ones to models that start from a molecular level [37]. It is very hard to make a uniform model that generalizes well for different situations, because friction is a nonlinear and dynamic phenomenon. There are models that only describe dry friction while others only describe hydrodynamic friction. In most practical cases friction is a result of both effects, as bearings are typically lubricated with grease or oil. In the following sections an overview of the most common friction models is given. Each model has certain advantages but also drawbacks compared to others. In most cases the influence of friction on the system cannot be neglected: a good friction model is essential to control a mechatronic system that requires high accuracy (e.g. in order to design a well functioning compensator [38]). The acrobot is a typical example where a good friction model is important, as friction has a large influence on the way the links rotate. The approach followed in this master’s dissertation can be generalized to other (similar) mechatronic systems. Most parts of the acrobot are easy to measure and weigh, and with this information a dynamic state space model was constructed based on the Euler-Lagrange equation (section 3.2). In contrast, the manufacturers of the bearings and servomotor do not provide clear friction models or data. The goal is to find out if it is possible to develop an adequate friction model that improves the global acrobot model significantly.

3.3.1 Static Friction Models

Static friction models consist of a (nonlinear) mapping between the joint’s velocity and the resulting friction torque. To find this mapping in practice, the friction force is measured at different steady state speeds and afterwards plotted on a graph. Both [37] and [39] dedicate a section to describe static friction models. The discussion here is based on those papers and includes the classical static friction models. The models in the papers are often described for linear motion, but they can be generalized to rotating motion. A main component of dry friction is the Coulomb effect. The theory of Coulomb states that friction opposes relative motion. The magnitude of the force is independent of speed. In this manner, the friction is modeled as an ideal relay. The friction force depends on the normal force $F_n$, a friction factor $\mu$ and the sign of the velocity $v$:

$$F = \mu F_n\,\mathrm{sgn}(v) \tag{3.15}$$

In fully lubricated regime viscous friction is encountered, meaning that the motion resisting force is proportional to the velocity (here c stands for this constant factor).

F = cv (3.16)

Viscous friction originates from the shear stress between fluid layers. In some control applications friction is represented by only viscous friction in order to keep the control and observers linear. However, for most applications this is an oversimplification. The last commonly included element in static friction models is the Stribeck effect, again a phenomenon more associated with hydrodynamic lubrication. Its properties were studied for the first time at the beginning of the 20th century by German engineer Richard Stribeck. Stribeck distinguished three phases.

1. Boundary lubrication: direct contact between the two surfaces at standstill/very low speeds. Friction is determined by surface roughness.

2. Mixed lubrication: the load is supported by both surface asperities and the fluid.

3. Hydrodynamic lubrication: the load is completely supported by a fluid layer.

The friction is highest in the boundary lubrication phase; [37] refers to this as ’stiction’. The friction then decreases in a nonlinear fashion during the mixed lubrication phase. Reference [40] states that the decrease is due to an initial buildup of hydrodynamic pressure. Furthermore, the author writes that the same behavior holds true for dry friction. Finally, during the hydrodynamic lubrication phase, the friction starts to increase again due to the viscous friction effect. To arrive at a good static friction model, [37] combines all the elements mentioned before and summarizes this in an equation for the friction force:

$$F = \begin{cases} g(v), & \text{if } v \neq 0 \\ F_e, & \text{if } v = 0 \text{ and } |F_e| < F_s \\ F_s\,\mathrm{sgn}(F_e), & \text{otherwise} \end{cases} \tag{3.17}$$

Here $F_e$ represents the external force while $F_s$ stands for the stiction force. The equation above shows that at standstill the friction cannot be explained as a function of velocity alone. The function $g(v)$ includes Coulomb, viscous and Stribeck friction. A typical (idealized) example of such a friction function with the three elements included is shown in Fig. 3.3. Sometimes this curve is simply referred to as the Stribeck curve.
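The piecewise model (3.17) can be sketched as below. The text does not specify the form of $g(v)$, so the exponential Stribeck term used here is an assumption (a commonly used parametrisation), and all parameter values are hypothetical.

```python
import math

def sign(x):
    return (x > 0) - (x < 0)

def g(v, F_c=0.5, F_s=0.8, v_s=0.1, c=0.2):
    """Velocity dependent friction: Coulomb + Stribeck + viscous (assumed form)."""
    stribeck = (F_s - F_c) * math.exp(-(v / v_s) ** 2)   # decays from F_s towards F_c
    return (F_c + stribeck) * sign(v) + c * v

def static_friction(v, F_e, F_s=0.8, **kwargs):
    """Piecewise static friction model following (3.17)."""
    if v != 0:
        return g(v, F_s=F_s, **kwargs)
    if abs(F_e) < F_s:
        return F_e               # standstill: friction balances the external force
    return F_s * sign(F_e)       # breakaway: friction saturates at the stiction force

# Sample the curve at a low, an intermediate and a high velocity
low, mid, high = g(0.05), g(0.3), g(3.0)
```

With these hypothetical parameters the sampled values show the Stribeck shape: friction first drops with increasing velocity and then rises again because of the viscous term.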


Fig. 3.3: Idealized version of a Stribeck curve.

The parameters of the curve differ depending on geometries, materials, lubricants, temperature, etc. If one does not know the external force $F_e$ exactly, then the function $g(v)$ on its own is used. Reference [39] states that the usage of a step function at the zero velocity crossing is bad for numerical stability when a state observer is used. The paper suggests replacing the step function with smoother functions, such as the hyperbolic tangent. In practical applications (for instance a DC motor [41]), the static friction curve is obtained by letting the rotor rotate at a steady state velocity with no load. The friction can then be measured by scaling the stator current that is needed to get to that speed. A similar approach can be followed for bearings, but then an external motor is needed of which the friction torque has to be taken into account as well. The steady state method for the mounted acrobot is not so straightforward, as the two links influence each other to a great extent (see (3.14)). As the system is underactuated it would be a whole problem on its own to make the first link rotate at a constant speed. The static friction curve of the motor could be measured before assembling the acrobot, but after a long period of usage the curve can already look different due to wear and other environmental conditions.
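The smoothing of the zero velocity crossing suggested in [39] can be sketched for the Coulomb term. The parameter `v_eps`, which sets the width of the transition, and the friction level `F_c` are hypothetical values introduced for this sketch.

```python
import math

def coulomb_relay(v, F_c=0.5):
    """Ideal relay: discontinuous jump at v = 0."""
    return F_c * ((v > 0) - (v < 0))

def coulomb_smooth(v, F_c=0.5, v_eps=0.01):
    """Smooth approximation: tanh replaces the sign function."""
    return F_c * math.tanh(v / v_eps)

tiny_velocity = 1e-9
relay_value = coulomb_relay(tiny_velocity)     # already at the full friction level
smooth_value = coulomb_smooth(tiny_velocity)   # still close to zero
```

Far from zero velocity both functions agree; near zero the smooth version is continuous and differentiable, which is the property that matters for a state observer or for gradient based training.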

3.3.2 Dynamic Friction Models

Although static models are commonly used to describe friction, they ignore some interesting dynamic aspects. The most important one is frictional hysteresis, also called frictional lag. In [40] it is explained that in the case of lubricated friction, this lag is caused by the time it takes to modify the thickness of the lubricant film. A similar effect is encountered in dry friction. The conclusion is that the friction force is not able to follow the change in velocity immediately. Combining this observation and the shape of the Stribeck curve, it follows that the friction force when accelerating will be higher than when decelerating. This is thoroughly investigated in [42] (for dry friction), where Fig. 3.4 originates from. The friction force hysteresis curves are obtained by giving a sinusoidal velocity profile with fixed offset as input.

Fig. 3.4: Hysteretic behavior of friction, with velocity on the x-axis and friction force on the y-axis. The size of the hysteresis loop increases with excitation frequency.

It is observed that the shape of the hysteresis loop depends on the excitation frequency as well: the higher the frequency, the larger the loop. Reference [39] describes the hysteresis loops in a similar fashion. The modeling approach there is to put an adaptive first order lowpass filter in series with the static friction model. In this way the hysteretic effect is simulated. The filter is adaptive to emulate the effect of the frequency on the hysteresis loop shape. A large number of dynamic friction models exist; the best known are the Dahl and the LuGre model. For this dissertation the technical details of these models are of less importance; the focus is more on the qualitative dynamic behavior. An explanation of such models can be found in [37] and [40].
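The idea of a lowpass filter in series with a static friction curve can be sketched numerically. Several simplifications are assumed here: the filter has a fixed time constant `tau` instead of being adaptive, the static curve `g` uses an assumed exponential Stribeck term, the filter is discretised with a simple Euler step, and all parameter values are hypothetical.

```python
import math

def g(v, F_c=0.5, F_s=0.8, v_s=0.1, c=0.2):
    """Assumed static friction curve: Coulomb + Stribeck + viscous."""
    sign = (v > 0) - (v < 0)
    return (F_c + (F_s - F_c) * math.exp(-(v / v_s) ** 2)) * sign + c * v

def simulate_loop(freq, tau=0.02, v_offset=0.1, v_amp=0.08, dt=1e-4, n_periods=3):
    """Return (velocity, filtered friction) samples over the last excitation period."""
    steps_per_period = round(1.0 / (freq * dt))
    F = g(v_offset)
    loop = []
    for step in range(n_periods * steps_per_period):
        v = v_offset + v_amp * math.sin(2.0 * math.pi * freq * step * dt)
        F += dt / tau * (g(v) - F)          # first order lowpass: tau * dF/dt = g(v) - F
        if step >= (n_periods - 1) * steps_per_period:
            loop.append((v, F))             # keep only the last period (transient has died out)
    return loop

def loop_area(loop):
    """Area enclosed by the hysteresis loop (shoelace formula)."""
    area = 0.0
    for (v0, F0), (v1, F1) in zip(loop, loop[1:] + loop[:1]):
        area += v0 * F1 - v1 * F0
    return abs(area) / 2.0

slow_loop = simulate_loop(freq=0.5)   # excitation frequency in Hz
fast_loop = simulate_loop(freq=2.0)
```

In this sketch the velocity stays on the falling part of the Stribeck curve, so the lagging friction force is higher while accelerating than while decelerating, and the enclosed loop area grows with the excitation frequency as long as the period remains long compared to `tau`.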


4 Acrobot Lab Setup

4.1 Introduction

One of the objectives of this master’s dissertation is to construct an acrobot lab setup. There are a lot of papers describing the acrobot swing up and balancing control problem, but almost none of them describe the practical construction of the acrobot. The setup must be suited for further research and educational purposes. Another important aspect is that a camera will be used instead of conventional angle sensors. It is investigated if this could be a worthy alternative for systems where it is not possible to mount encoders on the shafts to measure the angles and velocities. All of this imposes certain conditions on the design of the setup.

• The acrobot must be large enough for demonstrations.

• The design must be modular in case of future adaptations.

• The setup must be straightforward to use.

• The angles of the links have to be visible at all times.

The next sections describe the steps that were taken in the design process in a more narrative way. Technical details, datasheets and drawings can be found in appendix A.


Fig. 4.1: The finished acrobot lab setup.

4.2 Mechanical Design

4.2.1 General Considerations

There are two options to make the first link able to rotate freely. The first alternative is pressing a bearing into the link and sliding this assembly over a fixed shaft. This has the benefit that the inertia of link one remains small (as the shaft does not rotate). A small inertia is needed to be able to swing up link one. However, the mass of the design is too high to rely on a single bearing for support. Therefore the second option is chosen: the link is clamped on a shaft and the shaft in turn is supported by bearings. As the front of the setup has to be visible at all times, the bearings have to be mounted at the back of the mechanism. This implies that the shaft and the bearings should be firm enough to support the construction from one side. For this reason a pair of large pillow block bearings is chosen. This type of bearing is easily mountable on a solid plate and does not require a lot of maintenance, only an oil refill once in a while.

For the swing up and balancing problem it is required that the second link has a large enough moment of inertia, while the moment of inertia of the first link cannot be excessive. The second link must be able to generate enough force to get the first one moving. The second condition is rather difficult to deal with, as the motor has to be attached to the first link. On one hand, a powerful servomotor is needed to rotate the second link with enough torque. On the other hand, more powerful servomotors tend to be heavier, which increases the moment of inertia of the first link again.

4.2.2 Torque versus Weight Trade-off Motor

There are two typical ways to attach the motor to the first link:

• Attach the motor directly to the end of the first link.

• Attach the motor to the beginning of the first link and connect it by means of a belt to a pulley at the end of the first link.

There are some methods to work around the torque versus weight trade-off. If a pulley system is chosen, the diameter of the pulley can be increased to get a higher torque ratio. In this way a smaller motor can be selected. However, a large pulley diameter increases the mass and moment of inertia of the first link again. Thus one has to carefully iterate. If the motor is attached directly, there is a torque increasing option as well: a gear reduction step can be attached to the motor. This increases the mass as well, but compared to the pulley option the increase in torque is much higher (in the final design a ratio of 18:1 is used). Another disadvantage of the pulley system is that the control response is slower because of slip in the belt. Belts must be tightened at regular intervals as well, so an extra mechanism for this would be needed too. Considering all previous arguments, the option of the reduction gear was chosen. The motor is attached directly to the end of the first link (Fig. 4.1). The type of servomotor chosen is discussed in section 4.3.

4.2.3 Design of the Links

Because the motor is attached to the end of the first link, it is necessary to limit the length of the link (the inertia contribution of a mass is proportional to the square of its distance from the rotation axis). A length of 0.2 m was chosen for both links. The rather short second link conflicts with the inertia condition stated before, but the bigger the acrobot, the larger the distance required between the camera and the setup to capture the angles. To be able to add extra moment of inertia, a hole was drilled into the end of the second link so that a bolt and nut can be mounted. In this way, extra mass can be added very easily while experimenting. It should be noted that the acrobot is designed in such a way that all parts are easy to mount and replace for future experiments; the design is very modular. The links are made of aluminum, as this is a light and easy-to-handle material.
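The distance-squared argument can be made concrete with a rough sketch. It idealizes each link as a uniform slender rod rotating about one end and treats the motor and the bolt as point masses at the link tip; these idealizations, all mass values and the function names are assumptions for illustration. Only the 0.2 m link length is taken from the text.

```python
def rod_inertia_about_end(mass_kg, length_m):
    """Moment of inertia of a uniform slender rod about an axis through one end."""
    return mass_kg * length_m ** 2 / 3.0


def point_mass_inertia(mass_kg, distance_m):
    """Moment of inertia of a point mass at a given distance from the axis."""
    return mass_kg * distance_m ** 2


LINK_LENGTH_M = 0.2    # link length, from the text
LINK_MASS_KG = 0.15    # hypothetical mass of one aluminum link
MOTOR_MASS_KG = 0.30   # hypothetical mass of the motor with gearbox
BOLT_MASS_KG = 0.05    # hypothetical mass of the added bolt and nut

# First link: the motor at the tip adds to the inertia about the main shaft.
link1_bare = rod_inertia_about_end(LINK_MASS_KG, LINK_LENGTH_M)
link1_with_motor = link1_bare + point_mass_inertia(MOTOR_MASS_KG, LINK_LENGTH_M)

# Second link: a bolt and nut at the tip add inertia about the motor axis.
link2_bare = rod_inertia_about_end(LINK_MASS_KG, LINK_LENGTH_M)
link2_with_bolt = link2_bare + point_mass_inertia(BOLT_MASS_KG, LINK_LENGTH_M)

print(f"Link 1 bare:       {link1_bare:.4f} kg*m^2")
print(f"Link 1 with motor: {link1_with_motor:.4f} kg*m^2")
print(f"Link 2 bare:       {link2_bare:.4f} kg*m^2")
print(f"Link 2 with bolt:  {link2_with_bolt:.4f} kg*m^2")
```

With these made-up numbers a small mass at the tip contributes as much inertia as the whole link, which is consistent with both the need to keep the first link short and the choice to tune the second link with a bolt at its end.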

4.2.4 Slip Rings

The servomotor is attached to the first link and thus rotates with it. This is a very important problem when designing robot arms: one must ensure that the electrical cables will not get tangled or even come loose. This problem can be solved by using a slip ring. A slip ring is a system that transmits electrical signals from a rotating mechanism (the acrobot) to a stationary part (the electric drive and control in this case). The working principle of the slip ring can be compared to that of a simple bearing. The rotor is clamped to the shaft and rotates with it, while the stator remains fixed and is connected to an external frame. In this way the wires do not get tangled (Fig. 4.2, obtained from [43]). The inner diameter of the slip ring is larger than the diameter of the shaft, so an additional bushing was clamped onto the shaft. To connect the motor wires to the slip ring wires, Molex connectors were used. The wires must be connected while the acrobot is already assembled, so this clip system is easier than soldering. A point of attention is the AWG compatibility of the motor wires, slip ring wires and Molex connectors, especially for the very thin Hall sensor wires. AWG stands for American Wire Gauge and is a measure of wire thickness. With the slip ring in place the system is free to rotate 360 degrees. A white painted plate is attached to the slip ring, intended to eliminate shadows and background noise for the data acquisition of the camera.

Fig. 4.2: Working principle of slip rings.
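The AWG compatibility check mentioned above can be sketched in a few lines. The diameter formula is the standard AWG definition (gauge 36 corresponds to 0.127 mm, and a higher gauge number means a thinner wire). The gauge values and the connector range in the example are hypothetical and do not describe the parts actually used in the setup.

```python
import math


def awg_diameter_mm(gauge):
    """Conductor diameter in mm for a given AWG number (standard AWG formula)."""
    return 0.127 * 92 ** ((36 - gauge) / 39)


def awg_area_mm2(gauge):
    """Conductor cross-sectional area in mm^2 for a given AWG number."""
    diameter = awg_diameter_mm(gauge)
    return math.pi * diameter ** 2 / 4


def fits_connector(gauge, thickest_gauge, thinnest_gauge):
    """Check whether a wire gauge lies in the range a connector terminal accepts.

    A higher AWG number is a thinner wire, so the accepted range runs from
    thickest_gauge (smallest number) to thinnest_gauge (largest number).
    """
    return thickest_gauge <= gauge <= thinnest_gauge


# Hypothetical example: a terminal rated for AWG 22 to AWG 28.
for name, gauge in [("motor phase wire", 22), ("Hall sensor wire", 30)]:
    ok = fits_connector(gauge, thickest_gauge=22, thinnest_gauge=28)
    print(f"{name}: AWG {gauge}, {awg_diameter_mm(gauge):.3f} mm, fits: {ok}")
```

In this made-up case the thin Hall sensor wire falls outside the terminal range, which is the kind of mismatch the text warns about.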

4.3 Electrical Design

An important design choice is the type of servo actuator. It should satisfy specific constraints:

• Lightweight
