Learning Multi-Agent Control with OROCOS

University of Twente

EEMCS / Electrical Engineering

Control Engineering

Iker Rezola

MSc report

Supervisors

Prof.dr.ir. A. van Amerongen

dr.ir. T.J.A. de Vries

MSc P.B. Dao

January 2009

Report nr. 001CE2009

Control Engineering

EE-Math-CS

University of Twente

P.O. Box 217

7500 AE Enschede

The Netherlands

Summary

The title of this Master's thesis is Learning Multi-Agent Control with Orocos. The title names the three topics the thesis deals with, and the work has been organized around these three groups.

1 Learning: In recent years, new control techniques have been developed around the world. The traditional feedback controller is still one of the best known. However, feedback controllers have some shortcomings, and to overcome some of them a learning feedforward controller (LFFC) has been used. Feedforward controllers have become common in recent years, and the research done on this type of controller is leading to better ways of control. Instead of a "simple" feedforward controller, a learning feedforward controller has been used in this project. This way of controlling is being analyzed and developed at several universities; its purpose is to improve the results while the system keeps on working.

2 Multi-Agent Systems: MAS are also being analyzed and developed. Dividing a complex problem into smaller ones to achieve the main goal is something that has been done since the beginning of the human race. In recent years this way of working has proven really useful in computer science, where it has led to very efficient algorithms such as quicksort and the FFT. Algorithms that work this way are called divide-and-conquer algorithms.

During the thesis, different agents have been defined to obtain the desired behaviour of the plant.

3 OROCOS: Different software applications could be used to implement the LFFC and the Multi-Agent System. The chosen one is Orocos, a free, open-source software project that is being developed in a cooperation between different universities. Orocos is a powerful project that keeps developing towards an open control engineering software.

In this report, these three subjects are explained in more detail and the experiences obtained while working with them are documented.

At the end of the report, it will be seen that the three groups can work together, leading to a good behavior of the plant.

List of Figures

3.1 Orocos Logo

3.2 Code::Blocks environment settings

3.3 Code::Blocks project tree

3.4 Port Bifurcation

3.5 Non port Bifurcation

4.1 DemoLin photo

4.2 Model of DemoLin

5.1 Parallel PID block diagram

5.2 Continuous block diagram

5.3 Comparison among different analog filters

5.4 Frequency response of the controller

5.5 Open loop system frequency response

5.6 Applied reference to the system

5.7 Applied acceleration setpoint to the system

5.8 Discrete block diagram

5.9 Continuous root locus diagram of the open loop system, including: controller, filter, actuator and plant

5.10 Connection Map of the feedback controller

5.11 Setpoint and output comparison in the simulated system

5.12 Error of the simulated system

5.13 Control Signal of the simulated system

5.14 RMSD between simulated outputs

6.1 Time Index block diagram

6.2 A linear process with a non-linearity

6.3 State Index block diagram

6.4 1st order B-spline

6.5 2nd order B-spline

6.6 3rd order B-spline

6.7 Mapping example

6.8 Feed-Forward connection map

6.9 LFFC control diagram

6.10 Error of the system with different amount of data to fit

6.11 Error of the system with different order B-splines

6.12 Error of the system with different number of used B-splines

6.13 Error of the system using a feedback controller. Set point is periodic every 6s. Mass is changing.

6.14 Error of the system using a LFFC controller. Set point is periodic every 6s. Mass is changing.

6.15 Feedback and feed-forward control signals comparison using a LFFC and a periodic setpoint of 6s. Mass is changing.

A.1 Schematic diagram of DemoLin

B.1 PID block diagram

B.2 PD controller frequency response

B.3 Frequency response of the open-loop system, having Kc·G(s) · H(s)

B.4 Frequency response of the plant with the PD controller in open-loop

1 Introduction

1.1 Background

The author of this thesis arrived in The Netherlands on 1 September 2008. The MSc project that has been carried out is a link between previous research projects that were developed at the University of Twente. These research projects point to a number of different techniques or methods that are applicable in advanced control of mechatronic systems. Different projects have been done in recent years at the University of Twente but, for this thesis, the focus has been on two groups:

1 Learning Feedforward Control: LFFC controllers have been developed at the University of Twente since 2000 (22). Different PhD theses have concluded (6, 5) that this type of controller is a good way of controlling mechatronic systems.

2 Multi-Agent Control Systems: MAS is also a relatively new concept in control engineering (2001) (21). Since van Breemen wrote his thesis, several master theses and individual assignments have continued developing the topic of Multi-Agent Control Systems (9, 10, 3).

Nevertheless, no projects have been done before at the University of Twente using OROCOS as a control framework.

1.2 Problem definition

The first step towards solving a problem is defining it. This report tries to explain how to solve the following problem:

Integrate LFFC, Multi-Agent methods and a modern implementation framework (Orocos) so as to obtain a suitable framework for advanced control of mechatronic systems.

Multi-Agent Control Systems have to become Tasks in Orocos, and LFFC has to become a pattern for incorporation in a Multi-Agent Control System, with the specific property that learning is done asynchronously, in a separate non-realtime Task.

1.3 Objective

The goal of the project is to evaluate the feasibility of the proposed integration. This is to be done by designing and implementing a simple Multi-Agent Control System with path generation and PD/PID feedback control in OROCOS and subsequently adding a Learning Feedforward Component that can learn on-line or off-line.

1.4 Outline Thesis

A brief overview of the chapters of this MSc project:

Chapter 2 - Agents and Multi-Agent Systems: Explains the definition of an agent and the strong points of using a Multi-Agent Control System.

Chapter 3 - OROCOS: Shows the interesting features of this open source project, how to program the different components and how to link them. Different parts of the code are shown to explain how to work with Orocos.

Chapter 4 - DemoLin: The basic plant that will be considered during evaluation.

Chapter 5 - Feedback Controller: Reviews the behavior of a well known feedback controller, which has been tuned and implemented in Orocos.

Chapter 6 - Learning Feedforward Controller: Explains how the LFFC works and how it has been implemented in Orocos. In addition, the obtained results are shown.

Chapter 7 - Conclusions: Discusses whether Orocos is reliable enough as a framework.

Chapter 8 - Future works: Shows the different paths that future developers can take to continue this project.

2 Agents and Multi-Agent Systems

2.1 Introduction

The strategy of solving complex control problems by decomposing them into partial control problems has a long history. The idea of using a sorted list of items to facilitate searching dates back as far as Babylonia in 200 B.C., while a clear description of the algorithm on computers appeared in 1946 in an article by John Mauchly (13). This strategy is called the divide-and-conquer approach (14). The approach basically consists of three steps:

1 Decomposing the overall control problem into a complete set of well defined partial control problems.

2 Solving the partial control problems.

3 Integrating the partial solutions into an overall solution.

2.2 Definition

For solving the partial control problems, agents are created. An agent can be defined as an entity which can solve a partial (control) problem. The combination of several agents creates a multi-agent system. Solving many partial problems (using agents) and integrating the solutions to solve a more complex problem is what defines a multi-agent system. A multi-agent system is not only responsible for integrating all the partial solutions, it is also responsible for resolving the conflicts between agents, such as dependencies and coordination.

Although there is no official definition of what an agent is, there are some minimal features that an entity must have to be considered an agent:

• Autonomous: An agent always has control over its own actions. If the right conditions are satisfied, the agent decides to become active. The main code is responsible for describing the agents/components when the conditions are satisfied. Each component works in an autonomous way.

• Social ability: An agent should be able to cooperate with other agents in order to achieve its own objective, or to support other agents. That is why coordination among different agents is important. Data ports are going to be used for this purpose.

• Responsiveness: An agent perceives the environment and reacts on changes that occur in the environment. Sensor and actuator components will be in charge of perceiving and reacting in the plant.

• Goal-directed behavior: An agent does not act simply on the changes that occur in the environment. It has a goal-directed behavior and takes initiative where it is appropriate.

The generator component describes the goal that the plant component has to achieve.

The controllers (feedback and learning feedforward) are the ones that try to obtain a minimum error between the setpoint and the output of the system.

Every component has its own functionality, defined by the standard methods and attributes.

The standard methods of a component are pieces of program code. These methods give the component the proper functionality. There are three types of attributes available:

• Inputs/Outputs: The inputs can only be read by the methods of a component and are written by the methods of other components whose outputs are connected to these inputs.

• Parameters: These variables are set when the component is specified and can only be read from the methods of the component. Components cannot read the parameters of other components.

• State variables: The state variables can only be read and written by the methods of the component.

All these attributes will be explained in chapters 5 and 6, showing how every single element of a control loop can be described as an agent.
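
As a rough illustration of the three attribute types, the following minimal C++ sketch models an agent without any Orocos dependency. The class name GainAgent and all of its members are hypothetical and chosen only for this example; they are not part of Orocos or of the thesis code.

```cpp
// Hypothetical stand-in for a control agent; not an Orocos class.
class GainAgent {
public:
    // Parameter: set when the agent is specified, read-only afterwards.
    explicit GainAgent(double gain)
        : gain_(gain), input_(0.0), output_(0.0), updates_(0) {}

    // Input: written by a connected agent, only read by this agent's methods.
    void writeInput(double value) { input_ = value; }

    // Output: written by this agent, read by connected agents.
    double readOutput() const { return output_; }

    // Standard method: the behaviour of the agent.
    void update() {
        output_ = gain_ * input_;
        ++updates_;  // state variable, only touched by the agent's own methods
    }

    int updateCount() const { return updates_; }

private:
    const double gain_;  // parameter
    double input_;       // input
    double output_;      // output
    int updates_;        // state variable
};
```

In this sketch another agent can only reach the data through writeInput and readOutput, while the parameter and the state variable stay private to the agent.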

2.3 Advantages and disadvantages of using a Multi-Agent System

In 2000, Stone and Veloso published an article (18) explaining several good reasons to use a Multi-Agent System:

• Distributed problem: A MAS (Multi-Agent System) is suitable to solve problems that are distributed in nature.

• Robustness: A MAS that has redundant agents might tolerate failures in one or several of the agents, and is thus more robust than a centralized system.

• Scalability: Because agents are modular, it should be easy to add and remove agents from the MAS.

• Simpler implementation: Because agents are modular, implementing a MAS should be easier than implementing one overall centralized system.

• Parallelism: To reduce the computation time needed for solving a problem, some parts could be executed in parallel. Each part could be represented as an agent.

This chapter has explained some benefits that a multi-agent system can have. Even so, the use of multi-agent control systems is not common in control engineering. The reasons for this could be:

• The field of multi-agent systems is relatively new. Research on MASs stretches back just over 20 years.

• Control theory has a strong mathematical foundation, whereas the field of multi-agent systems is mainly focused on abstract descriptions of systems. This makes merging the two fields more difficult.

3 Orocos

Orocos is the cornerstone of the thesis. According to the project's webpage (4): "Orocos" is the acronym of the Open Robot Control Software project. The project's aim is to develop a general-purpose, free software, and modular framework for robot and machine control. The Orocos project supports 4 C++ libraries: the Real-Time Toolkit, the Kinematics and Dynamics Library, the Bayesian Filtering Library and the Orocos Component Library.

FIGURE 3.1 - Orocos Logo.

3.2 Configuring Orocos

The first step for using Orocos is to install it. This has been done in accordance with the installation manual, which can be found on the Orocos webpage (4).

Code::Blocks (20) has been chosen as the IDE used during development.

For the proper configuration of Code::Blocks, the following steps are recommended.

1 Install Code::Blocks. This can be done by downloading the software from the Code::Blocks webpage or by using Synaptic.

2 Open Code::Blocks.

3 Go to Settings->Environment...->Environment variables.

4 Create a new set called e.g. ’orocos’.

5 Add a variable:

Key: PKG_CONFIG_PATH

Value: $PKG_CONFIG_PATH:/usr/local/lib/pkgconfig

6 Ok.

7 Open the project. At this moment, two types of projects are being used: on the one hand the components projects, and on the other hand the link project.

The components projects contain all the data of the simulated components (feed-forward controller, feedback controller, plant, actuator, sensor and path generator). Each component is divided into two files, a header file and a source file. The header file contains all the parameters and variables that are used in the source file. The source file contains the behaviour of the component, the methods that are going to be used and the difference equation of the component.

The link project needs the source code of all the components and defines the configuration of the system, for example which components are linked to each other or the data flow between the different ports.

The project tree should look like figure 3.3.

FIGURE 3.2 - Code::Blocks environment settings

FIGURE 3.3 - Code::Blocks project tree

8 Open Project->Properties.

9 Open the tab EnvVars options.

10 Tick Select environment variables set to be applied, select the one which was created in step 4.

11 Ok.

12 Open Project->Build options.

13 Select the top level (project, not a target).

14 Select the compiler settings tab, then the Other options tab.

15 Enter the next line (including the back-ticks):

`pkg-config --errors-to-stdout orocos-ocl-gnulinux orocos-rtt-gnulinux --cflags`

16 Select the Linker settings tab. Then, Other linker options. Enter the next line:

`pkg-config orocos-ocl-gnulinux orocos-rtt-gnulinux --libs`

This step and the previous one tell the compiler and the linker where to look for the Orocos libraries. The same way of linking the project is used later, when the B-splines are applied.

17 Ok.

18 Save the project and exit Code::Blocks.

19 Restart Code::Blocks, open the project and compile.

3.3 Components

3.3.1 Definition

Components are among the most important elements of Orocos. The Orocos webpage defines them as follows:

Each control component is defined as a "TaskContext", which defines the environment or "context" in which an application specific task is executed. The context is described by the five Orocos primitives: Event, Property, Command, Method and Data Port. This document defines how a user can write his own task context and how it can be used in an application.

A component is a basic unit of functionality which executes one or more (real-time) programs in a single thread. The program can vary from a simple C function over a real-time program script to a real-time hierarchical state machine. The focus is completely on thread-safe time determinism. Meaning, that the system is free of priority-inversions, and all operations are lock-free (also data sharing and other forms of communication such as events and commands).

Real-time components can communicate with non real-time components (and vice versa) transparently.

The Orocos Component Model enables:

• Lock free, thread-safe, inter-thread function calls.

• Communication between hard Real-Time and non Real-Time threads.

• Deterministic execution time during communication for the higher priority thread.

• Synchronous and asynchronous communication between threads.

• Interfaces for component distribution.

• C++ class implementations for all the above.

3.3.2 Programming components

The components have been programmed in a methodical way, so that creating new components becomes an easy task. Each component is composed of two source files and one header file.

The first source file is main.cpp, which is just a basic C++ "Hello world!". Every C/C++ program needs one main function. If every component kept its own main, there would be many main functions when all the files are linked, and the software could not be built. Having a simple main function in a separate file allows each component to be compiled individually. The second source file is responsible for describing the behaviour of the element that is going to be simulated. After all the components are created, all the second source files work together and the first source files are ignored.

The header file is the part of the program where the libraries, variables, ports, methods and classes that are used in the source file are defined.

To explain how to work with components, an example is given and commented section-wise.

3.3.3 Class

The user-defined data type, or class, is what distinguishes C++ from traditional procedural languages. A class is a new data type that you or someone else creates to solve a particular kind of problem. Once a class is created, anyone can use it without knowing the specifics of how it works, or even how classes are built (8).

The PlantComponent is analyzed here because it is a good example: it has all the elements that are going to be used in the rest of the components.

The header file will contain the definition of the class and will look this way:

class PlantComponent : public RTT::TaskContext {

public:

/*! Constructor (default) */

PlantComponent(std::string name);

/*! Default destructor. */

~PlantComponent();

virtual double getFact();

virtual double getXact();

Method<double(void)> getFactMethod;

Method<double(void)> getXactMethod;

private:

ReadDataPort<double> inpPortF; /*!< Input port Force */

WriteDataPort<double> outPortX; /*!< Output port Position */

double x; /*!< Variables */

...

bool startHook(); /*!< Start task situation execution hook */

void updateHook(); /*!< Periodic called hook */

void stopHook(); /*!< End task situation execution hook */

};

Once the header file has been created properly, the constructor of the class can be implemented in the source file this way.

PlantComponent::PlantComponent(std::string name)

: RTT::TaskContext(name),

getFactMethod("getFact", &PlantComponent::getFact, this),

getXactMethod("getXact", &PlantComponent::getXact, this),

inpPortF("F"), outPortX("X")

{

this->methods()->addMethod(&getFactMethod, "Get the driving force.");

this->methods()->addMethod(&getXactMethod, "Get the position.");

this->ports()->addPort(&inpPortF, "This port reads input force.");

this->ports()->addPort(&outPortX, "This port writes output position.");

}

As can be seen in the code, methods, ports and hooks have been declared. These elements are explained in more detail in the following sections.

Method

Methods are used for reading or writing the data handled by the ports. In the header file, the code looks like:

public:

virtual double getFact();

Method<double(void)> getFactMethod;

private:

double Fact;

The source code will be:

double PlantComponent::getFact() {

return Fact;

}

Data Ports

Data Ports are the variables that are going to be used to connect to the rest of the components.

Two different Data Ports will be used, the ones that will read data and the ones that will write data. These two different ports can be declared this way.

ReadDataPort<double> inpPortF; /*!< Input port Force */

WriteDataPort<double> outPortX; /*!< Output port Position */

Once they have been declared, using them to read or write data is easy. For example, if the data of the Data Port inpPortF is to be saved in the Fact variable:

inpPortF.data()->Get(Fact);

And if the data of the variable x is to be written in the Data Port outPortX, this can be done:

outPortX.data()->Set(x);

The connection between different components is explained in section 3.5.1.
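
To make the read/write semantics more concrete, the sketch below mimics a pair of connected ports in plain C++. It is only a stand-in under stated assumptions: DataSlot, WriterPort, ReaderPort and connect are hypothetical names, not the Orocos classes, and the thread safety offered by the real ports is not modelled.

```cpp
#include <memory>

// Hypothetical shared slot that keeps the most recent sample.
template <typename T>
class DataSlot {
public:
    void Set(const T& value) { value_ = value; }
    void Get(T& out) const { out = value_; }
private:
    T value_{};
};

// Hypothetical output port: writes into the slot it is connected to.
template <typename T>
struct WriterPort {
    std::shared_ptr<DataSlot<T>> slot;
    bool connected() const { return static_cast<bool>(slot); }
    void write(const T& value) { if (slot) slot->Set(value); }
};

// Hypothetical input port: reads the latest value from the shared slot.
template <typename T>
struct ReaderPort {
    std::shared_ptr<DataSlot<T>> slot;
    bool connected() const { return static_cast<bool>(slot); }
    void read(T& out) const { if (slot) slot->Get(out); }
};

// Connecting two ports makes them share one slot.
template <typename T>
void connect(WriterPort<T>& out, ReaderPort<T>& in) {
    out.slot = std::make_shared<DataSlot<T>>();
    in.slot = out.slot;
}
```

The point of the sketch is that a reader never talks to the writer directly: both sides only see the shared data object, which is why a component can check connected() before it starts.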

Hooks

The hooks are the functions that are called in the different states of a component. Three different hooks can be configured: startHook, updateHook and stopHook. They are declared in this way:

bool startHook(); /*!< Start task situation execution hook */

void updateHook(); /*!< Periodic called hook */

void stopHook(); /*!< End task situation execution hook */

The startHook is the part of the code that is executed once, when the main program gives the order to start execution. It is used to make sure that the ports are correctly connected and to initialize the variables that are going to be used. The plant component's startHook looks like:

bool PlantComponent::startHook() {

if ( ! inpPortF.connected() || ! outPortX.connected() ) {

Logger::log() << Logger::Error << "Not all ports were properly connected. Aborting." << Logger::endl;

if ( !inpPortF.connected() )

Logger::log() << inpPortF.getName() << " not connected."<<Logger::endl;

if ( !outPortX.connected() )

Logger::log() << outPortX.getName() << " not connected."<<Logger::endl;

return false;

}

samplePeriod=getPeriod();

return true;

}

The updateHook is the part of the code that is going to be running while the main program is running. The update can be done in a periodic or an aperiodic way. The plant component updateHook:

void PlantComponent::updateHook() {

inpPortF.data()->Get(Fact);

x=(samplePeriod*samplePeriod/(2.0*MOTOR_MASS))...;

x_prev2=x_prev;

x_prev=x;

Fin_prev2=Fin_prev;

Fin_prev=Fact;

outPortX.data()->Set(x);

}

This part of the code takes the input, calculates the position according to the difference equation of the plant (5.6.2), updates the states and writes the output.
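
The full difference equation is elided in the listing above. As an illustration only, the sketch below assumes that the plant is the pure mass 1/(m·s²) discretized with a zero-order hold, which gives x[k] = T²/(2m)·(F[k-1] + F[k-2]) + 2·x[k-1] - x[k-2]. This is an assumption and may differ from the equation actually used in the thesis; DoubleIntegrator and step are hypothetical names, and no Orocos ports are involved.

```cpp
// Discrete model of a pure mass (x'' = F/m), assuming a zero-order hold on F.
struct DoubleIntegrator {
    double mass;    // moving mass m [kg]
    double period;  // sample period T [s]
    double x_prev = 0.0, x_prev2 = 0.0;  // x[k-1], x[k-2]
    double f_prev = 0.0, f_prev2 = 0.0;  // F[k-1], F[k-2]

    DoubleIntegrator(double m, double T) : mass(m), period(T) {}

    // Takes the current force sample and returns the position x[k].
    double step(double force) {
        double x = (period * period / (2.0 * mass)) * (f_prev + f_prev2)
                 + 2.0 * x_prev - x_prev2;
        // Shift the stored states, in the same order as in updateHook.
        x_prev2 = x_prev;
        x_prev = x;
        f_prev2 = f_prev;
        f_prev = force;
        return x;
    }
};
```

With m = 1, T = 1 and a constant unit force, four calls to step return 0, 0.5, 2 and 4.5, which matches x = t²/2 at the sample instants.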

The stopHook is the part that is used for cleaning up the dynamic variables that have been used in the code.
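
The order in which the three hooks are called can be sketched in plain C++. The classes below are hypothetical stand-ins written for this illustration, not the RTT TaskContext; the sketch only assumes what the text describes: startHook runs once and can refuse to start, updateHook runs repeatedly, and stopHook runs once at the end.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal component with the three hooks.
class HookedComponent {
public:
    virtual ~HookedComponent() {}

    // Runs the life cycle: start once, update 'cycles' times, stop once.
    bool run(int cycles) {
        if (!startHook()) return false;  // e.g. a port is not connected
        for (int i = 0; i < cycles; ++i) updateHook();
        stopHook();
        return true;
    }

protected:
    virtual bool startHook() { return true; }
    virtual void updateHook() {}
    virtual void stopHook() {}
};

// Example component that records which hooks were called.
class TraceComponent : public HookedComponent {
public:
    explicit TraceComponent(bool connected) : connected_(connected) {}
    std::vector<std::string> trace;

protected:
    bool startHook() override { trace.push_back("start"); return connected_; }
    void updateHook() override { trace.push_back("update"); }
    void stopHook() override { trace.push_back("stop"); }

private:
    bool connected_;
};
```

When startHook returns false, as it does in the plant component when a port is not connected, neither updateHook nor stopHook is called in this sketch.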

3.4 Reporting data

One of the most important parts of the main code is the one that creates a reporter. Using this, the behaviour of every component can be analyzed individually. Analyzing the system in detail helps in choosing one controller or another. In addition, the reported data has been used to make sure that the software is working properly; for this purpose the obtained data has been compared with Matlab.

3.4.1 Configuration

Like other components, the reporting component can be configured using an XML-based specification file named Reporting.cpf. This file configures the way the data is obtained. The code used is:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE properties SYSTEM "cpf.dtd">

<properties>

<simple name="AutoTrigger" type="boolean">

<description>When set to 1, the data is taken upon each update(), otherwise, the data is only taken when the user invokes

’snapshot()’.</description><value>1</value></simple>

<simple name="Configuration" type="string">

<description>The name of the property file which lists what is to be reported.</description><value>config.cpf</value></simple>

<simple name="WriteHeader" type="boolean">

<description>Set to true to start each report with a header.

</description><value>1</value></simple>

<simple name="Decompose" type="boolean">

<description>Set to true to decompose data ports.

</description><value>1</value></simple>

<simple name="ReportFile" type="string">

<description>Location on disc to store the reports.

</description><value>logPorts.dat</value></simple>

</properties>

Three important points need to be discussed. The first one is AutoTrigger: when it is set, data is recorded in every update loop. The second one is the Configuration parameter: thanks to this line, the reporting component knows where to look for the configuration file. This file is used to configure which ports are going to be recorded. The third one is ReportFile, which defines where the recorded data is going to be stored.

As said before, the Configuration file indicates which data has to be recorded.

The config.cpf file that has been used is:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE properties SYSTEM "cpf.dtd">

<properties>

<simple name="Port" type="string"><value>controller.Ref</value></simple>

<simple name="Port" type="string"><value>plant.X</value></simple>

<simple name="Port" type="string"><value>controller.Ukfb</value></simple>

<simple name="Port" type="string"><value>learningQ1.Ukq1</value></simple>

</properties>

The Reporting.cpf and config.cpf are stored in the same directory as the main.cpp program.

3.5 Main program

The main program is the brain of the system. It configures the execution of all the components, such as the connections between different components or the execution speed.

3.5.1 Connect components

Section 3.3.3 showed how to create ports, read data and write data. Once that is done, the ports are connected using this template:

//8: plant->connectPorts(forceactuator);

tmpOut=forceactuator->ports()->getPort("F");

assert(tmpOut!=NULL);

tmpIn=plant->ports()->getPort("F");

assert(tmpIn!=NULL);

forceactuator->ports()->getPort("F")->connectTo(plant->ports()->getPort("F"));

Where forceactuator and plant would be components declared in this way:

PlantComponent *plant = NULL;

plant=new PlantComponent("plant");

When you use assert(), you give it an argument that is an expression you are "asserting to be true." In debug compilation mode the preprocessor generates code that will test the assertion.

If the assertion is not true, the program will stop after issuing an error message telling you what the assertion was and that it failed. (8).

Number 8 is the connection number of the "wire". A map of the connections has been drawn, which makes it easier not to get lost among the connections of all the ports.

One point requires care: a write data port cannot be used by two read data ports at the same time, so the scheme of figure 3.4 will give an error in the startHook state.

[Figure: the Output of Component 1 is connected to the Input of Component 2 and to the Input of Component 3.]

FIGURE 3.4 - Port Bifurcation.

One of the possible workarounds is the creation of two identical ports, that output the same data. This way there would be no bifurcation and the system will not give any error. Figure 3.5 shows the scheme. In this case, Output1 and Output2 have the same data.

[Figure: Component 1 has two outputs, Output 1 and Output 2, each connected to the Input of one of Component 2 and Component 3.]

FIGURE 3.5 - Non port Bifurcation.

3.5.2 (A)Periodic activities

One of the strong points of Orocos is that it is able to work with threads at different speeds in a periodic or an aperiodic way. This is really useful when different components work at different speeds (some components just have to add two values and others have to call many functions). The way of working with periodic activities is:

PeriodicActivity periodicPlantTask(OS::HighestPriority,period_time, plant->engine());

This way, different priorities or update speeds can be assigned to the components.

The configuration of aperiodic tasks will be explained in the LFFC chapter.

3.5.3 Reporting Start

Once the Reporting.cpf and config.cpf are correct, they can be called using:

FileReporting reporter ("Reporting");

The components that are going to be recorded have to be added as peers:

reporter.addPeer (plant);

...

And the file can be loaded and started:

reporter.load();

reporter.start();

3.5.4 Start components

Once the periodic activities have been defined, the components can be started using:

plant->start();

3.5.5 Stop components and cleaning up

Once the control task has finished, the program has to stop properly. At that point the memory has to be freed. This can be done as follows:

if (plant != NULL) {

plant->stop();

delete plant;

plant = NULL;

}

All the code can be found in appendix D.

4 DemoLin

The DemoLin is a setup with an ironless linear motor. On top of the motor another mass is attached: the end effector mass. Between the motor mass and the end effector mass a limited stiffness is applied in the form of two leaf springs. The aim of working with this system is to gain experience with OROCOS. The behaviour of the plant is simulated when the linear motor (LM) is controlled by a PD controller and a lowpass filter. The linear motor can be seen in figure 4.1.

FIGURE 4.1 - DemoLin photo

4.2 Mathematical model

Traditionally, the design of a controller is based on an explicit plant description in terms of a mathematical model. The mathematical model, which describes the dynamical behavior of the plant, is developed first, and then model-based control design techniques are applied to design appropriate controllers. The mathematical model of the system must be "simple enough" so that it can be analyzed with available mathematical techniques, and "accurate enough" to describe the important aspects of the relevant dynamical behavior (7, 15, 19). To obtain a linear model, it is viable to consider the mechanical subsystem as a moving mass and the actuator as a source of force (16). The model used, an approximation of the linear motor system, can be seen in figure 4.2.

[Figure: a mass m driven by a force f(t), with position x(t).]

FIGURE 4.2 - Model of DemoLin.

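Using Newton's second law and the Laplace transform, the transfer function of DemoLin can be derived, taking the initial conditions x(0) and ẋ(0) as zero:

$$
\begin{aligned}
\sum \vec{F} &= m \cdot \vec{a} \\
f(t) &= m \cdot \ddot{x}(t) \\
\mathcal{L}\{f(t)\} &= \mathcal{L}\{m \cdot \ddot{x}(t)\} \\
F(s) &= m \cdot \left(s^{2} \cdot X(s) - s \cdot x(0) - \dot{x}(0)\right) \\
F(s) &= m \cdot s^{2} \cdot X(s) \\
G(s) &= \frac{X(s)}{F(s)} = \frac{1}{m \cdot s^{2}}
\end{aligned}
$$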

The linear motor can also be modelled adding the non-linear term of the Coulomb friction.

However, this effect can be linearized to obtain a new transfer function. If a non-linear viscosity effect is added, a non-linear term like $F_c = d_c \cdot \tanh(1000 \cdot \dot{x})$ must be added. If this term is linearized, $F_c = d \cdot \dot{x}$ has to be added (5). If Coulomb friction is included in the system, the transfer function changes to:

$$
G(s) = \frac{1/m}{s^{2} + \frac{d}{m} \cdot s} = \frac{1/m}{s \cdot \left(s + \frac{d}{m}\right)}
$$
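
The step behind this transfer function can be written out. The following short derivation is a sketch that assumes zero initial conditions, as in the frictionless case, and that the linearized friction force $d \cdot \dot{x}(t)$ opposes the driving force:

$$
\begin{aligned}
f(t) - d \cdot \dot{x}(t) &= m \cdot \ddot{x}(t) \\
F(s) &= \left(m \cdot s^{2} + d \cdot s\right) \cdot X(s) \\
G(s) = \frac{X(s)}{F(s)} &= \frac{1}{m \cdot s^{2} + d \cdot s}
\end{aligned}
$$

Dividing numerator and denominator by $m$ gives the form with $1/m$ in the numerator.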

The DemoLin can also be modelled as a higher order system, such as a 4th or 6th order one (compared with the MeDe5), but this type of model would complicate the analysis that is done in the next chapters. As said before, the model must be "simple enough" so that it can be analyzed with available mathematical techniques, and "accurate enough" to describe the important aspects of the relevant dynamical behavior.

More information about DemoLin can be found in appendix B.

5 Feedback controller

5.1 PID controller

Many control algorithms have been developed over the years, but the most important one is the PID (1). PID controllers have been well known since 1942 (Ziegler-Nichols).

Their robustness has been proven for more than 60 years and nowadays they are the most widely used controllers in the world. Although many years have passed since they were invented, this type of controller keeps developing and improving (2), for example through auto-tuning techniques (12).

The parallel control law is:

\[
MV(t) = P_{out} + I_{out} + D_{out}
\]

where P_out, I_out and D_out are the contributions of the PID controller to the manipulated variable.

The control law can be written as:

\[
u(t) = K_p \cdot e(t) + K_i \cdot \int e(\tau) \cdot d\tau + K_d \cdot \frac{d e(t)}{d t}
\]

The block diagram can be seen in figure 5.1.

FIGURE 5.1 - Parallel PID block diagram.

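The parallel form can also be rewritten as:

\[
u(t) = K_p \cdot \left(e(t) + \frac{1}{T_i} \cdot \int e(\tau) \cdot d\tau + T_d \cdot \frac{d e(t)}{d t}\right)
\]

If the Laplace transform is applied to the controller, with e(t) = X_ref(t) − X_m(t):

\[
\begin{aligned}
u(t) &= K_p \cdot \left(\left(X_{ref}(t) - X_m(t)\right) + \frac{1}{T_i} \cdot \int \left(X_{ref}(\tau) - X_m(\tau)\right) \cdot d\tau + T_d \cdot \frac{d \left(X_{ref}(t) - X_m(t)\right)}{d t}\right) \\
\mathcal{L}\{u(t)\} &= \mathcal{L}\left\{K_p \cdot \left(\left(X_{ref}(t) - X_m(t)\right) + \frac{1}{T_i} \cdot \int \left(X_{ref}(\tau) - X_m(\tau)\right) \cdot d\tau + T_d \cdot \frac{d \left(X_{ref}(t) - X_m(t)\right)}{d t}\right)\right\} \\
U(s) &= K_p \cdot \left(\left(X_{ref}(s) - X_m(s)\right) + \frac{1}{T_i \cdot s} \cdot \left(X_{ref}(s) - X_m(s)\right) + T_d \cdot s \cdot \left(X_{ref}(s) - X_m(s)\right)\right)
\end{aligned}
\]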
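To make the control law above more concrete, the following is a minimal sketch of how it could be evaluated sample by sample. It is not the controller implemented in this project: the class name IdealPid, the backward-Euler integration and the backward-difference derivative are assumptions chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not thesis code): PID law
//   u = Kp * (e + (1/Ti) * integral(e) + Td * de/dt)
// evaluated once per sample period ts, using backward-Euler integration
// and a backward difference for the derivative.
class IdealPid {
public:
    IdealPid(double kp, double ti, double td, double ts)
        : kp_(kp), ti_(ti), td_(td), ts_(ts) {
        assert(ti > 0.0 && ts > 0.0);
    }

    // One controller step: xref is the setpoint, xm the measured position.
    double update(double xref, double xm) {
        const double e = xref - xm;                        // tracking error
        integral_ += e * ts_;                              // integral of e
        const double derivative = (e - prevError_) / ts_;  // de/dt estimate
        prevError_ = e;
        return kp_ * (e + integral_ / ti_ + td_ * derivative);
    }

private:
    double kp_, ti_, td_, ts_;
    double integral_ = 0.0;
    double prevError_ = 0.0;
};
```

With Kp = 2, Ti = 1 s, Td = 0 and ts = 0.1 s, a constant error of 1 gives u = 2.2 on the first sample and u = 2.4 on the second, because only the integral term grows.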

5.2 Block Diagram

If a continuous PID controller is going to be simulated, the system will look like figure 5.2.

FIGURE 5.2 - Continuous block diagram (blocks: Set Point, PD Controller, LP Filter, Saturation, Amp [A/V], Km [N/A], 1/m, 1/s, 1/s, Sensor KS; signals: xref, e, u, I, F, a, v, x, xm).

In figure 5.2 it can be seen that the actuator and the sensor are characterized as gains. The main purpose of the LM example is to gain experience with the software while developing a model. Sensors are often modelled as a first-order system with a time delay. Here it has been decided to model the behaviour of the sensor as a gain, because its time constant and delay are negligible.

5.3 Tuning Controller

One of the most important features of the PID controller is its versatility. The control law of a PID can be changed to a P, PD, PI, PI-D or I-PD. Depending on the needs of the plant, different structures can be recommended. The plant has been assumed to be a second-order system with both poles at the origin. Adding an extra pole at the origin, as an integral action would do, would make the system too slow. The chosen solution has therefore been a PD controller.

The transfer function of the controller used is:

\[
U(s) = \frac{K_c}{\alpha} \cdot \frac{s + \frac{1}{T}}{s + \frac{1}{\alpha \cdot T}} \cdot E(s), \quad \text{where } \alpha < 1
\]

The tuning of the controller can be seen in appendix B.

The calculated PD law, with K = 40, is:

\[
U(s) = K \cdot 13.9282 \cdot \frac{s + 9.3782}{s + 130.6218} \cdot E(s)
\]

Changing K changes the bandwidth of the system. K = 40 has been chosen to obtain a bandwidth between 30 and 40 rad·s⁻¹.
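As a quick consistency check on these numbers, the standard relations for a lead network can be applied to the zero at 9.3782 rad/s and the pole at 130.6218 rad/s. These relations are textbook results and are not taken from appendix B, so the tuning procedure itself may differ.

\[
\begin{aligned}
\alpha &= \frac{9.3782}{130.6218} \approx 0.0718, \qquad \frac{1}{\alpha} \approx 13.928 \\
\omega_m &= \sqrt{9.3782 \cdot 130.6218} \approx 35.0 \ \mathrm{rad \cdot s^{-1}} \\
\varphi_{max} &= \arcsin\left(\frac{1 - \alpha}{1 + \alpha}\right) \approx 60^{\circ}
\end{aligned}
\]

The factor 13.9282 in the PD law therefore equals 1/α, and the maximum phase lead of about 60° is located at 35 rad·s⁻¹, in the middle of the 30 to 40 rad·s⁻¹ bandwidth range mentioned above.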

5.4 Low-pass filter

The aim of a low-pass filter is to obtain more high-frequency roll-off. It is often applied to suppress sensor noise and/or as a safeguard against the excitation of higher-order dynamics.

FIGURE 5.3 - Comparison among different analog filters (panels: Butterworth, Chebyshev 1, Chebyshev 2, Elliptic).

Different types of filters can be considered. Figure 5.3 shows the most important analog filters.

In this simulation a second order Butterworth filter has been used. This way, the controller would have 1 pole and 1 zero from the PD part and 2 poles from the filter part. The Bode plot of the whole controller can be seen in figure 5.4.

Once the Controller has been applied the open loop system will look like figure 5.5.

5.5 Set point

The derivative part can become a problem if a step setpoint is used (11). The derivative of a step is infinite in theory, and an infinite value cannot be handled in practice. In addition, large changes in the setpoint create a high value in the control signal and cause overshoot. Instead of a step reference, a smoother reference has been generated using a PathGenerator. The applied setpoint can be seen in figure 5.6 and its acceleration in figure 5.7.

FIGURE 5.4 - Frequency response of the controller (Bode diagram, magnitude in dB and phase in deg against frequency in rad/sec; curves: PD controller, low-pass Butterworth filter, and complete controller with PD and filter; Gm = Inf dB at Inf rad/sec, Pm = 6.38 deg at 2.36e+003 rad/sec).

FIGURE 5.5 - Open loop system frequency response (Bode diagram, magnitude in dB and phase in deg against frequency in rad/sec; Gm = 8 dB at 62.9 rad/sec, Pm = 34.7 deg at 30.1 rad/sec).

FIGURE 5.6 - Applied reference to the system (reference position [m] against time [s]).

FIGURE 5.7 - Applied acceleration setpoint to the system (acceleration [m/s2], scale x 10^-3, against time [s]).

5.6 Discrete system

5.6.1 Discretization

For the implementation of the system, a discrete version of both the simulated plant and the controller has to be used. Depending on the control system that is going to be implemented, different variants can be used. Figure 5.8 shows how the controller is implemented.

FIGURE 5.8 - Discrete block diagram (blocks: Set Point, PID Controller, Saturation, Amp [A/V], Km [N/A], 1/m, 1/s, 1/s, Sensor KS; signals: xref, e, u, I, F, a, v, x, xm).

The sampling period is not a critical factor here. The controller runs on a computer, so the computation time is not a problem for this type of basic control.

5.6.2 Discrete plant

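For the simulation, a difference equation of the plant is needed.

\[
\begin{aligned}
G(z) &= \mathcal{Z}\{ZOH \cdot G(s)\} = \mathcal{Z}\left\{\frac{1 - e^{-T_s \cdot s}}{s} \cdot \frac{1}{m \cdot s^{2}}\right\} \\
G(z) &= (1 - z^{-1}) \cdot \frac{1}{m} \cdot \mathcal{Z}\left\{\frac{1}{s^{3}}\right\} = \frac{1}{m} \cdot \frac{z - 1}{z} \cdot \frac{T_s^{2} \cdot z \cdot (z + 1)}{2 \cdot (z - 1)^{3}} \\
G(z) &= \frac{1}{m} \cdot \frac{T_s^{2} \cdot (z + 1)}{2 \cdot (z - 1)^{2}} = \frac{X(z)}{F(z)}, \qquad K = \frac{T_s^{2}}{2 \cdot m} \\
\frac{X(z)}{F(z)} &= K \cdot \frac{z + 1}{(z - 1)^{2}} = K \cdot \frac{z + 1}{z^{2} - 2 \cdot z + 1} = K \cdot \frac{z^{-1} + z^{-2}}{1 - 2 \cdot z^{-1} + z^{-2}} \\
X(z) \cdot (1 - 2 \cdot z^{-1} + z^{-2}) &= F(z) \cdot K \cdot (z^{-1} + z^{-2}) \\
X(z) - 2 \cdot X(z) \cdot z^{-1} + X(z) \cdot z^{-2} &= K \cdot \left(F(z) \cdot z^{-1} + F(z) \cdot z^{-2}\right) \\
x_k - 2 \cdot x_{k-1} + x_{k-2} &= K \cdot (f_{k-1} + f_{k-2}) \\
x_k &= K \cdot (f_{k-1} + f_{k-2}) + 2 \cdot x_{k-1} - x_{k-2} \\
x_k &= \frac{T_s^{2}}{2 \cdot m} \cdot (f_{k-1} + f_{k-2}) + 2 \cdot x_{k-1} - x_{k-2}
\end{aligned}
\]

This model has been implemented in the plantComponent code.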
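The difference equation of the moving mass can be sketched in a few lines of plain C++. This is only an illustration under stated assumptions: the class DiscreteMass and its member names are hypothetical, the initial state is zero, and no Orocos types are used, so it is not the plantComponent code itself.

```cpp
#include <cassert>

// Sketch of the moving-mass difference equation
//   x_k = K * (f_{k-1} + f_{k-2}) + 2 * x_{k-1} - x_{k-2},  K = Ts^2 / (2 m).
// Names are illustrative; the state starts at rest in position zero.
class DiscreteMass {
public:
    DiscreteMass(double mass, double samplePeriod)
        : gain_(samplePeriod * samplePeriod / (2.0 * mass)) {
        assert(mass > 0.0 && samplePeriod > 0.0);
    }

    // Advance one sample. `force` is f_k, so it only affects later outputs.
    double step(double force) {
        const double x = gain_ * (fPrev_ + fPrev2_) + 2.0 * xPrev_ - xPrev2_;
        xPrev2_ = xPrev_;   // shift the position history
        xPrev_ = x;
        fPrev2_ = fPrev_;   // shift the force history
        fPrev_ = force;
        return x;
    }

private:
    double gain_;
    double xPrev_ = 0.0, xPrev2_ = 0.0;
    double fPrev_ = 0.0, fPrev2_ = 0.0;
};
```

For m = 1 kg, Ts = 1 s and a constant force of 1 N, successive calls return 0, 0.5, 2 and 4.5 m, which matches the continuous solution x(t) = t²/2 at the sampling instants.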

5.6.3 Discrete controller

Sections 5.3 and 5.4 have given the PD controller and the filter that are going to be applied.

PD controller:

\[
PD(s) = 557.128 \cdot \frac{s + 9.3782}{s + 130.6218}
\]

Filter:

\[
F(s) = \frac{10000}{s^{2} + 141.4 \cdot s + 10000}
\]

Discretization has been done using Matlab's c2d command, with the Tustin approximation and a sample period of 0.1 ms. The obtained result is:

\[
D(z) = \mathcal{Z}\{PD(s) \cdot F(s)\}
\]

\[
D(z) = 0.01375 \cdot \frac{z^{3} + 1.0009 \cdot z^{2} - 0.9981 \cdot z - 0.9991}{z^{3} - 2.9729 \cdot z^{2} + 2.9460 \cdot z - 0.9732}
\]

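The poles of this controller lie inside the unit circle, and its zeros lie inside or on it.

Zeros: 0.9991, -1 and -1.

Poles: 0.9929 ± 0.0070 · i and 0.9870.

Since z = e^{T_s · s}, T_s → 0 ⇒ z → 1: with such a small sample period the poles and the finite zero all lie close to z = 1. For a clearer picture, the root locus of the continuous system is shown in figure 5.9.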

FIGURE 5.9 - Continuous root locus diagram of the open loop system, including: controller, filter, actuator and plant.
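The pole and zero quoted above for the PD part can be cross-checked by applying the Tustin substitution s = (2/Ts) · (z − 1)/(z + 1) by hand. The sketch below does this for a first-order section only; the names FirstOrderSection and tustinLead are hypothetical, and the second-order filter, which Matlab's c2d handled in the same step, is left out.

```cpp
#include <cassert>

// Coefficients of D(z) = (b0 + b1 z^-1) / (1 + a1 z^-1).
struct FirstOrderSection {
    double b0, b1, a1;
    double zero() const { return -b1 / b0; }  // root of the numerator
    double pole() const { return -a1; }       // root of the denominator
};

// Tustin (bilinear) mapping applied to
//   C(s) = gain * (s + zeroFreq) / (s + poleFreq).
// Illustrative helper, not part of the thesis code.
FirstOrderSection tustinLead(double gain, double zeroFreq,
                             double poleFreq, double ts) {
    assert(ts > 0.0);
    const double c = 2.0 / ts;         // Tustin constant 2/Ts
    const double den = c + poleFreq;   // normalizes the leading denominator term
    return { gain * (c + zeroFreq) / den,
             gain * (zeroFreq - c) / den,
             (poleFreq - c) / den };
}
```

Calling tustinLead(557.128, 9.3782, 130.6218, 1e-4) gives a zero near 0.9991 and a pole near 0.9870, in agreement with the real zero and real pole listed for D(z).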

5.6.4 Actuator and sensor

The actuator and the sensor have been discretized in the same way. These two systems are modelled as gains; so their discrete form is going to be a constant value.

5.7 Implementation in Orocos

Once all the elements of the feedback control system have been defined, it is time to implement them in the framework. As mentioned in chapter 2, every single agent is going to be implemented individually. The word component is used here for the agents. Component, task and agent can be defined in the same way: an entity that solves a problem. The Orocos builder manual uses the word component.

Figure 5.10 shows the connection map. As mentioned in section 3.5.1, having a connection map is useful for two reasons:

1 The connections between components have to be made carefully. The name of each port in every component has to be correct (that is why an assert instruction was used; avoiding misspellings is important).

2 Do not get lost. It may sound funny, but when a component has many input/output ports, having a map is really useful. Knowing which output port is connected to which input port avoids future problems. In addition, the way it has been programmed makes it easy to connect and disconnect ports.

FIGURE 5.10 - Connection Map of the feedback controller. Components and ports shown in the map (wires numbered 1, 2, 7, 8 and 9):

• Path Generator (generatorComponent): Setpoint (Reffb).

• Feedback Controller (feedbackControllerComponent): Setpoint (Ref), Measured output (SensPos), Feedback Control Signal (Ukfb).

• Actuator (actuatorComponent): Control Signal (Uk), Force (F).

• End effector (plantComponent): Force (F), Real position (X).

• Sensor (sensorComponent): Real position (X), Measured position (Xmfb).

The maps can be read in an easy way.

1 Every block represents a component (or agent), and the inputs and outputs of the block represent the inputs and outputs that have been configured in Orocos.

2 The block has two names. One is the generic name, showing which element it represents. The second name is the component's Code::Blocks project name.

3 The ports also have two names. The first name describes what the port carries, and the second one (in parentheses) is the port's name.

4 Every "wire" between components has a number. This number shows which ports are connected and helps to make sure the connection has been made (programming in the correct order makes linking easier).
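The idea of checking a connection map against the declared port names can be illustrated without any Orocos code. The sketch below is an assumption-laden example: Wire, PortTable and findBadWires are hypothetical names, and the real project connects ports through the Orocos API rather than through such a table.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One numbered "wire" of a connection map. All names are illustrative.
struct Wire {
    int number;
    std::string fromComponent, fromPort;
    std::string toComponent, toPort;
};

// Component name -> names of the ports that the component declares.
using PortTable = std::map<std::string, std::set<std::string>>;

// Returns the numbers of the wires that mention an unknown component or port,
// for example because a port name has been misspelled.
std::vector<int> findBadWires(const PortTable& ports,
                              const std::vector<Wire>& wires) {
    std::vector<int> bad;
    for (const Wire& w : wires) {
        const auto from = ports.find(w.fromComponent);
        const auto to = ports.find(w.toComponent);
        const bool ok = from != ports.end() && to != ports.end()
            && from->second.count(w.fromPort) == 1
            && to->second.count(w.toPort) == 1;
        if (!ok) {
            bad.push_back(w.number);
        }
    }
    return bad;
}
```

A wire from generatorComponent port Reffb to feedbackControllerComponent port Ref passes this check, while a wire that targets a misspelled port such as SensPoss is reported by its number.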

Once all the components have been successfully connected and the reporting files are configured in the correct way, simulations can begin.


5.8 Simulation results

The simulations have been done using both Matlab and Orocos, and the results obtained with each software package have been compared.

Matlab was created in 1984, and it is known as one of the most powerful and reliable software packages for doing simulations. Orocos was created in December 2000 and is still under development.

First of all, the system behaviour has been analysed. The response of the system to a specific setpoint shows whether the controller has been tuned correctly. Figure 5.11 shows the behaviour of the system simulated with Orocos.

FIGURE 5.11 - Setpoint and output comparison in the simulated system (response of the simulated system using Orocos; position [m] against time [s]; curves: Reference, Output).

As can be seen in figures 5.11 and 5.12, the system behaves in an acceptable way: the error stays between -0.01 and 0.01 meters while the reference is being applied.

Another important aspect of the simulation is the obtained control signal. Figure 5.13 shows a comparison between the Matlab simulation and the one done with Orocos.

The output of the plant has also been compared. As the difference is so small, the root mean square deviation has been chosen to visualize it. This can be seen in figure 5.14.
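For reference, the root mean square deviation between two sampled signals can be computed as in the sketch below. The function name rmsd is hypothetical, and whether figure 5.14 was produced over the whole run or over a moving window is not stated here, so this sketch simply covers all samples passed to it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root mean square deviation between two equally long sample sequences,
// for example the plant outputs logged by two different simulators.
double rmsd(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size() && !a.empty());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];  // sample-wise difference
        sumSq += d * d;
    }
    return std::sqrt(sumSq / static_cast<double>(a.size()));
}
```

Two identical sequences give an RMSD of zero, and two sequences that differ by a constant offset of 1 give an RMSD of exactly 1.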

FIGURE 5.12 - Error of the simulated system (error [m] against time [s]).

FIGURE 5.13 - Control Signal of the simulated system (calculated control signal [V] against time [s]; curves: Matlab, Orocos).

FIGURE 5.14 - RMSD between simulated outputs (RMSD of the output of the system [m], scale x 10^-3, against time [s]).

6 Learning Feed-Forward controller

A learning controller is a control system that comprises a function approximator of which the input-output mapping is adapted during control, in such way that a desired behaviour of the controlled system is obtained. (22)

As was stated in chapter 4, the mathematical model of the system must be "simple enough" so that it can be analyzed with available mathematical techniques, and "accurate enough" to describe the important aspects of the relevant dynamical behavior. That is why a moving mass has been taken as the simulation model. Another factor is that electromechanical motion systems are known to have reproducible disturbances (such as cogging and friction). The main purpose of using an LFFC is to eliminate positional inaccuracy due to reproducible disturbances and model uncertainty.

6.2 LFFC using B-spline Neural Networks

6.2.1 LFFC structures

We consider a controller structure that consists of a feedback and a feed-forward controller.

We assume that the state of the process and the state of the reference model are identical and use the approximated inverse dynamics of the process to compute the feed-forward signal. For proper reference signals and when there are no disturbances, if the feed-forward controller equals the inverse of the plant, the tracking error will be zero. The feedback controller is designed such that robust stability is guaranteed in the presence of model uncertainty, while the feed-forward controller is used to compensate for known reproducible disturbances (5).

There are two types of LFFC structures.

• Time Index LFFC: This type of structure is the easiest way of implementing the LFFC. It can be used for repetitive motions. The main idea of this method is that it learns the required control signal as a function of time (that is why it is used for repetitive motions). Figure 6.1 shows the block diagram of the system.

FIGURE 6.1 - Time Index block diagram (blocks: Time, Setpoint, Feed-forward Controller, Feedback Controller, Actuator, Plant, Sensor; signals: r, e, u, y).

How the control signal is learned is explained in subsection 6.2.2.

• State Indexed LFFC: This type of structure is more complex than the Time Index one. It can be regarded as the general form of the LFFC. State Index can be used in repetitive and non-repetitive systems. The main idea of this method is that it learns the required control signal as a function of the states of the system, so it tries to learn the model uncertainties as a function of the setpoint states.

This can be explained with an example. Suppose that we are working with a system like the one shown in figure 6.2.

FIGURE 6.2 - A linear process with a non-linearity

The state vector is chosen such that it consists of positions (x2) and their corresponding velocities (x1).

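\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} =
\begin{bmatrix} A_{11} & A_{12} \\ I & 0 \end{bmatrix} \cdot
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} +
\begin{bmatrix} B_1 \\ 0 \end{bmatrix} \cdot u
\]

Let's suppose that the process has both velocity and position dependent non-linearities. This would lead to:

\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} =
\begin{bmatrix} A_{11} & A_{12} \\ I & 0 \end{bmatrix} \cdot
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} +
\begin{bmatrix} h_1(x_1) + h_2(x_2) \\ 0 \end{bmatrix} +
\begin{bmatrix} B_1 \\ 0 \end{bmatrix} \cdot u
\]

The desired control signal for a desired position would be:

\[
u_d = B_1^{-1} \cdot \left(\ddot{x}_{2,d} - A_{11} \cdot \dot{x}_{2,d} - A_{12} \cdot x_{2,d} - h_1(\dot{x}_2) - h_2(x_2)\right)
\]

As can be seen, the perfect control signal is a function of the plant states.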

In theory, the LFFC is able to relate the model uncertainties to the states; for example, it would be able to relate the cogging effect to position and friction to velocity. For proper learning, the setpoint states are therefore needed, as can be seen in the block diagram of figure 6.3.

FIGURE 6.3 - State Index block diagram (blocks: Setpoint, two du/dt differentiators, Feed-forward Controller, Feedback Controller, Actuator, Plant, Sensor; signals: r, e, u, y).

In this project the Time Index structure has been used. Programming a State Index structure is not more complex, but executing the algorithm takes more time, mainly for two reasons. First, more data is needed (Time Index has to relate the control signal to time, whereas State Index has to relate it to the states). Second, more functions have to be called, leading to a longer computation time (this will be discussed in this chapter).

6.2.2 LFFC learning

Once the different structures have been briefly explained, we are going to explain how the B-spline NN learns the control signal. A B-spline neural network of order n consists of a sum of piece-wise polynomial functions of order n-1.

\[
y(x) = \sum_{i=1}^{N} \omega_i \cdot \mu_i(x)
\]

• y: The output of the system that is going to be learned, i.e. the learned signal.

• x: The input of the system. If it is time index, x will be time; if it is state index, x will be the states.

• N: Number of B-splines that will be used.

• ω: Weight of the memberships.

• µ: The membership function. The membership is defined by the order of the polynomial that is in charge of approximating the system. Figures 6.4, 6.5 and 6.6 show the most common memberships used in control engineering.

The training of the neural network can be done in on-line or off-line mode. In the on-line case, the cost function J, the squared approximation error between the desired output of the BSN y_d and the actual output y, is minimized:

\[
J = \frac{1}{2} \cdot (y_d - y)^{2}
\]

The update of the weights yields:

\[
\Delta\omega_i = \gamma \cdot (y_d - y) \cdot \mu_i(x)
\]

where γ is the learning rate, 0 < γ < 1.
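The output equation and the on-line update rule above can be sketched as follows. This is a minimal illustration that assumes second-order (triangular) memberships on a uniform grid; the class TriangularBsn and its members are hypothetical names, and the project itself relies on the GSL routines rather than on this code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of a B-spline network with second-order (triangular) memberships
// placed on a uniform grid over [xMin, xMax]. Not thesis code.
class TriangularBsn {
public:
    TriangularBsn(std::size_t count, double xMin, double xMax)
        : weights_(count, 0.0), xMin_(xMin),
          spacing_((xMax - xMin) / static_cast<double>(count - 1)) {
        assert(count >= 2 && xMax > xMin);
    }

    // mu_i(x): 1 at the i-th knot, falling linearly to 0 at its neighbours.
    double membership(std::size_t i, double x) const {
        const double centre = xMin_ + spacing_ * static_cast<double>(i);
        const double m = 1.0 - std::fabs(x - centre) / spacing_;
        return m > 0.0 ? m : 0.0;
    }

    // y(x) = sum_i w_i * mu_i(x)
    double output(double x) const {
        double y = 0.0;
        for (std::size_t i = 0; i < weights_.size(); ++i) {
            y += weights_[i] * membership(i, x);
        }
        return y;
    }

    // On-line rule: delta w_i = gamma * (yd - y) * mu_i(x)
    void learn(double x, double yd, double gamma) {
        const double error = yd - output(x);
        for (std::size_t i = 0; i < weights_.size(); ++i) {
            weights_[i] += gamma * error * membership(i, x);
        }
    }

private:
    std::vector<double> weights_;
    double xMin_;
    double spacing_;
};
```

With 7 B-splines on the interval from 0 to 3 s and γ = 0.5, repeatedly calling learn(1.0, 2.0, 0.5) halves the error at x = 1.0 on every call, so the output there converges to 2.0. Only the weights whose membership is non-zero at x are changed, which is what keeps the learning local.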

FIGURE 6.4 - 1st order B-spline

FIGURE 6.5 - 2nd order B-spline

FIGURE 6.6 - 3rd order B-spline

In the off-line training mode, the BSN tries to minimize the error over all the data:

\[
J = \frac{1}{2} \cdot \sum_{j} (y_{d,j} - y_j)^{2}
\]

and the weight variation follows the formula:

\[
\Delta\omega_i = \gamma \cdot \frac{\sum_{j} (y_{d,j} - y_j) \cdot \mu_i(x_j)}{\sum_{j} \mu_i^{n}(x_j)}
\]

The project has been programmed to work in off-line mode as will be explained in section 6.4.

To make sure that the learning concept has been understood, an example is shown in figure 6.7. The goal is to learn the continuous u_ff signal using 7 B-splines of 2nd order. u_ff is a function of time, and hence in this case x equals time. Figure 6.7 shows how the continuous u_ff is approximated by the dotted function.

An obvious conclusion can be drawn from this example. Using more B-splines and B-splines of higher order leads to a smaller error, but working with "heavier" learning systems comes at a high computational cost. Apart from that, too high a number of B-splines will lead to divergence of the learning process (instability).

FIGURE 6.7 - Mapping example

6.3 Programming the B-spline: GSL libraries

Programming the neural network can become a complex task, therefore an alternative has been chosen: the GSL libraries (17). The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite. Using those functions in an appropriate way helps to obtain a good feed-forward control signal.

6.4 Implementation in Orocos

In this section we are going to explain why the feed-forward component is different from the rest of the components and how it has been implemented.

1 Used libraries: When the GSL libraries are used, the project has to find both the Orocos libraries and the GSL libraries. First of all, the libraries have to be installed; they can be found on the main page of the GSL project. After installing them, the project has to be told where to look for them. Chapter 3 explains how to configure the project to find the Orocos libraries (steps 15 and 16). Those two lines have to be changed to these ones:


Key: PKG_CONFIG_PATH

Value: $PKG_CONFIG_PATH:/usr/local/lib/pkgconfig

`pkg-config --errors-to-stdout orocos-ocl-gnulinux orocos-rtt-gnulinux --cflags`

`pkg-config orocos-ocl-gnulinux orocos-rtt-gnulinux --libs`

class PlantComponent : public RTT::TaskContext {

public:

/*!* Constructor (default)*/

PlantComponent(std::string name);

/*!* Default destructor.*/

~PlantComponent();

virtual double getFact();

virtual double getXact();

Method<double(void)> getFactMethod;

Method<double(void)> getXactMethod;

private:

ReadDataPort<double> inpPortF; /*!< Input port Force */

WriteDataPort<double> outPortX; /*!< Output port Position */

double x; /*!< Variables */

...

bool startHook(); /*!< Start task situation execution hook */

void updateHook(); /*!< Periodic called hook */

void stopHook(); /*!< End task situation execution hook */

};

PlantComponent::PlantComponent(std::string name)
    : RTT::TaskContext(name),
      getFactMethod("getFact", &PlantComponent::getFact, this),
      getXactMethod("getXact", &PlantComponent::getXact, this),
      inpPortF("F"),
      outPortX("X")
{
    // Register the methods in the component interface.
    this->methods()->addMethod(&getFactMethod, "Get the driving force.");
    this->methods()->addMethod(&getXactMethod, "Get the position.");

    // Register the data ports in the component interface.
    this->ports()->addPort(&inpPortF, "This port reads input force.");
    this->ports()->addPort(&outPortX, "This port writes output position.");
}

public virtual double getFact();

public Method<double(void)> getFactMethod;

private double Fact;

double PlantComponent::getFact() {

return Fact;

}

ReadDataPort<double> inpPortF; /*!< Input port Force */

WriteDataPort<double> outPortX; /*!< Output port Position */

inpPortF.data()->Get(Fact);

outPortX.data()->Set(x);

bool startHook(); /*!< Start task situation execution hook */

void updateHook(); /*!< Periodic called hook */

void stopHook(); /*!< End task situation execution hook */

bool PlantComponent::startHook() {

if ( ! inpPortF.connected() || ! outPortX.connected() ) {

Logger::log() << Logger::Error << "Not all ports were properly connected. Aborting." << Logger::endl;

if ( !inpPortF.connected() )

Logger::log() << inpPortF.getName() << " not connected."<<Logger::endl;

if ( !outPortX.connected() )

Logger::log() << outPortX.getName() << " not connected."<<Logger::endl;

return false;

}

samplePeriod=getPeriod();

return true;

}

void PlantComponent::updateHook() {

inpPortF.data()->Get(Fact);

x=(samplePeriod*samplePeriod/(2.0*MOTOR_MASS))...;

x_prev2=x_prev;

x_prev=x;

Fin_prev2=Fin_prev;

Fin_prev=Fact;

outPortX.data()->Set(x);

}

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE properties SYSTEM "cpf.dtd">
