Mathematics for Databases

(1)

Fontys Information Technology

Mathematics for Databases

Supplement for EDB2 and EDB3

(2)

Author: Sander van Laar Version: 2.2 (April 2007)

(3)

(4)

1 Introduction

A formal description of a database consists of the description of the relations (tables) of the database together with the constraints that must hold on the database. Furthermore the contents of a database can be retrieved using queries. These constraints and queries for databases can very well be formalized. A formal description of a constraint or a query is necessary to describe the constraint or query unambiguously. In other words, a formal description leads to one and only one meaning of the constraint or query.

To describe constraints and queries in a formal way we use predicate logic, set theory and tuple relational calculus. The tuple relational calculus is a calculus based on the use of tuple variables. A tuple variable is a variable that ranges over a named relation (i.e. a set of tuples of a relation). This paper describes the use of the relational calculus for databases. A description of the formal notation is given as well as a mapping of these expressions to SQL.

This paper uses the mathematical notation as used in the book ‘Database Systems’ by Connolly and Begg.

2 Predicates

A predicate is a function that maps each element of a set to the value ‘true’ or ‘false’.

Example

Assume x  N, x is an element of the natural numbers (0, 1, 2,…). Then the following are predicates. P1(x)  x  0 (is true for all x)

P2(x)  x < 6 (is true for x = 0, 1, 2, 3, 4, 5)

P3(x)  x < 7  x > 10 (is false for all x)

P4(x)  x mod 2 = 0  x2 < 40 (is true for x = 0, 2, 4, 6)

□

Predicates can consist of one expression like ‘x  0’ or a combination of expressions. These combinations can be made by combining expressions with the logical operators ‘~’ (negation), ‘’ (conjunction), ‘’

(disjunction) and ‘’ (implication).

Examples

(1) ~ (x < 6) (meaning: x  6) (2) x < 7  x > 10 (3) x < 5  x  6 (meaning: x  5) (4) x  N  x < 6 □

Predicates can be used to define sets as well. The following examples define sets by using predicates.

Examples

(1) S1 = { x | x  N  P1(x) } = N

(2) S2 = { x | x  N  P2(x) } = {0, 1, …, 5}

(3) S3 = { x | x  N  P3(x) } =  (the empty set)

(4) S4 = { x | x  N  P4(x) } = {0, 2, 4, 6}

(7)

3 Quantifiers

There are two quantifiers that can be used in an expression to find out whether a proposition is true for all tuples that match a certain condition or whether a proposition is true for at least one tuple. These quantifiers are called respectively the Universal Quantifier (notation:  ) and the Existential Quantifier (notation:  ). Let P(x) be a predicate that for a given x yields true or false. To find out if for all tuples in the domain D(x) (the set of all possible values for x) the predicate P(x) is true we write

( x )( D(x)  P(x) ), (‘for all x for which D(x) holds, P(x) is true’)

To find out if for at least one tuple from the domain D(x) the predicate P(x) is true we write ( x )(D(x)  P(x) ), (‘there is an x for which D(x) holds, P(x) is true’)

The result of these expressions is either true or false.

Examples

(1) ( x )( x  N  P1(x) ) (true) (2) ( x )( x  N  P1(x) ) (true) (3) ( x )( x  N  P2(x) ) (false) (4) ( x )( x  N  P2(x) ) (true) (5) ( x )( x  N  P3(x) ) (false) (6) ( x )( x  N  P3(x) ) (false) (7) ( x )( x  N  P4(x) ) (false) (8) ( x )( x  N  P4(x) ) (true) □

Remark

Please notice the difference between ( x )( D(x)  P(x)) and ( x )( D(x)  P(x)).

For example: ( x )( x  N  x > 7  x2 > 50) (which is true) differs from ( x )( x  N  x > 7  x2 > 50) (which is false). The domain of an expression with a universal quantifier is always located to the left of the implication arrow.

4 Equations

If p and q are expressions with the value ‘true’ or ‘false’ then the following logical equations hold. (1) ~ ( ~p )  p (2) p  q  ~ p q (3) ~ (p  q)  ~p  ~q (4) ~ (p  q)  ~p  ~q (5) p  (q  r)  (p  q)  (p  r) (6) p  (q  r)  (p  q)  (p  r)

(8)

These equations can also be tested with a truth value table. An example of a truth-value table is shown below.

p q ~p p  q p  q p  q

false false true false false true false true true false true true true false false false true false true true false true true true

If P is a predicate and D1 and D2 are domains (expressions that limit the possible values for a variable) then

the following equations hold.

(7) ( x )( P(x) )  ~ ( x )( ~ P(x) ) (8) ( x )( P(x) )  ~ ( x )( ~ P(x) )

(9) ( x )( (D1(x)  D2(x))  P(x) )  ( x )( D1(x)  (D2(x)  P(x)) )

(10) ( x )( (D1(x)  D2(x))  P(x) )  ~ ( x )( D1(x)  D2(x)  ~P(x)) )

Exercise 4.1

V = {a,b,c} W = {amsterdam, best, culemborg} Z = {9,4,5,10,7} Which of the following expressions are true?

a. ( x) ( x  V  ( y )( y  W  y starts with the letter x)) b. ( x ) ( x  V  ( y )( y  W  y starts with the letter x)) c. ( y )( y  W  ( x )( x  V  y starts with the letter x)) d. ( x )( x  V  x is a vowel)

e. ( x )( x  Z  ( y )( y  Z  x + y = 14)) f. ( y )( y  Z  ( x )( x  Z  x + y ≥ 14))

g. ( x )( x  W  ( y )( y  Z  x has y different letters)) h. ( y )( y  Z  ( x )( x  W  x has y different letters))

Exercise 4.2 (*)

A = {1,2,3,4,5,6,7,8,9}

Which of the following expressions are true? a. ( x )( x  A  ( y )( y  A  x2 _{> y ))}

b. ( x )( x  A  ( y )( y  A  x2_{> y ))}

c. ( y )( y  A  ( x )( x  A  x2 _{> y ))}

(9)

Exercise 4.3

Let E be the set of European countries and C a set of cities.

Describe the following statements in terms of logic. Use the predicates capital(x,y) and is_located_in(x,y) to express respectively whether a city is the capital of a certain country or whether a city is located in a certain country.

a. Each city in C is the capital of a country from E.

b. The cities of C are all located in one and the same country. c. There is a country in which not a single city of C is located. d. Not all cities of C are located in the same country.

Exercise 4.4 (*)

Formally describe the following expressions in terms of logic. a. There are two different natural numbers of which the sum is 3. b. There are three different natural numbers of which the sum is 6.

c. For each of the natural numbers the square of that number is not negative. d. There is no natural number of which the square is a negative number.

Exercise 4.5

The set I is the set of inhabitants of the town of Oss. The set D is the set of medical doctors of Oss. We introduce the following assertions.

a. ( d )( d  D  ( i )( i  I  i is a patient of d)) b. ( d )( d  D  ( i )( i  I  i is no patient of d)) c. ( i )( i  I  ( d )( d  D  i is a patient of d)) d. ( d )( d  D  ( i )( i  I  i is a patient of d))

We assume that each inhabitant has exactly one medical doctor. This strengthens assertion c. Which conclusion follows from which assertion?

e. There is not a single medical doctor in Oss who hasn’t got any patients. f. Every inhabitant has a medical doctor.

g. Oss only has one medical doctor. h. Oss has at least two medical doctors.

(10)

5 Aggregate functions

Aggregate functions are mathematical functions on grouped information. In the relational calculus the following aggregate functions are available: COUNT, SUM, MIN, MAX and AVG. Below an overview is shown for each of these functions.

In the table below Expr(x) is an expression, D(x) a domain (the range of x), P(x) a predicate. The expression

Expr(x) can represent a number as well as a character expression.

Function Notation Result

COUNT { COUNT (x) | D(x)  P(x) } the number of not null x’s of the domain D(x) for which P(x) holds

SUM { SUM (Expr(x)) | D(x)  P(x) } the sum of all Expr(x) for which D(x) and P(x) is true MIN { MIN (Expr(x)) | D(x)  P(x) } the smallest Expr(x) for which D(x) and P(x) is true MAX { MAX (Expr(x)) | D(x)  P(x) } the greatest Expr(x) for which D(x) and P(x) is true

AVG { AVG (Expr(x)) | D(x)  P(x) } the average over all Expr(x) for which D(x) and P(x) is true In the following examples we will mainly use the following database schema consisting of two relations,

called ‘Staff’ and ‘Branch’. To express that a tuple s is a tuple of the staff table we will write ‘Staff(s)’, or equivalently, 's  Staff'. The contents of the Staff table and the Branch table represent employees from a company and the branch in which they work. The contents of the tables are shown below.

Examples

(1) { AVG (x) | 1 ≤ x ≤ 5 } = (1+2+3+4+5)/5 = 3 (2) { SUM (x2_{) | 1 ≤ x ≤ 5 } = 1}2_{+ 2}2_{+ 3}2_{+ 4}2_{+ 5}2_{= 55}

(11)

(3) The number of managers from the staff table:

{ COUNT (s) | Staff(s)  s.position = ‘Manager’ } (Answer: 2) (4) The sum of salaries of the managers from the staff table:

{ SUM (s.salary) | Staff(s)  s.position = ‘Manager’ } (Answer: 54000) (5) The average salary of all staff members:

{ AVG (s.salary) | Staff(s) } (Answer: 17000) □

Remark:

Although the previous examples (1) and (2) are theoretically seen not correct, we will allow the omission of the curly brackets (‘{‘ and ‘}’) around a single value. In the following example it should say

{ COUNT (s) | Staff(s)  s.position = ‘Manager’ } = { 2 }

but, for the sake of simplicity, we will allow the so-called singleton sets to be written as { COUNT (s) | Staff(s)  s.position = ‘Manager’ } = 2

□

Exercise 5.1 (*)

Let A be a set of numbers.

Describe the following expressions in terms of tuple relational algebra: a. The sum of the positive numbers of A is greater than 100.

b. 75 is the greatest positive number of A.

c. A contains more positive numbers than negative numbers. d. X equals the number of different numbers of A.

e. X indicates whether the largest positive number of A equals more than two times the smallest positive number of A.

(12)

6 Constraints

The first goal of the tuple relational algebra is providing a way for a formal description of constraints. Constraints can be used to make sure every stored value in a database matches a certain condition. These constraints are expressions that that can very well be formalized. There are 5 kinds of constraints: attribute, tuple, table, database and dynamic constraints.

Examples

(1) All staff persons earn at least 9000:

( s )( Staff(s)  s.salary ≥ 9000 ) (attribute constraint) (2) A person van either be male of female:

(s )( Staff(s)  s.sex  {'M', 'F'} ) (attribute constraint) (3) The salary of a manager is at least 20000:

( s )( Staff(s)  ( s.position= ‘Manager’  s.salary ≥ 20000 )) (tuple constraint) (4) Branch B007 does not have any programmers:

( s )( Staff(s)  ( s.branchNo = 'B007'  s.position  'Programmer' )) (tuple constraint) (5) There are less than 3 managers in the among the staff:

{ COUNT (s) | Staff(s)  s.position = ‘Manager’ }  2 (table constraint) (6) The sum of the salaries of the managers will not exceed 1000000 (and is a positive value):

0  { SUM (s.salary) | Staff(s)  s.position = ‘Manager’ }  1000000 (table constraint) (7) The branch for which a staff person works must exist (Foreign Key constraint):

( s )( Staff(s) 

( b )( Branch(b)  b.branchNo = s.branchNo ) ) (database constraint) (8) The branches in London have at most five assistants: (database constraint)

( b )( Branch(b)  b.city = 'London' 

{ COUNT s | Staff(s)  s.branchNo = b.branchNo  s.position = ‘Assistant' } ≤ 5 ) □

(13)

Exercise 6.1 (*)

An information analyst has determined that certain data have to be stored. Below a schematic overview of the information structure is shown.

sex memberYear street city

char(5) integer char(20) char(20) MEMBERS

memberNr name initials title birthYear

integer char(15) char(5) char(7) integer

FINES

fineNr memberNr dateIn dateOut amount

integer Integer Date Date NUMBER(7,2)

The following constraints have to be implemented .

 memberNr uniquely identifies MEMBERS  fineNr uniquely identifies FINES

 memberNr in FINES refers to memberNr in MEMBERS  birthYear  1900 and  1999 in MEMBERS

 sex in MEMBERS is either ‘Male’ or ‘Female’  memberYear  1980 and  1999 in MEMBERS  amount has two decimals in FINES

 dateIn  1980 en  1999 in FINES

Use the description of the information structure and the constraints to create the complete relational database schema, including the primary key descriptions and the foreign key descriptions.

(14)

Exercise 6.2

This exercise uses the following relational schema, which concerns orders of articles by customers. CUSTOMER( customerNr, name, address, place, credit)

customerNr uniquely identifies customers. ORDER( orderNr, customerNr, orderDate, shippingDate)

orderNr uniquely identifies orders,

customerNr refers to customerNr in CUSTOMER. ARTICLE( articleNr, articleName, color, price)

articleNr uniquely identifies articles. ORDERLINE( orderNr, articleNr, amount)

(orderNr, articleNr) uniquely identifies order lines, articleNr refers to articleNr in ARTICLE,

orderNr refers to orderNr in ORDER.

Give for each of the following constraints the type of constraint (attribute, tuple, table, or database). Formulate the constraints using the tuple relational calculus.

a. an order will not be shipped before the order date; b. the price of purple articles is always more than 100 Euro; c. (*) a customer is allowed to place at most one order per day;

d. purple articles are always more expensive than yellow articles with the same name; e. it is not possible to order non-existing articles;

f. (*) customers from Amsterdam are not allowed to order purple articles; g. a maximum of 10 different articles may be ordered in one order;

h. (*) the total price of all ordered articles for each customer has to be less than his credit; Questions marked with a (*) are (partially) worked out at the end of this paper.

(15)

7 Queries

The second goal of the tuple relational calculus is to make it possible to describe queries in a formal non-ambiguous way. Later on we will make a mapping from these formalized queries to (correct) SQL statements.

In the tuple relational calculus all queries on the database are of the form: { s | F(s) }.

In other words: the set of tuples s for which expression F(s) is true. If only some of the attributes are of any interest we write

{ s.(a1, a2, …, an) | F(s) },

with a1, a2, …, an attributes of the tuple s. This set is created by selecting all tuples s for which F(s) is true and

then by projecting these tuples on the attributes a1, a2, …, an .

The result of a question about a set of tuples can be either a set of tuples matching a certain condition (e.g. the employees (tuples) that earn more than 10000), or a value (by using aggregate functions).

Examples

(1) The name and address of all Staff members: { s.(name, address) | Staff(s) }

(2) The names of Staff members who earn more than 10000: { s.name | Staff (s)  s.salary > 10000 }

(3) The number of staff in each branch and their total salaries: { s.branchNo, COUNT(s.staffNo), SUM(s.salary) | Staff(s) } □

(16)

8 Expressions with subexpressions

All predicates are expressions that yield true or false. Since expression like { COUNT (s) | Staff(s)  s.position = ‘Manager’ }  2

and

B003  { b.branchNo | Branch(b) }

also yields true or false, it is allowed to use expressions like these in other expressions. We call these embedded expressions subexpressions or subqueries.

Examples

(1) Staff who work for the branch at 163 Main Street: { s.(staffNo, fName, lName, position) | Staff(s) 

s.branchNo = { b.branchNo | Branch(b)  b.street = ‘163 Main St.’ }

}

(2) Staff with a salary greater than the average salary:

{ s.(staffNo, fName, lName, position) | Staff(s)  s.salary > { AVG (r.salary) | Staff(r) } }

(3) The branches in which a manager works:

{ b | Branch(b)  b.branchNo  { s.branchNo | Staff(s)  s.position = ‘Manager’ } } (4) The branches in which no staff works:

{ b | Branch(b)  b.branchNo  { s.branchNo | Staff(s) } }

(5) The managers who have at least two employees in all the branches they work in: { s | Staff(s)  s.position = ‘Manager’ 

( b )( Branch(b)  b.branchNo = s.branchNo 

{ COUNT(r) | Staff(r)  r.branchNo = b.branchNo }  2 ) }

□

There are two types of subqueries: subqueries with synchronization and subqueries without synchronization. Subqueries without synchronization are subqueries that do not depend on the ‘superquery’ (the query that contains the subquery). In other words, the subquery is a query on it self; it can be answered without the superquery’s values. Examples of subqueries without synchronization can be found above: examples (1) till (4).

A subquery with synchronization is a query that uses values from the superquery. The subquery in example (5) shows a link between the sub and the superquery (r.branchNo = b.branchNo). This subquery can only be answered in relation with the values from Branch(b).

(17)

Exercise 8.1

This exercise uses the following relational schema, which concerns orders of articles by customers. CUSTOMER( customerNr, name, street, city, credit)

customerNr uniquely identifies customers. ORDER( orderNr, customerNr, orderDate, shippingDate)

orderNr uniquely identifies orders,

customerNr refers to customerNr in CUSTOMER. ARTICLE( articleNr, articleName, color, price)

articleNr uniquely identifies articles. ORDERLINE( orderNr, articleNr, amount)

(orderNr, articleNr) uniquely identifies orderlines, articleNr refers to articleNr in ARTICLE,

orderNr refers to orderNr in ORDER.

Formulate the following queries in terms of the tuple relational calculus. a. the names of customers from Amsterdam;

b. the number of orders placed after January 1st_{1991 ('01-Jan-1991');}

c. all data of customers who have placed an order after January 1st 1991;

d. the names of customers from Amsterdam, who have ordered article 10023; e. (*) the names of customers, who have ordered red articles after January 1st;

f. (*) the names of customers, who have never ordered an article of more than 100 Euro; g. the article numbers of articles that have never been ordered;

h. (*) the article numbers of articles that have never been ordered by customers from Amsterdam; i. (*) the article numbers of articles that have only been ordered by customers from Amsterdam. Questions marked with a (*) are (partially) worked out at the end of this paper.

(18)

9 Mapping expressions to SQL

After describing a query in a formal tuple relational expression the mapping to SQL is a small step. The simple expression { x | D(x)  P(x) } is mapped to SQL as SELECT DISTINCT x FROM D(x) WHERE P(x);

(The keyword ‘DISTINCT’ is used to remove duplicate values.)

Also aggregate functions can be mapped to SQL. For each F as one of the aggregate functions COUNT, SUM, MIN, MAX, AVG the expression

{ F (E(x)) | D(x)  P(x) } can be mapped to SQL as SELECT F(E(x)) FROM D(x) WHERE P(x);

Remark

If F is the aggregation function COUNT then only column names are allowed for expression E(x) (so no mathematical expressions).

Examples

(1) { s | STAFF(s)  s.position = ‘Manager’ } is written in SQL as

SELECT * FROM STAFF s

WHERE s.position = ‘Manager’;

(2) { SUM (s.salary) | STAFF(s)  s.position = ‘Manager’ } is written in SQL as

SELECT SUM(s.salary) FROM STAFF s

WHERE s.position = ‘Manager’;

(3) Complex expressions can be mapped as well.

{ s.(staffNo, fName, lName, position) | Staff(s)  s.salary > { AVG(r.salary) | Staff(r) } } maps to SQL as

(19)

SELECT s.staffNo, s.fName, s.lName, s.position FROM STAFF s

WHERE s.salary >

(SELECT AVG(r.salary) FROM STAFF r);

(4) { s.branchNo, COUNT(s.staffNo), SUM(s.salary) | Staff(s) } maps to SQL as

SELECT s.branchNo, COUNT(s.staffNo), SUM(s.salary) FROM STAFF s

GROUP BY s.staffNo;

Notice the ‘GROUP BY’ in example (4). ‘GROUP BY’ is required since we are interested in the number of staff and the sum of the salaries per branch.

□

Mapping Quantifiers to SQL

An expression containing an existential quantifier can be mapped to SQL using the keyword ‘EXISTS’. For the universal quantifier no specific SQL keyword is available. Therefore an expression containing a  quantifier has to be rewritten to an expression with only the  quantifier before a mapping can be made. We will use the following equation to rewrite  to  (for other useful equations see chapter 4):

(7) ( x )( P(x) )  ~( x )( ~ P(x) )

Example

{ s | Staff(s)  s.position = ‘Manager’ 

(b )( Branch(b)  b.branchNo = s.branchNo 

{ COUNT(r) | Staff(r)  r.branchNo = b.branchNo }  2 ) }

 [ equation rule (7) ]

~ ( b )( ~ ( Branch(b)  b.branchNo = s.branchNo 

{ COUNT(r) | Staff(r)  r.branchNo = b.branchNo }  2 )) }

~ ( b )( ~(~ ( Branch(b)  b.branchNo = s.branchNo) 

~ ( b )( ~( ~Branch(b)  b.branchNo  s.branchNo 

(20)

Which maps to SQL as SELECT * FROM STAFF s

WHERE s.position = ‘Manager’ AND NOT EXISTS

(SELECT * FROM BRANCH b

WHERE b.branchNo = s.branchNo AND 2 > (SELECT COUNT(*)

FROM STAFF r

WHERE r.branchNo = b.branchNo) );

□

Exercise 9.1

The following part of a data model of a hospital is given.

CREATE TABLE P {Patient}

(PNR NUMBER(5) NOT NULL, {Patient NumbeR}

PNM CHAR(20), {Patient NaMe}

PSTR CHAR(20), {Patient STReet}

PCITY CHAR(15), {Patient CITY}

DOB DATE, {Date Of Birth}

BLTP CHAR(2), {BLood TyPe}

RHF CHAR(1), {RHesus Factor}

SEX CHAR(1), {SEX}

PRIMARY KEY (PNR) );

CREATE TABLE HOSP {HOSPitalization

(PNR NUMBER(5) NOT NULL, {NumbeR of hospitalized Patient} DATEIN DATE, {hospitalization date = IN DATE} HRSN CHAR(40), {Hospitalization ReaSoN}

RNR NUMBER(2), {Room NumbeR}

DATEOUT DATE, {(planned) dismission date = DATE OUT} SNR NUMBER(2), {NumbeR hospitalizing Specialist} PRIMARY KEY (PNR,DATEIN),

FOREIGN KEY (PNR) REFERENCES P (PNR), FOREIGN KEY (SNR) REFERENCES ADM(SNR) );

CREATE TABLE SP {Specialist}

(SNR NUMBER(2) NOT NULL, {Specialist NumbeR}

SNM CHAR(19), {Specialist NaMe}

SSTR CHAR(18), {Specialist STReet}

SCITY CHAR(9), {Specialist CITY}

SPDPTNR NUMBER(2), {Specialist DePartmenT NumbeR}

NRBEDS NUMBER(2), {NumbeR of BEDS}

EMPD NUMBER(1), {EMPloyeD}

PRIMARY KEY (SNR) );

(21)

Describe the following questions directly in SQL:

<simple queries; Book section 5.3.1>

a. (*)give patient number and date of birth of patients with blood type 'A' b. give all data of those patients born between 01-Jan-1980 and 31-Dec-1980;

c. (*)give patient number of the patients with blood type 'A' or blood type 'B' or blood type 'AB' or blood type 'O' with rhesus factor '+'.

<aggregate functions; Book section 5.3.3>

d. (*)calculate the total number of hospitalizations by specialist with snr = 10; e. calculate the total number of patients hospitalized by specialist with snr = 10; f. (*)calculate the longest hospitalization period that has ever occurred;

g. (*)calculate the longest and the shortest hospitalization period that has ever occurred; <multi-table queries; Book section 5.3.7>

h. give patient number, patient name and specialist number of the patients hospitalized on 12-Feb-1998; <grouping; Book section 5.3.4>

i. give for each patient the patient number and the number of hospitalizations for this patient; j. give for each patient who has been hospitalized more than two times, the patient number and the

duration of the longest hospitalization;

k. (*)give for each patient, who has been hospitalized more than 2 times by specialist with snr = 10 and of which the longest hospitalization period is at least 3 times the shortest hospitalization period, the patient number and the duration of the longest and the shortest hospitalization period for this specialist;

First describe the following questions using the tuple relational algebra, then describe the questions in SQL:

<subqueries without synchronization; Book section 5.3.5>

l. give for each patient with blood type 'A' who is hospitalized by specialist with snr = 10, the name of the patient;

m. give the patient number and the hospitalization reason for those hospitalizations of which the hospitalization period is longer than the average over all hospitalization periods;

(22)

n. give the patient number and the hospitalization reason for those hospitalizations of which the hospitalization period is longer than the average over all hospitalization periods with the same hospitalization reason;

o. (*)give the number and the name of the patients who have been hospitalized more than 5 times by specialist with snr = 10;

Describe the following questions directly in SQL:

<mixed exercises; Book chapter 5>

p. (*)give the patient number and the number of hospitalizations of those patients who have been hospitalized more than 5 times in which each hospitalization period is at most 5 (days);

q. give the patient number, the patient name and the average hospitalization period of the hospitalizations of those patients who have been hospitalized by specialist with snr = 10;

r. give the patient number of the patients with the greatest number of hospitalizations of all patients;

First describe the following questions using the tuple relational algebra, then describe the questions in SQL:

<subqueries using quantifiers; Book sections 5.3.8 and 5.3.10>

s. give specialist number and specialist name of the specialists who have never hospitalized a patient; t. give specialist number and specialist name of the specialists who have hospitalized all patients who have

hospitalization reason ‘new hip’;

u. give specialist number and specialist name of the specialists who have hospitalized all patients born before 1-Jan-1900;

v. give specialist number and specialist name of the specialists who have only hospitalized patients born before 1-Jan-1900;

w. (*)give specialist number and specialist name of the specialists who have only hospitalized all patients born before 1-Jan-1900;

x. (*)give specialist number and specialist name of the specialists who have only hospitalized all patients born before 1-Jan-1900 who have hospitalization reason ‘new hip’;

Describe the following mutations directly in SQL:

y. Reduce the number of beds of the specialists who have hospitalized less than 10 patients in 2001, with 20%. z. (*)Remove the hospitalization tuple regarding patient 6883 in room 27 on 20-Jun-1977.

(23)

10 Solutions to selected exercises

Exercise 4.2

a. false

explanation: take x = 1 and y = 1 b. true

explanation: take x = 9 and y = 1..9 c. true

explanation: take y = 1..9 and y = 9 d. false

explanation: take x = 1 and y = 1

Exercise 4.4

a. ( x )( x  N  ( y )( y  N  x  y  x + y = 3 )) b. ( x )( x  N  ( y )( y  N  ( z )( z  N  x  y  x  z  y  z  x + y + z = 6 ))) c. ( x )( x  N  x2  0) d. the same as c.

Exercise 5.1

a. { SUM (x) | x  A  x > 0 } > 100 b. { MAX (x) | x  A  x > 0 } = 75 c. { COUNT (x) | x  A  x > 0 } > { COUNT (x) | x  A  x < 0 } d. X = { COUNT (x) | x  A } e. X = { MAX (x) | x  A  x > 0 } > 2 * { MIN (x) | x  A  x > 0 }

Exercise 6.1

MEMBER (memberNr: [0..999999] name: char(15), initials: char(5),

(24)

street: char(20), city: char(20) ); FINES (fineNr: [0..99999999], memberNr: [0..999999], dateIn: [01-Jan-1980..31-Dec-1999],

dateOut: [01-Jan-1988..31-Dec-1999] or <null>, amount: number(7,2)

);

constraints primary key:

memberNr uniquely identifies MEMBERS fineNr uniquely identifies FINES

foreign key:

FINES.memberNr is foreign key referencing MEMBER.memberNr

Exercise 6.2

c. table;

( t1,t2 )( t1,t2  ORDER  ((t1.customernr = t2.customernr)  (t1.orderdate = t2.orderdate)

 t1=t2 ))

remark1

description of an alternate key: implementation: NOT NULL and UNIQUE

remark2

compare the last part with:

(t1. customernr = t2. customernr)  (t1  t2)  (t1. orderdate  t2. orderdate) because of the following equations:

a  b



c  ~ (a  b)  c  ~ a  ~ b  c a  ~ c

 ~

b  ~ (a  ~c)  ~b  ~ a  ~ b  c f. database;

( x )( x  CUSTOMER  (x.city = 'Amsterdam') 

( y )( y  ORDER  (y.customenr=x.customernr)  ( z )( z  ORDERLINE  (z.ordernr=y.ordernr) 

~ ( w )( w  ARTICLE  (w.articlenr=z.articlenr)  (w.color='purple') ))))

rewriting yields:

~ ( x )( x  CUSTOMER  (x.city = 'Amsterdam')  ( y )( x  ORDER  (y. customenr =x. customenr)  ( z )( x ORDERLINE  (z.ordernr=y.ordernr) 

(25)

h. database;

( t )( t  CUSTOMER  (t.credit >

{ SUM (y.amount * z.price) | x  ORDER  y  ORDERLINE  z  ARTICLE

 (x.customernr=t.customernr)  (x.ordernr=y.ordernr)  (y.articlenr=z.articlenr) } )

Exercise 8.1

e. { t.naam | t  CUSTOMER 

( w )( w  ORDER  (w.customernr=t.customernr)  w.orderdate > '01-Jan-1991'  ( x )( x  DEEL-ORDER  (x.ordernr=w.ordernr) 

( y )( y  ARTICLE  (y.articlenr=x.articlenr)  (y.color=’red’) ) ) )

}

remark

there is an alternative using the join operator:

{ t.name | t  (((CUSTOMER join ORDER) join ORDERLINE) join ARTICLE)  (t.orderdate > '01-Jan-1991'  (t.color='red')

}

f. { t.name | t  CUSTOMER 

( w )( w  ORDER  (w.customernr = t.customernr)  ( x )( x  ORDERLINE  (x.ordernr=w.ordernr) 

~ ( y )( y  ARTICLE  (y.articlenr=x.articlenr)  (y.price > 100) ) ) )

}

rewriting yields:

{ t.name | t  CUSTOMER 

( w )( w  ORDER (w.customernr = t.customernr)  ( x )( x  ORDERLINE (x.ordernr=w.ordernr) 

( y )( y  ARTICLE  (y.articlenr=x.articlenr) (y.price > 100) ) ) )

}

h. { t.articlenr | t  ARTICLE 

( w )( w  ORDERLINE (w.articlenr = t. articlenr)  ~ ( x )( x  ORDER (x.ordernr = w.ordernr) 

( y )( y  CUSTOMER  (y. customernr = x. customernr)  (y.city='AMSTERDAM')

))) }

(26)

~ ( y )( y  CUSTOMER  (y. customernr = x. customernr)  (y.city='AMSTERDAM')

))) }

This expression answers the question: give the article numbers of articles that have been ordered by customers (ORDER!) who do not live in Amsterdam. The question was, however, give the articles that

were not ordered (ORDER!) by customers from Amsterdam! rewriting yields:

{ t. articlenr | t  ARTICLE 

~ ( w )( w  ORDERLINE (w.articlenr = t. articlenr)  ( x )( x  ORDER (x.ordernr = w.ordernr) 

( y )( y  CUSTOMER  (y. customernr = x. customernr)  (y.city='AMSTERDAM')

)))

}

i. { t. articlenr | t  ARTIKEL 

( w )( w  ORDERLINE (w. articlenr =t. articlenr)  ( x )( x  ORDER (x.ordernr=w.ordernr) 

( y )( y  CUSTOMER  (y. customernr =x. customernr)  (y.city = 'AMSTERDAM') ))) }

Exercise 9.1

a. SELECT PNR,DOB FROM P WHERE BLTP = 'A'; c. SELECT PNR FROM P

WHERE BLTP = 'A' /* OR BLTP IN ('A','B','AB') OR BLTP = 'B' OR BLTP = 'AB' OR (BLTP = 'O' AND RHF = '+'); d. SELECT COUNT(*) FROM HOSP WHERE SNR = 10; f.

SELECT MAX(DATEOUT- DATEIN) FROM HOSP;

g.

(27)

FROM HOSP; k.

SELECT PNR, MAX(DATEOUT- DATEIN), MIN(DATEOUT- DATEIN) FROM HOSP

WHERE SNR = 10 GROUP BY PNR

HAVING COUNT(*) > 2

AND MAX(DATEOUT- DATEIN) ≥ 3 * MIN(DATEOUT- DATEIN); o. { t.(PNR,PNM) | t  P  { COUNT (w) | w  OPN  (w.PNR=t.PNR)  (w.SNR = 10) } > 5 } SELECT PNR, PNM FROM P WHERE 5 < (SELECT COUNT(*) FROM HOSP WHERE P.PNR = HOSP.PNR AND SNR = 10); p. SELECT P.PNR, COUNT(*) FROM P, HOSP WHERE P.PNR = HOSP.PNR GROUP BY P.PNR

HAVING MAX(DATEOUT- DATEIN)  5 AND COUNT(*) > 5; w. combination of 21 and 22: { t.(SNR,SNM) | t  SP  (



w )( w  P  (w.dob <= '01-Jan-1900')



(



x )( x  HOSP  (x.PNR = w.PNR) (x.SNR = t.SNR))) AND (



w )( w  HOSP  (w.SNR = t.SNR)



(



x )( x  P  (x.PNR = w.PNR) (x.dob <= '01-Jan-1900'))) } x. { t.(SNR,SNM) : t  SP 

(



w )( w  (P join HOSP) (w.dob <= '01-Jan-1900')  (w.hrsn = 'new hip')



(



x )( x  HOSP  (x.PNR = w.PNR) (x.SNR = t.SNR))) 

(



w )( w  (HOSP join P) (w.SNR = t.SNR)  (w.hrsn = 'new hip')



(



x )( x  P  (x.PNR = w.PNR) (x.dob <= '01-Jan-1900'))) } z. DELETE FROM HOSP WHERE PNR= 6883

Mathematics for Databases

Fontys Information Technology