
CHAPTER 3 THEORY OF KNOWLEDGE GRAPHS

3.1 FORMAL DESCRIPTION OF KNOWLEDGE GRAPHS

3.1.2 Basic relations

To describe the real world, we need to distinguish the relationships between tokens. In knowledge graph theory, the most important principle is to use a very limited set of relation types. These relation types are required to be basic: the more semantically independent they are, the better, and it should not be possible to deduce one relation type from the others. The meaning of these relations can be described by considering their relationship with the real world. The basic relation types can then be used to build more complex concepts and relations. The question is how to choose these basic relations in knowledge graph theory.

The first basic relation is the relation of cause (CAU). In the initial stage of the knowledge graph project, this was the only relation type that people were interested in. In fact, this relation plays a very important role in medical and social science.

Definition 3.3 The causal relation between two tokens is expressed by the arc labeled CAU.

This relation expresses the relationship between a cause and an effect, or one thing influencing another. Not only in knowledge graph theory but also in other representation methods, it was the earliest relation to be distinguished, and it is a basic relation used in many inferences, such as those occurring in a diagnosis system. The famous expert system MYCIN uses IF-THEN rules based only on this one type. In the graphic representation, the arc points from the concept that produces the influence to the concept that is affected.

[Figure: knowledge graph fragment. Recoverable labels: CAU, CAU, John, EQU, ALI, person, ALI, hit.]

The causal relation occurs in various situations. A thing or person can give rise to an incident or course. An incident or course can give rise to another incident or course, or to a state. Complex structures that contain the causal relation are therefore used to describe complex concepts, such as agency, purpose, reason, tool and result. For instance, the English phrase “to hit with a stick” can be described as “to cause the stick to move resulting in a contact state”.
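The direction convention for CAU arcs can be made concrete with a small sketch. The Python fragment below is only an illustration under assumptions: the `Arc` type, the `effects_of` helper and the example tokens are hypothetical and are not part of the knowledge graph project itself.

```python
from typing import List, NamedTuple


class Arc(NamedTuple):
    """A directed, labeled arc between two tokens (hypothetical structure)."""
    source: str  # token that produces the influence
    label: str   # relation type, e.g. "CAU"
    target: str  # token that is affected


def effects_of(arcs: List[Arc], cause: str) -> List[str]:
    """Return the tokens reached from `cause` by a single CAU arc."""
    return [a.target for a in arcs if a.label == "CAU" and a.source == cause]


# Hypothetical example tokens, chosen only for illustration.
arcs = [
    Arc("infection", "CAU", "fever"),
    Arc("fever", "CAU", "fatigue"),
]
print(effects_of(arcs, "infection"))
```

Storing the cause as the source of the arc mirrors the convention that a CAU arc points from the influencing concept to the affected one.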

When we consider the relations between sets in set theory, we find that four relations should be basic, for the following reason. Given two sets A and B, we can distinguish the following relations: A = B, A ⊂ B, A ∩ B ≠ φ, as well as A ∩ B = φ. These four relations express, respectively, that A and B are identical, that A is a subset of B, that A and B have common elements, and that A and B are completely disjoint. If we regard A and B as the property sets of certain designated things, we must introduce relations to express these four relation types.
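The four set relations can be checked mechanically. The following Python sketch classifies a pair of sets using the relation names defined in this section (EQU, SUB, ALI, DIS). The function name `set_relation`, the `"SUB-inverse"` label for the reversed subset case, and the order of the tests are assumptions of this sketch. Since the four set-theoretic conditions are not mutually exclusive (identical non-empty sets also share elements), the sketch simply returns the most specific label that applies.

```python
def set_relation(a: set, b: set) -> str:
    """Classify the relation between two sets, most specific label first."""
    if a == b:
        return "EQU"          # A = B
    if a < b:
        return "SUB"          # A is a proper subset of B
    if b < a:
        return "SUB-inverse"  # B is a proper subset of A (label is hypothetical)
    if a & b:
        return "ALI"          # A and B share elements, neither contains the other
    return "DIS"              # A and B are disjoint


# Hypothetical property sets, used only for illustration.
print(set_relation({"legs", "fur"}, {"legs", "fur"}))
print(set_relation({"legs"}, {"legs", "fur"}))
print(set_relation({"legs", "fur"}, {"fur", "tail"}))
print(set_relation({"legs"}, {"wings"}))
```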

Definition 3.4 A mark is a value, if it is connected to a token through a directed EQU-relation. An EQU-relation between two tokens expresses that two sets are equal.

In the graphic representation, this relation can express the naming of a concept, through an arc from the label to the concept. It can also be used for the assignment of a value, for instance red as the value assigned to color. For a symmetric relation, such as equality in set theory, we use the symmetric EQU-relation to join A and B.

Special values are the perceptions of a person as an individual. A mark is therefore a very special value, which can be expressed by an EQU-relation directed from the mark that expresses the value to the token that indicates the perception. The reason for using the EQU-relation is that the special value is assigned to the token under study.

A very important relation is that a thing is a part of another thing.

Definition 3.5 If there are two tokens that express two sets respectively, and one is a subset of the other, then there is a SUB-relation between the two tokens.

Note that there is a subtle difference between the SUB-relationship and the ISA-relationship mentioned before. For a SUB b, there are two different interpretations.

• Concept a is a part of concept b. For instance, tail SUB cat. This expresses that the tail of a cat can be regarded as a part of the cat because the molecules of the tail form a subset of the molecules of the cat.

• Concept a is more general than concept b, therefore, concept b contains at least all features of concept a. For instance, “mammal SUB cat” expresses that cat is a kind of mammal and has more information than that involved in the general mammal.

Note that we treat a concept as a set of properties here. If the concept is expressed as a graph, the elements of this graph are said to be in an FPAR-relation with the graph as a whole. In the second interpretation of a SUB→ b, in which a and b are seen as property sets, the terminology becomes mixed. As sets, a and b are related by the SUB-relation, but as concepts represented by graphs, we prefer to say that a is a property of b, as are all other subgraphs of the graph representing b. For this relationship between a and b we use the FPAR-relation.

Definition 3.6 The ALI-relation is used between two tokens for which there exist common elements.

Definition 3.7 The DIS-relation is used to express that two tokens are in no relation to each other.

In set theory, A DIS B expresses A∩B=φ. Because of the symmetry, the DIS-relation is described by an edge instead of an arc. The same holds for the EQU and the ALI relation.
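The difference between an edge (symmetric) and an arc (directed) matters when relations are stored. The sketch below, in Python, normalizes symmetric relations so that both orderings of the endpoints yield the same stored triple. The `normalize` helper and the `SYMMETRIC` set are hypothetical. EQU is deliberately left out of `SYMMETRIC` here, as an assumption of this sketch, because the text also uses a directed EQU-relation from a mark to a token.

```python
# Relation types treated as undirected edges in this sketch (an assumption).
SYMMETRIC = {"ALI", "DIS"}


def normalize(source: str, label: str, target: str) -> tuple:
    """Return a canonical triple; endpoints of symmetric relations are sorted."""
    if label in SYMMETRIC:
        source, target = sorted((source, target))
    return (source, label, target)


# An edge is the same in both directions, an arc is not.
print(normalize("b", "DIS", "a") == normalize("a", "DIS", "b"))
print(normalize("b", "CAU", "a") == normalize("a", "CAU", "b"))
```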

When humans think about something they judge and ascribe certain attributes to things. For example, “the ball is red” indicates a relation like “red is the color of the ball”. This led to including a new relation between an attribute and an entity.

Definition 3.8 The PAR-relation expresses that something is an attribute of something else.

This relation expresses that a certain thing is attributed to another thing, or is an external characteristic of it. In the graphic representation, the arc points from the attribute concept to the entity concept.

Another relation that needs to be considered is the order relation. This relation expresses ordering with respect to a certain scale, like space, time, place, length, temperature, weight, age, etc. With this relation it is also possible to represent different tenses of language, by relating the time of an event with the time of speaking, see Chapter 6. In our concept world, this relation is a basic type of relation.

Definition 3.9 The ORD-relation expresses that two things have a certain ordering with respect to each other.

When comparing the order of two things, we use this relation. It is usually used for showing order in time and space, but it can also be used to express the “<” relation in mathematics. When considering an ordering relation, the ORD-arc usually points from the token with the “low” value of the concept to the token with the “high” value of the concept.
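The low-to-high convention for ORD-arcs can be sketched as follows. This Python fragment is illustrative only: the `ord_arcs` helper and the example ages are hypothetical, and the choice to link only neighbours on the scale and to skip ties is an assumption of this sketch, not something the theory prescribes.

```python
def ord_arcs(values: dict) -> list:
    """Build ORD arcs between neighbours on a scale, pointing from low to high."""
    ordered = sorted(values, key=values.get)
    arcs = []
    for low, high in zip(ordered, ordered[1:]):
        # Equal values have no ordering between them, so no arc is added.
        if values[low] < values[high]:
            arcs.append((low, "ORD", high))
    return arcs


# Hypothetical tokens placed on an age scale.
ages = {"parent": 35, "child": 8, "grandparent": 62}
print(ord_arcs(ages))
```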

The basic goal of the knowledge graph project is to use a limited number of relation types. Only if the existing relation types are not enough to express something are we forced to add a new relation type. To express the dependency relation, we must consider a new relation type, which corresponds to mappings. Though natural language has many words to express a mapping, we still choose one relation type to express this relation, which is called the SKO (Skolem)-relation. We particularly refer to van den Berg [Berg, 1993]. For informational dependency in mathematics we use the words function, functional or just mapping. In natural language, words like “depends on” are used.

Definition 3.10 A token in a knowledge graph has an incoming SKO-relation from another token if it is informationally dependent on that token.

The meaning of the SKO-relation is based on the concept of informational dependency. This involves an aspect of choice, see also Section 6.6.
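Definition 3.10 fixes the direction of the SKO-relation: the dependent token receives the incoming arc. A minimal Python sketch of this convention is given below. The `sko_arcs` helper and the example dependencies are hypothetical and serve only to show the direction of the arcs.

```python
def sko_arcs(depends_on: dict) -> list:
    """Turn a 'token -> tokens it depends on' mapping into SKO arcs.

    Each arc points from the token that supplies the information
    to the token that is informationally dependent on it.
    """
    return [
        (source, "SKO", token)
        for token, sources in depends_on.items()
        for source in sources
    ]


# Hypothetical dependencies: the area of a circle depends on its radius.
dependencies = {"area": ["radius"], "price": ["weight", "quality"]}
print(sko_arcs(dependencies))
```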

In information transmission, we discover the mutual connection between information, as one of the basic relations that we must consider. It expresses a relation as between “saying” and “what is said”. A change in the “saying” causes a change in “what is said”, but “what is said” is informationally dependent on the “saying” process. On the syntactic level we encounter a similar situation. In “man hits dog”, we choose to relate “man” with “hit” and “hit” with “dog” by a CAU-relation. But what is the type of relation between the subject and the verb, respectively the verb and the object? That something is an object or a subject depends on its functional relationship with the verb.

For that reason these syntactic relationships are also modeled by the SKO-relation.

Of course, knowing, perception, feeling, may also be modeled with this relation.

Apparently, it is impossible to express everything in the world with only binary relations. To solve this problem, in the first stage of the knowledge graph project, the frame relation FPAR was introduced. In the second stage of the knowledge graph project, people were led to the three other frame relations. At present in knowledge graph theory there are four frame relations in total, see [Reidsma, 2001].

In fact, the FPAR-relation is the initial frame type, which is used to express a complex concept or to express the word “and” in logic.

Definition 3.11 A frame is a labeled node. A frame relation expresses that the labeled node is actually a frame around some complex graph. All nodes and arcs within the frame are connected to the frame node by the FPAR-relation.

Note that the graph can be interpreted as an n-ary relationship, just like an arc can be interpreted as a binary relationship. The FPAR relationship expresses that some subgraph of the graph is part of the whole graph that was formed. The “animal” graph, itself a frame, is part of the “cat” graph. Hence, animal FPAR→ cat. We already discussed the possibility of using a SUB-relationship here.
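A frame can be sketched as a labeled node that keeps track of its contents. The Python fragment below is an illustration under assumptions: the `Frame` class and its `fpar_arcs` method are hypothetical, only contained nodes (not contained arcs) are tracked, and "fur" is an invented part used for the example. The direction of the generated arcs follows "animal FPAR→ cat", that is, from the part to the frame node.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Frame:
    """A labeled node that frames a collection of contained nodes."""
    label: str
    nodes: List[str] = field(default_factory=list)

    def fpar_arcs(self) -> List[Tuple[str, str, str]]:
        """Return one FPAR arc from each contained node to the frame node."""
        return [(node, "FPAR", self.label) for node in self.nodes]


# "animal" comes from the text; "fur" is a hypothetical extra part.
cat = Frame("cat", nodes=["animal", "fur"])
print(cat.fpar_arcs())
```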

To express negation, and the possibility and the necessity in modal logic, three kinds of relation types are introduced.

Definition 3.12 NEGPAR expresses the negation of the contents of the frame.

Definition 3.13 POSPAR expresses the possibility of the contents of the frame.

Definition 3.14 NECPAR expresses the necessity of the contents of the frame.

Note that the contents of the frame may form the graph representation of a proposition.
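The four frame types can be sketched as a tag on a frame whose contents are the graph of a proposition. The Python fragment below is hypothetical: the `make_frame` helper and the dictionary layout are assumptions, and nesting one frame inside another (here, "possibly not") is an illustration based on the remark that a frame is itself a node, not a construction taken from the text. The proposition reuses the "man hits dog" example from this section.

```python
FRAME_TYPES = {"FPAR", "NEGPAR", "POSPAR", "NECPAR"}


def make_frame(kind: str, contents: list) -> dict:
    """Wrap `contents` (arcs or nested frames) in a frame of the given type."""
    if kind not in FRAME_TYPES:
        raise ValueError(f"unknown frame type: {kind}")
    return {"kind": kind, "contents": list(contents)}


# Graph of the proposition "man hits dog", as two CAU arcs.
proposition = [("man", "CAU", "hit"), ("hit", "CAU", "dog")]

negated = make_frame("NEGPAR", proposition)    # "man does not hit dog"
possibly_not = make_frame("POSPAR", [negated])  # "possibly, man does not hit dog"
print(possibly_not["kind"], possibly_not["contents"][0]["kind"])
```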