Mirror of the Rel4tion website/wiki source, view at <http://rel4tion.org>

[[ 🗃 ^yEzqv rel4tion-wiki ]] :: [📥 Inbox] [📤 Outbox] [🐤 Followers] [🤝 Collaborators] [🛠 Commits]

Clone

HTTPS: git clone https://vervis.peers.community/repos/yEzqv

SSH: git clone USERNAME@vervis.peers.community:yEzqv

Branches

Tags

master :: projects / smaoin /

theory.mdwn

This page explains the theoretical concepts in which [[Smaoin]] has roots. The content is written in a simple accessible way especially for Smaoin, and is therefore not necessarily a good general-purpose intro to these topics, because it omits irrelevant and less important things. For more info you can read in [[!wikipedia Set_theory desc=Wikipedia]].

[[!toc]]

Motivation

Assume we’d like to represent information somehow in computers. Create a “model” that both humans and computers can understand. A model that all applications use, so computers and people can e.g. read and edit notes without installing the specific note taking application they were created with (although they can, to get a nice GUI) or read the content of their e-mail inbox without installing a specific e-mail client (but using one would provide a nice GUI). Make the information separate from the functionality. Applications then become just user interfaces, and data is a separate layer they much less worry about. You can easily switch between e-mail clients and you can send a note you wrote to a friend, who will be able to read it with whatever note taking application they use, or even without having one at all.

But what is information? Clearly we have bits and bytes which can encode any data, but how do we actually express useful things with them? We have numbers, text, symbols, ideas, images, music, animation, graphs, tables… what do they all have in common?

Information is already encoded in computers, of course. These very words are information, in plain text form. But that means computers don’t easily understand what I’m writing here. They just see bytes and characters and lines, but not meaning. How does software usually add meaning? Ideas:

There are so many formats and schemas, but something is still missing. Formats are often made for the specific use that motivated their creation. There is no integration between them, and conversion between formats requires dedicated software to be written. They use structures that make their usage fast, at the cost of incompatibility with other structures and other formats.

Information

What if there could be some “one format to unify them all”? A general purpose model, that even if not the most efficient and convenient for a specific use, can help bridge over all the differences and provide integration?

Here’s a suggested simple model for representing information. This is the idea RDF was (and is) based on, and Smaoin too is based on.

Suggestion: Every piece of information can be represented as a relation between some entities. Examples:

“Anne loves John.”

“The bus is green.”

“Patrick the cat is very cute”.

These are very simple sentences, of course, and more complex sentences can’t be represented like this. But we can combine several of these entities+relation entries to express bigger things.

Elements and Sets

Assume there are two kinds of things in the world, elements and sets:

Now we can simply even more our definition of information. Suggestion: any piece of information can be expressed as an answer to a question of the form “is element E a member of set S”. This is a yes/no question, whose answer can be represented in a single binary digit (bit), and many such answers can be combined into complex expressions of information.

Elements are generally denoted by lowercase letters, e.g. a, b, c, x, y, z. Sets are generally denoted by uppercase letters, e.g. A, B, C, X, Y, Z.

In order to denote that an element belongs to a set, we’ll use a relation called “is member of”. For example, 5 is-member-of Numbers and Joe is-member-of People. There is also the “is not member of” relation. We’ll denote them using symbols as follows:

Examples:

With this simple model in mind, we’d like to tackle the task: how to organize elements, sets and yes/no answers in the computer? We know we can use bits for the yes/no answers, but what about elements and sets? We could refer to elements by their name, but how do we encode definitions of sets?

First, let’s define some sets:

There are several interesting things we can see here:

Since we now have 2 ways to define sets, let’s use symbols to denote them conveniently.

For the first way, called extensional definition, we’ll list the members separated by commas and surround the list with curly brackets. Examples:

For the second way, called intensional definition, we’ll use curly brackets too. Inside, there will be two sections, separated by : or |. The first section is a template, and the second section describes how to generate memebers from the template, e.g. by specifying a condition or a trait. Examples:

Great! Now, how do we encode information with these tools? Let’s try.

“Anne loves John.”

Another option:

Also this:

And this:

All of them basically make sense. Is any one them the “right” one“? Is any of them better than another? Well, we can start by looking at the first two options. Both of them are very specific to the information we want to encode, and probably aren’t good for a general purpose model. We’d like to be able to refer to”love" without specifying specific people.

Options 3 and 4 sound somewhat general purpose. But both of them define their set using pairs or triples of elements, which we haven’t seen yet. Let’s define these things first.

Pairs and Tuples

In sets there is no order between the elements. For example:

P = The set of all the people in the world = {p | p is a person}

Does any specific person come “before” or “after” any other person? If you order them by name alphabetically then yes, otherwise no. The same is true for numbers: They have an order mathematically, but as far as the set is concerned, there is no order. Therefore:

Sometimes, like with numbers, we do want to express order. For example, “Anne loves John” and “John loves Anne” are not the same thing (although I hope both are true). So the order matters, i.e. whether Anne is first and John is second, or John is first and Anne is second.

Let’s introduce order, then. We’ll start with pairs.

A pair is… simply a pair or elements. One is the first element and the other is the second element. The order matters, so the pair Anne-John and the pair John-Anne are different pairs, while {Anne, John} and {John, Anne} are the same set.

Pairs are denoted using triangle brackets. For example:

<Anne, John>
<John, Anne>
<x, y>

The same element can be used for both sides of the pair, and sets can be pair elements too:

<x, x>
<A, B>
<X, X>

Pairs can be pair elements too:

<<x, y>, z>
<Anne, <John, Tom>>
<<Alice, Bob>, <Cindy, Dan>>
<<<a,b>, <c, d>>, <<e, f>, <g, h>>>

Pairs can be set members:

<1, 2> ∈ {<x, y> | x and y are numbers}

Now let’s extend the pair concept into arbitrary length. What if, instead of 2 elements, we put 3? Like this:

<x, y, z>

Pairs are a special case of a tuple. A tuple is a series of elments with ordered positions. A tuple of length 2 is a pair. Any lengths are possible:

<a, b>
<a, b, c>
<a, b, c, d>
<a, b, c, d, e>
<1, 2, 3, ... 1000>

Like pairs, tuples can be set members, pair and tuple members and so on.

Relations

Finally we can approach the last tool we need for our information model. We already saw we can just use pairs and tuples, like this:

“Anne loves John.”

This is how relations work. They aren’t a new building block: A relation is a set of tuples of the same length. Let’s assume for now that these tuples are of length 2, i.e. assume a relation is a set of pairs.

In other words, our Set+Member representation above actually uses a relation! It defines a relation using an intensional definition, and then specifies a pair of elements which is a member of the relation (remember a relation is a set).

Instead of “is member of” between pairs and relations, we’ll use “holds for”. If a pair <x, y> is a member of relation R, we say “R holds for x and y”. Symbolic notations:

<x, y> ∈ R
or
xRy

The second notation may seem strange but it makes sense with relation symbols. For example, the relation “is greater than”. “5 is greater than 3” will be denoted as:

<5, 3> ∈ >
or
5 > 3

Relations which contains pairs are called binary relations. The length of the member tuples, in this case 2, is called the arity of the relation. Larger arities are possible, e.g. a relation can be a set of triples (3) or a set of quadruples (4) and so on.

Information Revisited

Let’s go back to our options 3 and 4 we had above, which use pairs:

“Anne loves John.”

Or:

As you can see, they both use relations and are basically equivalent. The difference is that the first case has a single set for all information, while the second case has a separate set for each relation. Relational databases (SQL) use the second option: Each table represents a relation. Semantic databases (RDF, Smaoin) use the first option: There is a single table containing element-relation-element triples.

TODO: Queries

[[!template id=todo text=“Explain theory of propositional and predicate calculus”]]

[See repo JSON]