Clone

HTTPS: git clone https://vervis.peers.community/repos/yEzqv

SSH: git clone USERNAME@vervis.peers.community:yEzqv

Branches

master

examples.mdwn

Purpose

Explain why a network layer is needed.

Content

So far, before reaching the network level in the design process of the expression, the examples and ideas assumed a single program on a user’s computer, managing a single semantic database. Named graphs were mentioned, which allow several databases to be managed through a single interface, but nothing was said about how this is done.

Before explaining how, we must understand why. So why is a network layer needed? Here are examples:

Local Cooperation

Assume John Doe has a computer. The computer contains three databases:

File system database, managed by special software optimized for file system details
Desktop resource database, managing information of documents, music, videos, tasks, calendars, e-mail and pictures
Research wiki database, managing John’s research project about Semantic Desktop

The database contents are related semantically, e.g. a page in the research wiki is a file in the file system. As a result, execution of some queries requires that data from more than one database is used. Here is a query which can use a single database:

From graph: Desktop Resources
Give me all: title(?song)
Where:
	?song was written in the 60s
	?song contains the word "love"

Now let’s see another query, which requires access to two graphs:

From graphs: Desktop Resources, Research Wiki
Give me all: title(?page)
Where:
	?page containsTask ?task
	due-date(?task) > 2014-01-31

Now we simply have no choice. A possible implementation of the query would be going through all wiki pages, finding the tasks mentioned, and taking just the tasks with a due date matching the condition. Or we could go over all tasks and check both the date condition and whether the task is mentioned in any wiki page. Either way, we must access both databases.
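The second strategy above (start from the tasks, then check the wiki pages) can be sketched as follows. This is only an illustration under assumptions: the two graphs are modelled as in-memory sets of (subject, property, value) statements, and all identifiers and data (`page:Plan`, `task:2`, the titles and dates) are made up for the example.

```python
from datetime import date

# Hypothetical in-memory stand-ins for two named graphs. Each graph is a set
# of (subject, property, value) statements; all names and data are made up.
research_wiki = {
    ("page:Intro", "title", "Introduction"),
    ("page:Plan", "title", "Project Plan"),
    ("page:Intro", "containsTask", "task:1"),
    ("page:Plan", "containsTask", "task:2"),
}
desktop_resources = {
    ("task:1", "due-date", date(2014, 1, 15)),
    ("task:2", "due-date", date(2014, 2, 10)),
}

def pages_with_tasks_due_after(wiki, desktop, cutoff):
    """Return sorted titles of wiki pages containing a task due after cutoff."""
    # Step 1 (desktop graph): collect tasks whose due date passes the filter.
    late_tasks = {s for (s, p, v) in desktop if p == "due-date" and v > cutoff}
    # Step 2 (wiki graph): find pages that mention any of those tasks.
    pages = {s for (s, p, v) in wiki if p == "containsTask" and v in late_tasks}
    # Step 3 (wiki graph): look up the titles of the matching pages.
    return sorted(v for (s, p, v) in wiki if p == "title" and s in pages)

titles = pages_with_tasks_due_after(
    research_wiki, desktop_resources, date(2014, 1, 31)
)
print(titles)  # ['Project Plan']
```

Whichever graph the evaluation starts from, steps 1 and 2 read from different graphs, so neither database can answer the query alone.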

Peer Cooperation

This is a similar concept, where the two databases used reside on different machines. For example, Bob and John work together on a research project. We’d like to get a list of research tasks which Bob has created, assigned to John, and John hasn’t started yet.

Network Cooperation

This kind of query requires, or may require, the cooperation of more than one machine. It is a sort of graph traversal: it starts from a single machine and traverses the network, sending intermediate results back to the source. For example, “give me a list of all the people with a “knows” degree of up to 3 with respect to myself”.

What’s a degree of a property? Assume a property “knows” which links a person to someone she knows. This property is a subproperty of a more general property called “knowsMaybeIndirectly”. For example:

Alice knows Bob
Bob knows Cindy

From these statements we can deduce that Alice knowsMaybeIndirectly Cindy, because she knows her through Bob. With respect to the “knows” property, we can give each statement using knowsMaybeIndirectly a degree:

Degree 0 means knowing myself, e.g. Alice knows Alice
Degree 1 means knowing directly, e.g. Alice knows Bob
Degree 2 means knowing through one person, like in the example above
Degree 3 would occur in “Alice knowsMaybeIndirectly Dan” if we added the statement “Cindy knows Dan”
And so on…

Thus, the query means “give me all the people I know through a chain of two people at most”, e.g. the chain of Bob and Cindy through which Alice knows Dan in the degree 3 example.
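As a small worked example of the degree definition, the sketch below computes degrees with a breadth-first search over the statements from the text (including “Cindy knows Dan”). It assumes all statements sit in one local dictionary; the function name `knows_degrees` is hypothetical.

```python
from collections import deque

# The "knows" statements from the example, plus "Cindy knows Dan".
knows = {
    "Alice": ["Bob"],
    "Bob": ["Cindy"],
    "Cindy": ["Dan"],
}

def knows_degrees(start, max_degree):
    """Map each person reachable within max_degree "knows" steps to a degree."""
    degrees = {start: 0}              # degree 0: knowing myself
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if degrees[person] == max_degree:
            continue                  # do not follow chains past the limit
        for friend in knows.get(person, []):
            if friend not in degrees:     # first visit gives the smallest degree
                degrees[friend] = degrees[person] + 1
                queue.append(friend)
    return degrees

print(knows_degrees("Alice", 3))
# {'Alice': 0, 'Bob': 1, 'Cindy': 2, 'Dan': 3}
```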

Now, how do we execute the query? We start from my machine, and using my database we make a list of people I know. Now, for each person in that list, we find in my database the details required for communicating with her machine and send her machine a query: Get me a list of the people you knowMaybeIndirectly in a degree of 2 at most. Each such machine does the same thing we did, but sending its queries further with degree 1. The machines receiving that query create a local list using their local database, and return the result to the caller. Then the chain of queries works backwards and returns the results I asked for.
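The chain of queries described above might look roughly like this. It is a simulation, not a protocol: each machine is just an entry in a dictionary, and the recursive call to the hypothetical `query` function stands in for sending a query to another person's machine and waiting for its answer.

```python
# Each "machine" is simulated by one entry: owner -> people the owner knows
# directly. In a real network each entry would live in a separate database
# and query() would be a remote call. All names are hypothetical.
machines = {
    "Me": ["Alice"],
    "Alice": ["Bob"],
    "Bob": ["Cindy"],
    "Cindy": ["Dan"],
    "Dan": [],
}

def query(owner, max_degree):
    """Answer: whom does owner knowMaybeIndirectly within max_degree?"""
    if max_degree < 1:
        return set()
    direct = set(machines[owner])     # local list from the local database
    result = set(direct)
    if max_degree > 1:
        for person in direct:
            # Forward the query to that person's machine with a lower degree;
            # its answer is merged into ours on the way back.
            result |= query(person, max_degree - 1)
    return result

print(sorted(query("Me", 3)))  # ['Alice', 'Bob', 'Cindy']
```

Because the degree shrinks by one at every hop, the recursion stops by itself even if the “knows” statements form a cycle, although some machines may then be asked more than once.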

Global Cooperation

This kind of operation involves a large number of machines, possibly many servers from many parts of the world. Such a query is not expected to be used on the fly by users while browsing the web or running personal desktop servers, but it can be very useful for research and for data gathering processes which are continuous or don’t need to return results immediately.

For example: Give me a list of all programming languages in the world. How could this be done? Every single database connected to the network would have to be queried. The process would take some time, but eventually we would get the list and publish it on the web with popularity statistics for the benefit of mankind.

How does it work? As always, it starts from a single machine. That machine distributes the query to all the machines it knows, they distribute it to all the machines they know, and so on. However, since each machine may be contacted many times in the process, each machine, as in graph traversal algorithms, would hold a flag stating whether it has already been visited. Each global query would have a unique identifier, similar to resource UIDs, which a machine would use to check whether it has already answered the query.
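A minimal sketch of the visited check, assuming machines are plain objects in one process and the query identifier is a random UUID. The `Machine` class, its fields and the sample data are hypothetical; a real system would send the query over the network and would probably gather results asynchronously.

```python
import uuid

class Machine:
    """A simulated network node; all names here are hypothetical."""

    def __init__(self, name, languages):
        self.name = name
        self.languages = set(languages)   # local database contents
        self.neighbours = []              # machines this one knows
        self.answered = set()             # IDs of queries already answered

    def handle(self, query_id):
        """Answer a global query once, then pass it on to all neighbours."""
        if query_id in self.answered:
            return set()                  # already visited for this query
        self.answered.add(query_id)
        result = set(self.languages)
        for other in self.neighbours:
            result |= other.handle(query_id)
        return result

a = Machine("a", ["Haskell", "Python"])
b = Machine("b", ["Python", "C"])
c = Machine("c", ["Scheme"])
a.neighbours = [b, c]
b.neighbours = [a, c]   # cycles are harmless thanks to the visited check
c.neighbours = [a]

query_id = uuid.uuid4().hex   # unique identifier for this global query
languages = a.handle(query_id)
print(sorted(languages))  # ['C', 'Haskell', 'Python', 'Scheme']
```

Keeping a set of answered query IDs, rather than a single boolean flag, lets several global queries be in flight at the same time without interfering with each other.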
