Clone

HTTPS: git clone https://vervis.peers.community/repos/yEzqv

SSH: git clone USERNAME@vervis.peers.community:yEzqv

Branches

master

data-sharing.mdwn

Purpose

Explain how database content is shared between machines.

Content

As explained in the previous page, the guidelines for the sharing protocol are:

Decentralization
Distribution
Peer-to-peer
Federation

Another important goal: Minimize network operations and dependencies of machines on other machines through the network.

In general, the timing of data sharing may be:

get it from the network every time
sync it all the time and use local copy

Now let’s see which kind of data passing fits each kind of data.

Ontology: Sync it and have a local copy updated at all times
Frequently used data / high processing needs: sync same as above
One-time use: Downloaded once when needed, processed and removed
Infrequently used data: Either download on demand or sync.
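
The list above can be read as a lookup from kind of data to sharing strategy. Here is a minimal sketch of that reading, not a settled design. The kind labels (`"ontology"`, `"frequent"`, `"one_time"`, `"infrequent"`) and the names `Strategy` and `choose_strategy` are made up for illustration.

```python
from enum import Enum


class Strategy(Enum):
    SYNC = "sync"            # keep a local copy updated at all times
    ON_DEMAND = "on_demand"  # get it from the network when needed
    ONE_TIME = "one_time"    # download once, process, then remove


# Hypothetical labels for the four kinds of data listed above.
DEFAULT_STRATEGY = {
    "ontology": Strategy.SYNC,
    "frequent": Strategy.SYNC,
    "one_time": Strategy.ONE_TIME,
    "infrequent": Strategy.ON_DEMAND,  # the notes allow either choice here
}


def choose_strategy(kind, prefer_sync_for_infrequent=False):
    """Pick a strategy for a kind of data; infrequent data may go either way."""
    if kind == "infrequent" and prefer_sync_for_infrequent:
        return Strategy.SYNC
    return DEFAULT_STRATEGY[kind]
```

The only real decision point is infrequently used data, so that is the only case with a switch.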

Sync is preferred for:

Productivity: When the user wants it, it loads locally
Privacy: Network action doesn’t reflect usage pattern
Server safety: Sending is coordinated. Instead of waiting for requests, the server just sends synced data periodically
Offline access: The data stays available later, even without a network connection
Mesh: pass it on to other machines ==> distributed sharing

Three data levels:

Personal: Created and managed locally
Collaborative: Managed by a team and synced to local machines
Global: Any other data, usually it’s downloaded on demand

Examples:

Personal: E-mail, reminders, diary, personal tasks, pictures
Collaborative: Ontologies, git repo, project wiki, bug database
Global: Other users’ photo albums, websites, large databases

Update model:

Desktop: User runs update, downloading new software versions from repo
Web: Download webpage again and again every time

Hmmm… I need ideas, examples. Remote data which can be a semantic database:

Song database
Movie database
Collaborative research wiki
Family photo album
Recipe sharing website
Events & calendar

Song database: There are a great many songs, and the whole database may be too big to keep on a single machine. What you need as a client is just the portion describing the songs you are interested in, or maybe just the ones you have as OGG files on your computer. It’s not supposed to change much, so you don’t need to sync it periodically. Just download the new portions you need every time you download new song audio to your computer.
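
A rough sketch of the song case: fetch metadata only for songs that are on the local machine and not described yet. The function names and the `fetch` callable (song ID in, metadata record out) are assumptions for illustration. No real song database API is implied.

```python
def missing_song_ids(local_song_ids, local_metadata):
    """Return IDs of local songs that have no metadata yet, sorted."""
    return sorted(set(local_song_ids) - set(local_metadata))


def update_metadata(local_song_ids, local_metadata, fetch):
    """Download only the missing portion of the database.

    `fetch` is a hypothetical callable mapping a song ID to its record.
    """
    for song_id in missing_song_ids(local_song_ids, local_metadata):
        local_metadata[song_id] = fetch(song_id)
    return local_metadata
```

Calling `update_metadata` again with the same songs does no network work, which matches the goal of minimizing network operations.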

Movie database: Similar to song database. But for the purpose of the example, assume new info may be added about movies so it makes sense to check for updates once in a while, say once a month or once a week.
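
The "check once in a while" idea could look something like this. It is only a sketch; `is_due` and its parameters are hypothetical names, and the default of one week is just one of the intervals mentioned above.

```python
from datetime import datetime, timedelta


def is_due(last_checked, now, interval=timedelta(weeks=1)):
    """True if no check has happened yet or the interval has passed."""
    return last_checked is None or now - last_checked >= interval
```

A client would call `is_due(last_checked, datetime.now())` at startup or from a timer, and contact the network only when it returns true.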

Collaborative research wiki: Two things happen: Download and upload. Download can happen by sync, but periodic sync is not good enough. It has to be either manual, or every time there’s a change, the machine gets a message and downloads the new version. Upload happens by sending updates back to where the data was downloaded.
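
To make the message-then-download idea concrete, here is a small in-memory sketch with no real networking. The classes `Origin` and `Client` and all their methods are invented for illustration. The "message" is a plain method call, and the "download" copies every page, where a real system would send only the change.

```python
class Origin:
    """The place the wiki data was downloaded from."""

    def __init__(self):
        self.version = 0
        self.pages = {}
        self.subscribers = []

    def subscribe(self, client):
        self.subscribers.append(client)

    def upload(self, page, text, sender=None):
        """Accept an update, then message every other subscriber."""
        self.pages[page] = text
        self.version += 1
        for client in self.subscribers:
            if client is not sender:
                client.notify(self)


class Client:
    def __init__(self):
        self.version = 0
        self.pages = {}

    def notify(self, origin):
        """On a change message, download the new version."""
        self.pages = dict(origin.pages)
        self.version = origin.version

    def edit(self, origin, page, text):
        """Change locally, then send the update back to the origin."""
        self.pages[page] = text
        origin.upload(page, text, sender=self)
        self.version = origin.version
```

This sketch ignores conflicting edits: whichever upload arrives last wins.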

Family photo album: Assuming no central server exists, every change needs to be propagated to everyone somehow. It can work like a torrent download, but how is everyone notified? One option is to let the machines check for updates periodically, and download updates when found.
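
Here is a sketch of the periodic-check option without a central server. Each machine asks another machine what it has and pulls what is missing. `Peer` and its methods are hypothetical names. Only added photos are handled; edits and deletions are left out.

```python
class Peer:
    """One family member's machine, holding photo ID -> photo data."""

    def __init__(self, name):
        self.name = name
        self.photos = {}

    def add_photo(self, photo_id, data):
        self.photos[photo_id] = data

    def poll(self, other):
        """Pull photos that `other` has and this peer lacks; return new IDs."""
        new_ids = sorted(set(other.photos) - set(self.photos))
        for photo_id in new_ids:
            self.photos[photo_id] = other.photos[photo_id]
        return new_ids
```

Because every peer passes on what it has pulled, a photo can reach a machine that never contacted the machine where it was added.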

Recipe sharing website: There may be a huge number of recipes, but you need just a portion. Idea: They are spread among many machines, each recipe copied on many machines. So the user can browse them like a website, search and so on, and at the same time serve as a peer for sharing recipes, either all (like in Tor) or the ones chosen specifically (like Torrent).

Events & calendar: This has a personal part of course, but sometimes organizations etc. publish event calendars, and people can connect to them and see them in their desktop calendar. How to sync it? It depends on the specific calendar. It can be done periodically, or the server can inform the client of changes. Changes are made locally and then spread, i.e. distributed and propagated to all clients.

ELEMENTS

Change: Is the data expected to change? How often? Does it describe past facts which aren’t supposed to ever change?
Sync trigger: Is the sync triggered manually by the user, or is it done periodically?
Sync frequency: Is the sync done in a runtime-relevant frequency, e.g. once a minute/hour, or something larger like once a week/month?
Sync initiator: Does the server or the client initiate the sync?
Update initiation: Either the server sends data on update or periodically, or the client asks for data periodically or manually, or the server informs the client of updates and the client downloads them later when it can
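
The elements above could be collected into one record per data source. This is only a sketch. All names (`SyncProfile`, `Trigger`, `Initiator`, `Update`) are invented here, the one-hour threshold for "runtime-relevant" is an assumption, and the `MOVIES` values are a guess based on the movie database example.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum
from typing import Optional


class Trigger(Enum):
    MANUAL = "manual"
    PERIODIC = "periodic"


class Initiator(Enum):
    SERVER = "server"
    CLIENT = "client"


class Update(Enum):
    SERVER_SENDS = "server_sends"          # on update or periodically
    CLIENT_ASKS = "client_asks"            # periodically or manually
    NOTIFY_THEN_PULL = "notify_then_pull"  # server informs, client downloads later


@dataclass(frozen=True)
class SyncProfile:
    expected_to_change: bool
    trigger: Trigger
    interval: Optional[timedelta]  # None when the sync is manual
    initiator: Initiator
    update: Update

    def is_runtime_relevant(self, threshold=timedelta(hours=1)):
        # Assumption: "runtime-relevant" means at least once per hour.
        return self.interval is not None and self.interval <= threshold


# Illustrative guess for the movie database: weekly check started by the client.
MOVIES = SyncProfile(True, Trigger.PERIODIC, timedelta(weeks=1),
                     Initiator.CLIENT, Update.CLIENT_ASKS)
```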
