
[[!meta title="Thoughts on Decentralized and Distributed Systems"]]

Recently (2015-05-29) a blog post was published by Chris Ball, introducing his work on a "decentralized github". To be more precise, this work is about a decentralized and distributed development platform, with the first focus being distributed storage of and access to git repositories.

When I read it, I thought about my own ideas for a decentralized development platform. I was working in quite a different direction: have servers just like now, but allow operations (such as merge requests and team management) to be done across servers, making them a single worldwide network. You could say my idea was at the application-logic level, while distributed access is at the protocol level. And I asked myself: was I wrong? Should I aim at distributed storage too, and not waste time on "just decentralized"?

I've thought about it a bit more since then, and I came up with some initial ideas. First of all, imagine a development platform that runs on a single server. Totally centralized. Now, suppose the team decides to run it on two separate servers and have them federate. In the beginning, each server has its own user and repository databases, but merge requests and bug reports can be made across servers.
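To make the federation idea concrete, here's a minimal sketch of what a cross-server merge request could look like, assuming a simple JSON-over-HTTP protocol. The endpoint path, field names and server names are all made up for illustration; this is not an existing Vervis API.

    # Hypothetical federation message: server A opens a merge request
    # against a repo hosted on server B. All names here are invented.
    import json
    import urllib.request

    def send_merge_request(target_server, payload):
        # POST a merge-request message to another server's
        # (assumed) federation endpoint.
        req = urllib.request.Request(
            "https://" + target_server + "/federation/merge-requests",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    payload = {
        "type": "merge-request",
        "origin": {"server": "a.example.org", "repo": "alice/project",
                   "branch": "feature-x"},
        "target": {"server": "b.example.org", "repo": "team/project",
                   "branch": "master"},
        "title": "Add feature X",
    }
    # send_merge_request("b.example.org", payload)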

Later, the team decides to add more servers. Three, four, five… one day there are ten servers. People choose a server randomly, or depending on geographic location, or maybe they know the admin, or they just like the design or color scheme of some server's web interface.

The team doesn't stop here. They encourage people to install server instances, more and more, until there are hundreds of them. One day, there are thousands. Some time later, tens of thousands. In every project, at least one developer runs their own instance, and that's where the project is hosted. Suddenly, something changes. There is a large number of very small servers, with just a few users on each. Small home servers are less reliable, don't always have backups, can't handle the high loads that some popular projects may require (frequent clones, large repos, or both), and sometimes need to go offline for a while.

This may be where distributed systems come into the picture. They allow the load to be shared! Went offline? No problem: the other team members can still access the repo and keep working, because your home server doesn't handle storage and commit access alone. Slow connection? No problem: commits can be downloaded from several peers simultaneously, some of which may be faster.
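As an illustration of the bandwidth point, here's a rough sketch of downloading objects from several peers in parallel. The `fetch_object` call on a peer is an assumption, standing in for whatever transfer protocol the network would actually use.

    # Spread object downloads across peers round-robin, in parallel.
    # peer.fetch_object(oid) is a hypothetical API, not a real library call.
    from concurrent.futures import ThreadPoolExecutor

    def fetch_from_peers(peers, object_ids):
        def fetch(peer, oid):
            return oid, peer.fetch_object(oid)

        with ThreadPoolExecutor(max_workers=len(peers)) as pool:
            futures = [
                pool.submit(fetch, peers[i % len(peers)], oid)
                for i, oid in enumerate(object_ids)
            ]
            return dict(f.result() for f in futures)

A faster peer could be given a larger share of the objects; round-robin is just the simplest possible split.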

Suppose all those tens of thousands of servers are fast and reliable. Is there still a reason to move to a distributed system? Let's keep going. More instances continue to be launched. Now every user of the system has their own instance at home. Every single user. For multi-user projects, one instance is the host, and it refers to the other instances for project-related information and operations.

Basic project operations:

* Pulling works just through this one instance. This may overload its upload bandwidth, since it alone has to handle all the users doing clones and occasional pulls.

* Pushing works in a similar way, but only a small number of people push. The instance does have to store their SSH keys, though (a minimal sketch of that step follows this list).
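Here's roughly what the SSH-key part boils down to: the host instance keeps its users' public keys and checks an incoming push against them. The data and names below are placeholders, not an actual implementation.

    # Hypothetical push-authorization check on the host instance.
    AUTHORIZED_KEYS = {
        "alice": {"ssh-ed25519 AAAA...key-for-alice"},
        "bob": {"ssh-ed25519 AAAA...key-for-bob"},
    }

    def may_push(username, presented_key):
        # Only users whose public key is on file may push.
        return presented_key in AUTHORIZED_KEYS.get(username, set())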


Hmmm, I think I forgot where this was going. Jumping to some practical ideas.


Here's a random idea: have a distributed network, but allow each peer to also act as a server for direct connections. People who can't run a peer just connect to an existing peer, using it as a regular server.
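A tiny sketch of that hybrid shape, assuming Python: the peer process takes part in the distributed network, and also exposes a plain HTTP front end that regular clients can use like an ordinary server. Everything below is illustrative; resolving a request against the network is left as a stub.

    # A peer that doubles as a plain web server for non-peer users.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PeerFrontend(BaseHTTPRequestHandler):
        def do_GET(self):
            # A real peer would resolve self.path against the
            # distributed storage network; this is just a stub.
            body = b"served on behalf of the distributed network\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # HTTPServer(("", 8080), PeerFrontend).serve_forever()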
