Mirror of the Rel4tion website/wiki source, view at <http://rel4tion.org>
saugus.mdwn
Saugus was until recently planned and designed where [[Dilosi]] is now, so Dilosi’s pages may provide more info (which should be moved here).
Saugus is a semantic datastore. It’s the functional heart of Razom. Or perhaps the brain. It’s expected to work like most databases: B+ trees, indexing, memory management, access control and so on. But unlike most databases, Saugus is not meant to be used on heavy servers. The simple implication is that while Saugus may scale, it’s not guaranteed to. Saugus is made to work efficiently for desktop users who are also network peers in a P2P semantic node network. Huge server-scale terabyte databases with thousands of connections per minute are not considered.
- [[architecture.dia]]
Another thing is the access control: Instead of using a serious encryption based authentication mechanism, Saugus may employ a simple test which relies on the client application to provide correct information. The access control is meant mainly for helping avoid accidental damage by defining which data can be read and written by each application. Security, anonymity and so on are a bonus which won’t necessarily be built into Saugus (but then there will be other ways to have these things, so don’t worry).
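To make the idea concrete, here is a minimal sketch of such a cooperative check. Everything in it is an assumption for illustration: the `AccessTable` class, the notion of named data "areas", and the application names are hypothetical and not part of any Saugus design. The client simply states its own name and nothing is authenticated, which matches the goal of avoiding accidental damage rather than providing security.

```python
from dataclasses import dataclass, field


@dataclass
class AccessTable:
    """Hypothetical per-application permission table (not Saugus API)."""
    # Each dict maps an application name to the set of data areas it may use.
    readable: dict = field(default_factory=dict)
    writable: dict = field(default_factory=dict)

    def allow(self, app, area, write=False):
        # Write permission implies read permission in this sketch.
        self.readable.setdefault(app, set()).add(area)
        if write:
            self.writable.setdefault(app, set()).add(area)

    def can_read(self, app, area):
        # The app name is trusted as given by the client: no authentication.
        return area in self.readable.get(app, set())

    def can_write(self, app, area):
        return area in self.writable.get(app, set())


table = AccessTable()
table.allow("contacts-app", "contacts", write=True)
table.allow("calendar-app", "contacts")  # read-only access
```

With this table, `calendar-app` can read the contacts area but a write request from it would be refused, so a bug in one application cannot silently overwrite another application’s data.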
Programming language and implementation plan:
My initial idea was, and still is, to base my code on the way 4store works. It’s a native dedicated triplestore. I can remove the cluster support and any overhead caused by server specific things, and have a simple fast desktop triplestore.
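As a rough illustration of what a “native dedicated triplestore” does, here is a small in-memory sketch. It is not how 4store lays out its data, and it uses plain Python dictionaries where a real store would use on-disk structures such as B+ trees. The `TripleStore` class and its method names are hypothetical. The point is only that each triple is indexed under several orderings, so different query patterns can be answered without scanning everything.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory triple index (illustration only)."""

    def __init__(self):
        # Three orderings of the same data, one per common lookup pattern.
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every triple is written into all three indexes.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def objects(self, s, p):
        # Pattern (s, p, ?): .get() avoids creating empty index entries.
        return set(self.spo.get(s, {}).get(p, set()))

    def subjects(self, p, o):
        # Pattern (?, p, o)
        return set(self.pos.get(p, {}).get(o, set()))


store = TripleStore()
store.add("alice", "knows", "bob")
store.add("alice", "knows", "carol")
```

The trade-off is the usual one for indexes: writes cost more and the data is stored several times, in exchange for fast lookups in either direction.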
In parallel I want to read and understand how databases work internally. Some info to start with:
- http://stackoverflow.com/questions/172925/how-do-databases-work-internally
- [[!wikipedia Triplestore]]
- [[!wikipedia Database]]
- [[!wikipedia “B+ Tree”]]
The plan was to use SGP, Sif and Tosaf as C++ tools to make Dilosi, then write Saugus as a C++ library and make a plugin to connect them. It was an all-C++ plan, where higher level languages, especially Python, are convenient wrappers for user applications to use.
Then things changed a bit, when I started to think about using Vala and/or Python for the graphical parts, instead of gtkmm. The idea of Python was cool in theory, but once I actually put it on my todo list I realized there would be no cool gtkmm code, only the different things that those languages provide. So I asked the question: why use C++ in the first place?
Then I started to think about functional programming. The whole domain of databases, while providing access to persistent data, is actually functional in the backend - it’s all algorithms and information handling. There’s no sophisticated object model other than what serves the database daemon. And the daemon is just a program that waits for input, and then executes a function and returns its output. It’s functional in nature.
And I started liking it in general. Imperative code creates a lot of unnecessary time dependencies between statements, even if they can be executed in parallel. Notations for helper variables aren’t just symbolic - they actually tell the compiler how to compile things into the binary. It’s in a sense a bit too close to the bare metal. Even high-level imperative-supporting languages are like that.
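The “wait for input, execute a function, return its output” view can be sketched as a pure request handler. This is only an illustration under assumptions: the request format, the `handle` and `run` names, and the use of an immutable `frozenset` as the store are all hypothetical, and Python stands in here for whichever functional language ends up being chosen.

```python
def handle(store, request):
    """Pure function: (store, request) -> (new_store, response).

    The store is an immutable frozenset of triples, so the handler
    never modifies its input; it returns a new store instead.
    """
    op, triple = request
    if op == "add":
        return store | {triple}, "ok"
    if op == "ask":
        return store, triple in store
    return store, "unknown operation"


def run(store, requests):
    # The daemon loop reduces to threading the store through the handler.
    responses = []
    for request in requests:
        store, response = handle(store, request)
        responses.append(response)
    return store, responses
```

Because `handle` has no side effects, the only state the daemon itself has to manage is the current store value passed from one request to the next.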
So I had a new idea: Finish SGP, Sif and Tosaf. But at the same time, read about functional languages and choose a fast compiled one. Options known to me that are worth checking:
- Haskell
- Mercury (logical language with functional support)
- OCaml
- Scheme/LISP, using a compiler and not a real-time interpreter
- Rust
The general direction is functional programming but with some sort of object oriented abstraction. Scheme can use pairs to build data structures, but maybe other languages have built-in tools for this.
Anyway, I want to follow 4store. If I start early enough, I’ll do it in C++ and move later if I choose to.
Here’s some text I wrote in October when I first opened the 4store source code.
From 4store to Saugus
Here I’ll be documenting my progress of creating libsaugus from the source code of 4store. I decided not to depend on external backends:
- SQL databases have unnecessary overhead and are not optimized for triples/quads
- Berkeley DB could work but I don’t feel comfortable with an Oracle dependency considering all the license and software freedom issues. And again, it’s not optimized for triples/quads. And it has high memory usage.
4store has more components than what I really need. I’m going to somehow take the backend and frontend code and make a single library. Maybe non-core things will become their own libraries.
Since there is no documentation of the code structure (unless I missed it), I’ll start building my code bottom-up. I’ll use the following git repos:
- 4store : clone of the 4store I have on the Partager server
- 4store-ripped : 4store from which I remove parts to help me track the progress
- 4store-fr33-lib : the code I’m writing
Method: Take .h files and do basic conversion to .hpp files. Then remove both the .h and corresponding .c from 4store-ripped.
Tasks and Ideas
- Is Content Addressable Storage a relevant concept? Imagine you use the DB as a file system but can find things fast not by structure (since there’s none, no hierarchy/tree) but by hashing the entities/relations/statements
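To explore the content-addressing idea above, here is a minimal sketch in which a statement is stored and found by a hash of its own content. All names (`statement_key`, `ContentStore`) are hypothetical, SHA-256 is just one possible hash, and the sketch assumes that the NUL separator never occurs inside a term.

```python
import hashlib


def statement_key(subject, predicate, obj):
    # Assumption: "\x00" never appears inside a term, so the join is unambiguous.
    data = "\x00".join((subject, predicate, obj)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical store addressed by statement hash, with no hierarchy."""

    def __init__(self):
        self._by_key = {}

    def put(self, subject, predicate, obj):
        key = statement_key(subject, predicate, obj)
        # Storing the same statement twice lands on the same key.
        self._by_key[key] = (subject, predicate, obj)
        return key

    def get(self, key):
        return self._by_key.get(key)


cas = ContentStore()
key = cas.put("alice", "knows", "bob")
```

One property worth noting is that identical statements always map to the same key, so duplicates collapse automatically. The open question from the task above remains: a hash lookup finds exact statements quickly, but pattern queries still need indexes of some kind.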