Mirror of the Rel4tion website/wiki source, view at <http://rel4tion.org>
tasks-and-ideas.mdwn
- Content
- Block
- Text
- ComputerLanguageText (can be deduced, i.e. if a Content hasSyntax)
- ProgramCode (can be deduced as discussed) + Script (can be deduced, i.e. this is the intersection of ProgramCode and Executable)
- PythonCode
- Image
- RasterImage
- VectorGraphic
- Audio
- SoundWave
- SpectralComposition
- Video
- Executable
- ByteCode
- Binary
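The "can be deduced" remarks in the tree above could be sketched as a small inference step. This is only an illustration in Python, assuming a resource is described by a set of asserted class names plus a flag saying whether it has a syntax; the function name `deduce_classes` and this representation are hypothetical, not part of the model.

```python
def deduce_classes(asserted, has_syntax=False):
    """Return the asserted classes plus the ones the tree marks as deducible."""
    classes = set(asserted)
    # ComputerLanguageText is deduced when a Content has a syntax.
    if has_syntax:
        classes.add("ComputerLanguageText")
    # Script is the intersection of ProgramCode and Executable.
    if {"ProgramCode", "Executable"} <= classes:
        classes.add("Script")
    return classes


# Hypothetical resources, for illustration only.
python_file = deduce_classes({"Text", "ProgramCode", "Executable"}, has_syntax=True)
compiled = deduce_classes({"Executable", "Binary"})
```

Here `python_file` ends up with Script and ComputerLanguageText without either being asserted, while `compiled` gains nothing.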
NOTES
- Interfaces
Model APIs, interfaces and implementations, especially so I can implement the storage backend interface. First read about D-Bus and learn from the experience gained while implementing it, and then create my own model. TODO
- Name
Since this is just the data model for now, re-defining everything about computing, I want the name to mean “genesis”. Start here: https://en.wiktionary.org/wiki/beginning. Done: the name suggested right now is Kiwi.
- Permissions
Read about Linux file permissions and understand how they work, with their strengths and weaknesses, so I can add a universal extensible model to my wiki model. TODO
- Versions
Allow the document/file version vocabulary to specify draft and release versions TODO
- Data
It may be good to have a Data class under Content, for example for XML and YAML files containing data structures. It makes sense because XML is just a textual representation, while data can also be binary-encoded. So an XML file is always Text, but it is also something else: for example, it can encode a process, a document (still text, of course), an SVG image and so on. TODO
- Representation, Encoding and Syntax
The model is !!!flawed!!!. Take a LaTeX file which represents an article. Both are text, so you put your object under Text. But then do you make it an Article? If you do, how does one tell whether the source is an article, or a generated form is an article? And how is it generated? IDEA: Use Content for the “generated” form, i.e. the semantics. For example, SVG in the Content sense is just an image.
Then you can add information about the actual syntax of the file and whether the data is specified in a textual form.
In other words, Content specifies which resources you have, e.g. a PDF and its LaTeX source both represent an article. It’s true the LaTeX is just a source, but in practice even viewing a PDF means rendering, so both require computation anyway.
I have 2 ideas:
Use the Content tree to represent mixed concepts, e.g. a LaTeX file is both plain-text and a LaTeX file and a book/article/document, and if all of them map to a file being Text that’s okay. It just means you’ll find it under your documents and under your plaintext and under your articles. Excellent. Then, allow a Block to represent something and to be encoded as something. For example, a LaTeX file is encoded as text, has LaTeX syntax and represents e.g. a book.
Use Content as a deduced class, i.e. what you actually define is a Block and specify the encoding and what the Block represents and which processes transform it into other forms. Then it is deduced that a plaintext document is Text, and that a PDF file - even though PDF is binary-encoded - is Text, and so on. Actually, FIX: PDF is not always Text, e.g. a PDF can be just an image with no text at all. THINK ABOUT IT.
Okay, let’s try to model this better. A file has two things: Semantics and syntax. Semantics is what it represents (e.g. an HTML file represents a webpage), and syntax is how it expresses the represented data (e.g. HTML text).
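The semantics/syntax split could be sketched as two separate fields on a file description. This is a minimal Python sketch under assumptions: the class `FileDescription`, its field names and the example file names are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileDescription:
    name: str
    represents: str  # semantics: what the file stands for
    syntax: str      # syntax: how the represented data is expressed


# Illustrative files only; none of these names come from the model itself.
page = FileDescription("index.html", represents="Webpage", syntax="HTML")
source = FileDescription("paper.tex", represents="Article", syntax="LaTeX")
rendered = FileDescription("paper.pdf", represents="Article", syntax="PDF")


def same_semantics(a, b):
    """Two files mean the same thing if they represent the same concept."""
    return a.represents == b.represents
```

With this split, the LaTeX source and the PDF compare equal on semantics while differing in syntax, which is the distinction the earlier Article question was missing.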
However, note that LaTeX is general-purpose so it can be converted into many things: PDF, HTML, plain text… I want to have abstract concepts for what it represents, regardless of the specific format used. In other words it depends only on the LaTeX source, so it exists even if the source is never converted to anything else.
Now, here’s something else: There is no strict binding to syntax or semantics!!! For example, what is the syntax of a PNG image? It’s a binary format, right? But it can still be viewed using a hex editor, meaning that every file is also a binary file. This suggests something: While there are “how represented” and “what means” semantics, there is absolutely no difference to the computer between one form of rendering (e.g. render SVG as text in Gedit using the Pango library via GtkSourceView) and another (e.g. render the SVG as a vector graphic using librsvg).
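To illustrate that point, the same bytes can be passed through two different "renderers" with no special status for either. A small Python sketch, assuming a tiny SVG document as input; the function names are hypothetical, and the third rendering (as an actual vector graphic, e.g. through librsvg) is not shown.

```python
svg_bytes = b'<svg xmlns="http://www.w3.org/2000/svg"/>'


def render_as_hex(data):
    """Hex-editor style view: every byte as two hex digits."""
    return " ".join(f"{byte:02x}" for byte in data)


def render_as_text(data, encoding="utf-8"):
    """Text-editor style view: decode the bytes as characters."""
    return data.decode(encoding)


hex_view = render_as_hex(svg_bytes)
text_view = render_as_text(svg_bytes)
```

Both views are computed from exactly the same binary content; which one is "the" file is a matter of which process is applied to it.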
I need to read about MIME types, but look:
- YAML is application/yaml
- XML is application/xml
- SVG is image/svg+xml
- RDF is application/rdf+xml
MIME types assign a single type to each file, so they somehow try to encode the dual nature of XML files, which are both text-editable and renderable, e.g. to an image or a PDF. I want my model to reflect both in a uniform way.
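The way a single MIME type squeezes in both faces can be seen by splitting it into its parts: the top-level type, the subtype and the optional structured syntax suffix after the `+`. A minimal Python sketch using plain string handling; `split_mime` is a hypothetical helper, not a library function.

```python
def split_mime(mime):
    """Split 'type/subtype+suffix' into its three parts (suffix may be None)."""
    top, _, subtype = mime.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        # No '+' present: rpartition puts the whole subtype in `suffix`.
        return top, subtype, None
    return top, base, suffix


facets = {m: split_mime(m) for m in (
    "application/yaml",
    "application/xml",
    "image/svg+xml",
    "application/rdf+xml",
)}
```

For `image/svg+xml` this gives the "what it means" facet (`image`, `svg`) and the "how it is written" facet (`xml`) side by side, whereas `application/xml` carries only the second.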
So here is some theory: Each File is a BinaryBlock, and each BinaryBlock is a Block. Each BinaryBlock can be edited using a hex editor, which means all real files have this facet regardless of anything else they mean. The exception is “fake” files such as directories (they have the inode/directory MIME type).
Now, the binary digits can be used to encode things that are easier to work with. For example, an XML file can encode anything: It’s a text-based way to represent and encode information. However, it’s merely a subset of the wider “text” encoding, i.e. using binary digits to encode text characters using ASCII or UTF-8. Since an XML file is the structure, and it is not permanently bound to a specific encoding, the two are specified separately.
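The separation between structure and encoding can be checked directly: the same XML text produces different binary digits under different character encodings, yet decodes back to the same text. A short Python sketch, with an arbitrary example string chosen only for illustration.

```python
xml_text = '<note lang="en">héllo</note>'

as_utf8 = xml_text.encode("utf-8")
as_utf16 = xml_text.encode("utf-16")

# Different binary digits...
bytes_differ = as_utf8 != as_utf16
# ...but the same text, and therefore the same XML structure, once decoded.
same_text = as_utf8.decode("utf-8") == as_utf16.decode("utf-16") == xml_text
```

So the XML structure would be recorded once, and the character encoding recorded as a separate property of the BinaryBlock that stores it.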
Here’s some tree work:
- Content
- Block
- BinaryBlock
- Text
- XML
But what about a text-only PDF? Is Text plain-text or anything textual including an image? Also, why can’t a binary sequence be something that is not a Block, just like Text doesn’t have to be a Block in general? IDEA: Every file can be opened with a text editor, even if it’s not meant to be text, and some portions may be readable. So the point here is to say “this file is safe for opening with a text editor”, and not “this file contains only text, even if this text is rendered as a JPG image”. So Text will have the “plain text” meaning, like the text/* group of MIME types.
Let’s try to rework the tree. First the binary issue (done, moving this to diagram):
+ Content
+ BinarySequence
+ BinaryBlock
+ Block
@ BinaryBlock
So BinaryBlock has two superclasses, and a binary sequence is now a separate concept not tied to Blocks.
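The two-superclass arrangement maps naturally onto multiple inheritance. A minimal Python sketch, assuming BinarySequence and Block both sit directly under Content (the flattened tree does not make the nesting fully explicit); the classes are empty placeholders.

```python
class Content:
    """Root of the tree."""


class BinarySequence(Content):
    """A sequence of binary digits, not necessarily stored as a Block."""


class Block(Content):
    """A Block, not necessarily binary."""


class BinaryBlock(BinarySequence, Block):
    """Both a binary sequence and a Block."""


# Linearised ancestry of BinaryBlock, most specific first.
mro_names = [cls.__name__ for cls in BinaryBlock.__mro__]
```

BinaryBlock is then found both under BinarySequence and under Block, while a BinarySequence on its own is not a Block.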
The text issue (done, moving this to diagram):
- Content
- Document
- OpenDocument
- M$ DOC
- M$ DOCX
- Abiword document
- Text
- XML + RDF + SVG (also an image)
- YAML + JSON