Mirror of the Rel4tion website/wiki source, view at <http://rel4tion.org>
[[!template id=ticket class=task done=yes]] [[!tag /projects/language-kort/decisions]]
[[!meta title="Model and framework Uid access in code"]]
Issue
There are two ways to represent a typed value in the Haskell code: either with the type specified statically, known to the compiler, or with the type specified as a Uid. In the second case, the value itself must have a generic representation, e.g. a String, and type-specific operations (such as arithmetic on numbers) are impossible.
[[/projects/smaoin-hs]] supports statically typed values in the data model. [[/projects/language-kort]]’s parse tree doesn’t. When converting between the model and the parse tree, it’s necessary to be able to:
- Determine the Uid of a given primitive Smaoin type
- Determine the data constructor and conversion required for a given value string with a type Uid
These conversions should probably be in [[/projects/razom-text-util]].
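The two conversions listed above could look roughly like this. It is a minimal sketch: the `Uid`, `Value` and `RawValue` types, the Uid strings and the textual forms of booleans are all assumptions made for illustration, not the actual `smaoin` or `language-kort` definitions.

```haskell
import Text.Read (readMaybe)

-- Hypothetical stand-in for a Smaoin Uid.
newtype Uid = Uid String deriving (Eq, Show)

-- Statically typed value: the compiler knows which primitive type is held.
data Value
    = VBoolean Bool
    | VNumber Integer
    | VString String
    deriving (Eq, Show)

-- Generically typed value, as a parse tree might hold it:
-- the type is a Uid and the value is plain text.
data RawValue = RawValue Uid String deriving (Eq, Show)

-- Made-up Uids of three primitive types, for illustration only.
uidBoolean, uidNumber, uidString :: Uid
uidBoolean = Uid "example-uid-boolean"
uidNumber  = Uid "example-uid-number"
uidString  = Uid "example-uid-string"

-- Conversion 1: determine the Uid of the primitive type of a value.
typeUid :: Value -> Uid
typeUid (VBoolean _) = uidBoolean
typeUid (VNumber _)  = uidNumber
typeUid (VString _)  = uidString

-- Assumed textual form of booleans.
parseBool :: String -> Maybe Bool
parseBool "true"  = Just True
parseBool "false" = Just False
parseBool _       = Nothing

-- Conversion 2: pick the data constructor and the parser
-- according to the type Uid attached to the value string.
fromRaw :: RawValue -> Maybe Value
fromRaw (RawValue uid s)
    | uid == uidBoolean = VBoolean <$> parseBool s
    | uid == uidNumber  = VNumber <$> readMaybe s
    | uid == uidString  = Just (VString s)
    | otherwise         = Nothing

-- The opposite direction: model value to parse tree value.
toRaw :: Value -> RawValue
toRaw v = RawValue (typeUid v) (render v)
  where
    render (VBoolean True)  = "true"
    render (VBoolean False) = "false"
    render (VNumber n)      = show n
    render (VString t)      = t
```

Both directions depend on the Uid constants being available as ordinary Haskell values, which is the question the rest of this page deals with.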
For these conversions to be possible, the Haskell code must have access to the Uids of the Smaoin primitive types, and be able to match them with static recognition of value types. How should this work exactly?
Process
The plain naive solution is to write the Uids in a list, directly in Haskell code. But this solution raises questions:
- Should the list be auto-updated when needed, and how?
- Why write in Haskell, instead of querying a parsed data document written in source form (i.e. Idan)?
- Should Uids be provided for all namespaces, or just Kadma, or just smaoin, or just Smaoin primitive types?
Similar cases to examine are:
- The Unicode character database (UCD) based sheets provided as Haskell functions in several packages (some use C functions, but then the sheets are provided with the C libraries used)
- RDF resource URIs provided as variables in Java RDF frameworks
- Variables for GUI widget identifiers (and other resources, like strings), auto-generated for Android Java app projects
Now some answers and thoughts.
List Updates
The list must either be the source document itself, or some other representation like Haskell code that is auto-generated. It’s possible to update the sheets when the source changes, and make a new release of the sheet package. For example, each [[/projects/Kadma]] release could have a matching release of a Haskell package containing the Uid sheets.
Conclusion: Everything except for the Idan source must be auto-generated.
Source Querying
Source querying can generally work. Actually, that’s what you’re supposed to do most of the time, right below the triplestore layer. Or you just query a triplestore. But there are two reasons not to do so in certain cases:
- If you need a specific resource, you can avoid going through complicated queries and parsers and B+ trees and lots of dependency packages and their recursive dependencies… by “caching” the Uid you need in Haskell code. Queries are generally meant for capturing entities for which some condition holds, while what we want here is to capture a specific entity.
- You may be writing the framework code itself. Source querying means that the whole multi-layer framework must not need any specific Uids, because that would create a cyclic dependency. The framework code needs a Uid, and getting the Uid requires using the tools supplied by the framework itself. For example, querying the Idan source for Smaoin primitive types in real time for Kort model conversions (which includes parsing) requires that Idan’s implementation doesn’t do that. It also creates an ugly wrong dependency.
Of course we can write the source in Kort, but then the Kort parser has a problem. And if we use some other format, like CSV, then we may as well use Haskell itself!
What about all the other parts? For example, a triplestore implementation. If it needs a specific Uid, can it read it from Idan source? Yes, it can, but then we have the first issue mentioned above, and the dependency on the Idan parser just for this simple little thing.
Conclusion: For all the framework parser code, we must use auto-generated Uids and they should be generated as Haskell source. For framework code that isn’t parsers, we can technically use source file querying, but there are issues of dependencies and the overhead of ugly long data pipelines. It seems a good idea to prepare Haskell sheets for use by at least the framework code.
Scope
Which Uids should be generated in Haskell sheets, and where should they be placed in the module hierarchy?
There can be one top-level module for the Uids, and under it submodules based on the namespace names. For example, the Uid of `myns:Thing` would be under `Data.Smaoin.Uid.Myns`.
Ideas for top-level module name:
- `Data.Smaoin.Uid`
- `Data.Uid`
- `Data.Resource`
- `Data.Smaoin.U`
- `Data.Smaoin.R`
- `Data.Kadma`
The problem is the name. Identifiers starting with an uppercase letter are used in Haskell for module names, type names and data constructors. Here are ideas:
- Use lowercase names. For example, `Myns.thing`. This can become a problem if there are identical names modulo letter case, for example there could be a class `Country` and a property `country :: Person -> Country`. It’s also a bit confusing to use lowercase for all labels in the Haskell code, because in Smaoin the case is used, at least in English, to distinguish easily between Classes and Objects.
- Use a prefix. Since in the English localization uppercase-starting names are used for classes, perhaps a ‘c’ prefix would make sense. For example, `Myns.cThing`. But this has issues too. It’s ugly and less intuitive with this weird ‘c’ there, and could cause issues with name similarity. For example - yes, I know it’s an unusual one - there could be a label `cLanguage` whose resource refers to the C programming language. Then its identifier is identical to `cLanguage`, the identifier generated for the Smaoin class `Language`. Can get very cConfusing.
- Use a leading underscore. For example, `Myns._Thing`. Yes, it works.
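The three options above can be compared in a short sketch. The `Resource` type here is a local stand-in with made-up Uid strings, not the real type from `smaoin`, and for simplicity both the class and the property are plain `Resource` values.

```haskell
-- Hypothetical stand-in for the Resource type of the smaoin package.
newtype Resource = Resource String deriving (Eq, Show)

-- Option 1, lowercase names: the class Country and the property country
-- would both have to be called "country", which is a duplicate
-- definition and therefore rejected by the compiler:
--
--   country = Resource "example-uid-class-country"
--   country = Resource "example-uid-property-country"

-- Option 2, 'c' prefix: the class Language becomes cLanguage, which is
-- the same identifier a lowercase label "cLanguage" would get.

-- Option 3, leading underscore: an underscore counts as a lowercase
-- letter in Haskell identifiers, so _Country is a valid variable name
-- and cannot clash with the lowercase label.
_Country :: Resource
_Country = Resource "example-uid-class-country"

country :: Resource
country = Resource "example-uid-property-country"
```

With the underscore, the original letter case of the label stays visible in the Haskell identifier.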
What about packages? One package per namespace or one huge package?
First, since namespaces and ontologies can be created independently, it must be okay to bring sheets from any package. The real question here is whether the definitions of Kadma should be:
- in one separate package?
- or in several per-namespace packages?
- or inside the `smaoin` package?
I think not mixing is a good idea, because of release versions and so on: keep the auto-generated part separate. I could add sheet-related utilities in the future, like the Unicode packages, which provide utility functions and not just plain lists or tuples, but those may be okay to add to the sheet package. For now, it’s just the sheets.
The name must reflect that:
- It’s a package of resources, not functionality
- These are Smaoin resource Uids, not images or audio etc.
- These are resources from Kadma, the Smaoin framework: just them, and all of them
Name suggestions:
- `smaoin-resources`
- `kadma-resources`
- `kadma`
- `ontology-smaoin`
- `vocabulary-smaoin`
- `smaoin-vocabulary`
- `vocabulary-kadma`
What about the type of the identifiers? String, Text, ByteString, Resource?
Since the canonical form is that of `Data.Smaoin`, it seems best to use it as-is, i.e. through `Resource`s.
Conversions
Some things are similar between Kort and Idan, or close to similar. So I put them in `razom-text-util`. But now I’m writing the conversions, and it doesn’t feel right. Why would the Kort-specific value regexes be useful for anything except Kort itself? Idan has differences, so it will need its own regexes anyway.
There may be some accepted/standard/universal/de-facto regexes for Smaoin values, but I don’t see a reason to put Kort regexes in a utility package while Idan maintains its own regexes. Let’s move them to `language-kort`.
Decisions
Package to provide the conversions: The general parts in `razom-text-util`, the rest in language packages.
Scope of Uids provided: Whatever the framework needs, starting with the specific ones required for Kort’s implementation.
Uids provided as: Haskell packages, auto-generated from their Idan sources. Kadma will have one such package, and ontologies etc. can be provided in separate packages or inside related software packages. Start with hand-written definitions, and implement auto-generation later. It needs an Idan parser anyway.
Package name for Kadma Uids: `vocabulary-kadma`.
Content of the sheets: Any entities are allowed, not just Uids. In other words, values can be used too. For example, numeric constants. Also, the resources specified can be “data objects”, not just ontology concepts.
Top level module for Uids: `Data.Smaoin.Vocabulary`. The last part could become `Vocab` but people will probably import it qualified as `V` anyway.
Data type for Uids: `smaoin`’s `Resource`.
Submodule naming: For now, take the namespace prefix and make the first letter uppercase. Maybe when I see some names I’ll prefer to use some kind of CamelCase style.
Identifier naming: For labels starting with a lowercase letter, use the label as an identifier. For labels starting with an uppercase letter, i.e. Smaoin classes, prefix the label with an underscore (`_`) to get the Haskell identifier.
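The submodule and identifier naming rules decided above are simple enough to sketch as functions, which a future sheet generator could reuse. The function names are hypothetical, and the sketch assumes labels and prefixes contain only characters that are valid in Haskell identifiers.

```haskell
import Data.Char (isUpper, toUpper)

-- Top-level module chosen for the vocabulary sheets.
topLevelModule :: String
topLevelModule = "Data.Smaoin.Vocabulary"

-- Submodule naming: take the namespace prefix and make the first
-- letter uppercase.
submoduleName :: String -> String
submoduleName []       = []
submoduleName (c : cs) = toUpper c : cs

-- Full module name for a namespace prefix such as "myns".
moduleName :: String -> String
moduleName prefix = topLevelModule ++ "." ++ submoduleName prefix

-- Identifier naming: labels starting with an uppercase letter get a
-- leading underscore, all other labels are used as they are.
identifierName :: String -> String
identifierName label@(c : _)
    | isUpper c = '_' : label
identifierName label = label

-- Fully qualified Haskell name for a namespace prefix and a label.
qualifiedName :: String -> String -> String
qualifiedName prefix label = moduleName prefix ++ "." ++ identifierName label
```

For example, `qualifiedName "myns" "Thing"` gives `Data.Smaoin.Vocabulary.Myns._Thing`.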
Plan
- Create project `vocabulary-kadma` and put the definitions needed for Kort there, hand-written, copied from the Kadma Idan sources. Make release 0.1.0.0 of it. For now releases will have the regular 4 digit scheme. Maybe later they will just follow the release numbers of Kadma itself.
- Implement in `razom-text-util` and `language-kort` the conversions needed for Kort, and make it build with `smaoin` 0.2.0.0
- Finish implementing `language-kort`, write tests, make a release
Results
Done.