The Forum for Discussion about The Third Manifesto and Related Matters


Unconventional rdbms applications

Expanding on Anthony's comment

b) there's no 'natural' business identifier that could be used as a key.

  1. The AST for a language parser.
  2. The chemical bonding structure for organic/long chain molecules.

Suppose such a key exists; how useful would it be? In my experience, when querying parse trees, the basic query primitive is the predicate:

"is grammar symbol abc recognized at the interval [x,y]?"

with intervals identified with parse tree nodes. It is impossible to aggregate the answers to all such questions, for a given AST/parse tree, into a primitive scalar value.
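To make that predicate concrete, here's a minimal sketch of a parse "chart" held as a set of (symbol, start, end) facts, roughly as a CYK-style recognizer would build it. The chart contents and names here are invented for illustration, not taken from any particular parser:

```python
# A parse chart as a set of (symbol, start, end) facts.
# Each fact says: grammar symbol `symbol` is recognized over [start, end).
chart = {
    ("Expr", 0, 5),
    ("Term", 0, 2),
    ("Op",   2, 3),
    ("Term", 3, 5),
}

def recognized(symbol, x, y):
    """Is grammar symbol `symbol` recognized at the interval [x, y]?"""
    return (symbol, x, y) in chart

print(recognized("Expr", 0, 5))  # True
print(recognized("Expr", 0, 2))  # False
```

Note the chart itself is already relation-shaped: a set of tuples over (symbol, start, end), with all three attributes jointly forming the key.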

 

Unlike case #1, it is much harder to offer a compelling database design for chemistry. Somewhat counterintuitively, however, database modelling of organic chemistry in general, and biochemistry in particular, might be easier. For example, proteins are words over an alphabet of 20 or so letters, which suggests parsing technologies...
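The "words over a 20-letter alphabet" view can be sketched in a few lines, using the standard one-letter amino-acid codes (the example sequences are made up):

```python
# Proteins as words over the 20-letter amino-acid alphabet (one-letter codes).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def is_protein_word(seq):
    """Is `seq` a non-empty word over the standard amino-acid alphabet?"""
    return len(seq) > 0 and set(seq) <= AMINO_ACIDS

print(is_protein_word("MKTAYIAKQR"))  # True
print(is_protein_word("MKTXZ"))       # False: X and Z are not in the alphabet
```

Once proteins are words, grammar-based techniques (and the chart-style queries above) apply to them directly.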

 

Does anybody here have any experience working with "complex domain" RDBMS applications?


Quote from Tegiri Nenashi on June 18, 2019, 8:12 pm

Expanding on Anthony's comment

b) there's no 'natural' business identifier that could be used as a key.

  1. The AST for a language parser.
  2. The chemical bonding structure for organic/long chain molecules.

Hi Tegiri, I mentioned those two in particular because they're good examples of structures to represent as tagged unions. What identifies each node is 'naturally' its position in the tree or long chain, and only that.

If you want to persist chemical bondings, typically the tree is serialised to XML/JSON. An AST is typically ephemeral: used by a compiler only as an intermediate step in generating object code. (And the original program/text source is the persistence layer.)
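A minimal sketch of serialising a bonding tree to JSON, using Python's standard json module. The tree shape (atom plus a list of bonds with orders) is a hypothetical schema, chosen only to illustrate the round trip:

```python
import json

# A (hypothetical) bonding tree for formaldehyde-like structure: C bonded
# to two H (single bonds) and one O (double bond).
molecule = {
    "atom": "C",
    "bonds": [
        {"order": 1, "to": {"atom": "H", "bonds": []}},
        {"order": 1, "to": {"atom": "H", "bonds": []}},
        {"order": 2, "to": {"atom": "O", "bonds": []}},
    ],
}

text = json.dumps(molecule)           # serialise for persistence
assert json.loads(text) == molecule   # and the round trip is lossless
```

Note this only works cleanly for tree-shaped bonding; ring structures would need explicit node identifiers to break the cycles.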

Suppose such a key exists; how useful would it be? In my experience, when querying parse trees, the basic query primitive is the predicate:

"is grammar symbol abc recognized at the interval [x,y]?"

Yes, parse trees typically contain metadata about which piece of text they're a parse of. (That's for showing in error messages, for example.) But assuming the source program is syntactically valid and well-typed, the next step is to transform the tree for optimisation/code generation. Then sub-terms might get 'floated out', an optimisation might 'push restrict through join', terms might get rewritten to semantically equivalent expressions, and so on; after that, the transformed tree no longer relates to the source text.
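For a flavour of why such rewrites divorce the tree from the source text, here's a toy 'push restrict through join' on a tiny tuple-encoded algebra AST. The node shapes and relation names are invented, and it only handles the hedged case where the predicate mentions attributes of the left operand only:

```python
# AST nodes: ("restrict", predicate, expr) or ("join", left, right);
# leaves are relation names as strings. All names here are illustrative.

def push_restrict(node):
    """Rewrite restrict(p, join(L, R)) to join(restrict(p, L), R),
    assuming p mentions only attributes of L."""
    if (isinstance(node, tuple) and node[0] == "restrict"
            and isinstance(node[2], tuple) and node[2][0] == "join"):
        pred, (_, left, right) = node[1], node[2]
        return ("join", ("restrict", pred, left), right)
    return node  # no rewrite applies

before = ("restrict", "city='Paris'", ("join", "S", "SP"))
after = push_restrict(before)
# after == ("join", ("restrict", "city='Paris'", "S"), "SP")
```

The rewritten tree is semantically equivalent, but any source-interval metadata attached to the original restrict node no longer lines up with the new shape.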

with intervals identified with parse tree nodes. It is impossible to aggregate the answers to all such questions, for a given AST/parse tree, into a primitive scalar value.

Not quite sure what you mean by a "primitive" scalar value? TTM has scalars (some of which are system-defined) and non-scalars. If you were trying really hard to represent the parse tree in a TTM-style database, you might use RVAs (relation-valued attributes) to represent the nesting of sub-terms. Or if you could identify a key for each node, you could 'flatten' the tree with a spider's web of foreign keys.
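The 'flattening' option can be sketched as follows: each node gets a surrogate id, and a parent_id foreign key records its position. The AST encoding and the surrogate-key scheme are illustrative assumptions, not a prescription:

```python
from itertools import count

def flatten(node, rows, parent_id, next_id):
    """Flatten a tuple-encoded AST ("op", child, ...) into rows keyed by a
    surrogate id, with parent_id as the foreign key. Leaves are strings."""
    node_id = next(next_id)
    op, children = node[0], node[1:]
    rows.append({"id": node_id, "parent_id": parent_id, "op": op})
    for child in children:
        if isinstance(child, tuple):
            flatten(child, rows, node_id, next_id)
        else:
            rows.append({"id": next(next_id), "parent_id": node_id, "op": child})
    return rows

ast = ("+", ("*", "a", "b"), "c")   # the expression a*b + c
rows = flatten(ast, [], None, count(1))
# rows is now a flat 'relation' of five tuples; the root has parent_id None.
```

The catch, as noted above, is that the surrogate ids identify nothing but position: rebuild the tree after a transformation and you get a fresh web of keys.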

Unlike case #1, it is much harder to offer a compelling database design for chemistry. Somewhat counterintuitively, however, database modelling of organic chemistry in general, and biochemistry in particular, might be easier. For example, proteins are words over an alphabet of 20 or so letters, which suggests parsing technologies...

As I understand it, that letter encoding is a sort of shorthand that works because proteins have a regular structure/sequence. Organic chemicals in general exhibit a diversity of (crystalline) structures that can't necessarily be represented as a sequence.

Does anybody here have any experience working with "complex domain" RDBMS applications?

Yes, everybody here. Depending on what you mean by "complex domain". I've worked chiefly on ERPs for process manufacturing (which stretches as far as patient flows through hospitals) and distribution/stock control, oh and multi-dimensional/multi-currency accounting (which I'd describe as relatively non-complex).

Quote from AntC on June 19, 2019, 12:35 am
Quote from Tegiri Nenashi on June 18, 2019, 8:12 pm

Does anybody here have any experience working with "complex domain" RDBMS applications?

Yes, everybody here. Depending on what you mean by "complex domain". I've worked chiefly on ERPs for process manufacturing (which stretches as far as patient flows through hospitals) and distribution/stock control, oh and multi-dimensional/multi-currency accounting (which I'd describe as relatively non-complex).

Don't you mean, nobody here?

My impression is that most, if not all, of us regulars have a background mainly working with SQL and canonical SQL types for relatively conventional business/enterprise applications. These would generally not be considered to have complex domains, assuming "complex domain" means a non-trivial type.

Complex domains/types are something that we (and the TTM-related books by Date & Darwen) have sometimes touched on in glancing and largely hypothetical terms. But I don't recall anyone posting about intending to use/develop a D to store and retrieve images based on similarity, store and retrieve DNA sequences, implement text retrieval based on semantic connections (a few years ago, I did build a non-relational search engine product based on this), store and retrieve protein complexes, etc.

By the way, I certainly don't mean to deprecate conventional business/enterprise applications. Not at all. I'm only pointing out where our experience appears to mainly lie.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 19, 2019, 1:07 am
Quote from AntC on June 19, 2019, 12:35 am
Quote from Tegiri Nenashi on June 18, 2019, 8:12 pm

Does anybody here have any experience working with "complex domain" RDBMS applications?

Yes, everybody here. Depending on what you mean by "complex domain". I've worked chiefly on ERPs for process manufacturing (which stretches as far as patient flows through hospitals) and distribution/stock control, oh and multi-dimensional/multi-currency accounting (which I'd describe as relatively non-complex).

Don't you mean, nobody here?

My impression is that most, if not all, of us regulars have a background mainly working with SQL and canonical SQL types for relatively conventional business/enterprise applications. These would generally not be considered to have complex domains, assuming "complex domain" means a non-trivial type.

Ah, from the wording of Tegiri's question, I took it that "domain" wasn't meant in the technical/database/algebraic sense of a data type, but as in 'business domain' or 'functional domain'.

Complex domains/types are something that we (and the TTM-related books by Date & Darwen) have sometimes touched on in glancing and largely hypothetical terms. But I don't recall anyone posting about intending to use/develop a D to store and retrieve images based on similarity, store and retrieve DNA sequences, implement text retrieval based on semantic connections (a few years ago, I did build a non-relational search engine product based on this), store and retrieve protein complexes, etc.

By the way, I certainly don't mean to deprecate conventional business/enterprise applications. Not at all. I'm only pointing out where our experience appears to mainly lie.

If the question was in the sense of 'business domain', why should enterprise applications exhibit any less complexity than DNA sequencing? We don't need to be apologetic, or imagine there are database structuring requirements beyond our compass.

Just because research scientists look down their noses at SQL doesn't mean their software tools are any better. Most of them seem to hack in Perl forgawdssake; and seem to think that if a statistical tool finds a multivariate correlation, they've 'discovered' some insight other than the biases and ignorance that went into collecting the data.

Quote from AntC on June 19, 2019, 6:51 am

Just because research scientists look down their noses at SQL doesn't mean their software tools are any better. Most of them seem to hack in Perl forgawdssake; and seem to think that if a statistical tool finds a multivariate correlation, they've 'discovered' some insight other than the biases and ignorance that went into collecting the data.

Nothing wrong with hacking in Perl.  That's one of the most popular languages for doing DNA analysis.