The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Meaning of a relation

Hi! Before I muddle into what may be real questions or perhaps just muddled thinking or rubber-ducking, let me introduce myself.

My name is Torbjörn (or Tobe) and I have been programming for 40 years in varying fields: cryptanalysis, customs & excise, real-time video-calls and telephony & networks. I am not very academic although I was extremely close to completing my bachelors in Mathematics and Mathematical Statistics and have read about computer-sciency things as interest and need has taken me.

Many years ago, close to the turn of the century, I read a short article on something called "rel" that was supposedly much more mathematicaly correct than SQL and filed it away in my head as something to follow up on at a future time. At about the same time I was hugely enjoying working with XSLT and started having an idea of creating a programming language that allowed programming in an XSLT-like way with template-like transforms, and that idea lay and baked for about 15 years.

In 2018, inspired by a colleague who was using SQL to implement some business logic (nobody else understood any of it), I decided to try to do the AdventOfCode in SQL (on mariadb) to hopefully learn something interesting. It turned out to be the wrong year for that with very large assignments and lots of iteration, but I did 9 head-hurting and mind-expanding days. As a side note, I had the idea of maybe doing 2020 in Rel, but it seemed it might be a little hard to parse the input in Rel.

Anyway, at that time I finally got the impetus to start developing a syntax for my language, and the following year I had a working interpreter. There has always been an idea that I wanted to provide relations as a native construct, perhaps instead of Maps and Sets, and I have now started looking at some of the 2020 AdventOfCode problems and re-implementing them using relational algebra, adding syntax for it to Tailspin, as my language is called.

But to implement, I have to understand things well enough to not go counter to the logic but also be able to provide a clear and usable syntax, which brings me to my current confusion while reading up on Algebra A.

I see how the <or> operator functions in a way that would seem to comply with de Morgan's laws and it's quite elegant to be able to do everything with NAND (as when dabbling in electronics). But what does it really mean to take one relation <or> another?

When I've used "union", it has always meant "Oh, btw, these things are also true in the same way but derived from other sources", which doesn't really seem like "or", so is <or> really a good name?

Considering the example in A on <or>, assuming we have departments [B,C,D] and projects [J,K,L], when taking {{employee: E, department: D}} <or> {{employee: E, project:J}} we would get, IIUC:

{{employee: E, department: D, project: J},{employee: E, department: D, project: K},{employee: E, department: D, project: L},{employee: E, department: B, project: J},{employee: E, department: C, project: J}}

where each tuple should mean "the employee works in the given department OR on the given project". But the tuple is indistinguishable from a tuple with the normal(?) interpretation "the employee works in the department on the given project". So what is missing? How do we keep track of when indistinguishable things have different meanings and what do these different meanings imply for further relational operations?

If I now project on employee and department, that tells me pretty much nothing, doesn't it? Even worse, I now have to accept that not every tuple is true, I now have to interpret the relation itself as having <or> between the tuples, so to speak. Which actually works with the "normal" interpretation of the tuples above, so maybe that should have been the interpretation all along, i.e. at least one of these tuples is true? But again, how to distinguish that from the interpretation that all tuples are true and what does it imply for further operations?

Quote from tobega on January 10, 2021, 8:54 pm

... Many years ago, close to the turn of the century, I read a short article on something called "rel" that was supposedly much more mathematicaly correct than SQL and filed it away in my head as something to follow up on at a future time.

Around 2004?

That was when I first released Rel.

At about the same time I was hugely enjoying working with XSLT ...

You're the first person I've heard of to admit enjoying XSLT. :-)

Apologies for not (yet) addressing the core of your post, but it's late. I'll try to address it tomorrow, assuming someone else doesn't cover the same territory first.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from tobega on January 10, 2021, 8:54 pm

My name is Torbjörn (or Tobe) and I have been programming for 40 years in varying fields: cryptanalysis, customs & excise, real-time video-calls and telephony & networks. I am not very academic although I was extremely close to completing my bachelors in Mathematics and Mathematical Statistics and have read about computer-sciency things as interest and need has taken me.

You should be right at home here.

Many years ago, close to the turn of the century, I read a short article on something called "rel" that was supposedly much more mathematicaly correct than SQL and filed it away in my head as something to follow up on at a future time. At about the same time I was hugely enjoying working with XSLT and started having an idea of creating a programming language that allowed programming in an XSLT-like way with template-like transforms, and that idea lay and baked for about 15 years.

Dave will say this better, but you misunderstood. The program 'Rel' is an implementation of a language called 'Tutorial D', which is described in the writings of Darwen and Date as a tutorial language, which in turn largely conforms to the 'D' language specification provided in the Third Manifesto, which (amongst other things) implement the relational algebra of Codd, on which SQL was originally but not faithfully based. In short, the RA is 'mathematically correct', and Rel/TD is closer to the RA than is SQL.

In 2018, inspired by a colleague who was using SQL to implement some business logic (nobody else understood any of it), I decided to try to do the AdventOfCode in SQL (on mariadb) to hopefully learn something interesting. It turned out to be the wrong year for that with very large assignments and lots of iteration, but I did 9 head-hurting and mind-expanding days. As a side note, I had the idea of maybe doing 2020 in Rel, but it seemed it might be a little hard to parse the input in Rel.

These are not good languages for the purpose. The parsing is easy (just embed the strings by hand), but the best choice is usually a language with heaps of pre-written library functions: Python, Java/C#, JS. But you will certainly learn the language, the hard way.

Anyway, at that time I finally got the impetus to start developing a syntax for my language, and the following year I had a working interpreter. There has always been an idea that I wanted to provide relations as a native construct, perhaps instead of Maps and Sets, and I have now started looking at some of the 2020 AdventOfCode problems and re-implementing them using relational algebra, adding syntax for it to Tailspin, as my language is called.

There is no single agreed relational algebra. My version of the (Extended) RA has 9 query operators: SPJRUN (select, project, rename, join, union, not), new, while and aggregate, plus assignment.

What's yours?

But to implement, I have to understand things well enough to not go counter to the logic but also be able to provide a clear and usable syntax, which brings me to my current confusion while reading up on Algebra A.

Algebra A has very low level operations intended to be combined into shorthands that constitute a language. IIRC, depending on headings, <and> is join, intersect or cross product and <or> is union.

I see how the <or> operator functions in a way that would seem to comply with de Morgan's laws and it's quite elegant to be able to do everything with NAND (as when dabbling in electronics). But what does it really mean to take one relation <or> another?

When I've used "union", it has always meant "Oh, btw, these things are also true in the same way but derived from other sources", which doesn't really seem like "or", so is <or> really a good name?

The RA union of two relations with the same heading is all the tuples from both with duplicates removed. The RA does not define it for relations with different headings.

Considering the example in A on <or>, assuming we have departments [B,C,D] and projects [J,K,L], when taking {{employee: E, department: D}} <or> {{employee: E, project:J}} we would get, IIUC:

This is not a valid operation in the RA. Best not to go there.

{{employee: E, department: D, project: J},{employee: E, department: D, project: K},{employee: E, department: D, project: L},{employee: E, department: B, project: J},{employee: E, department: C, project: J}}

where each tuple should mean "the employee works in the given department OR on the given project". But the tuple is indistinguishable from a tuple with the normal(?) interpretation "the employee works in the department on the given project". So what is missing? How do we keep track of when indistinguishable things have different meanings and what do these different meanings imply for further relational operations?

If I now project on employee and department, that tells me pretty much nothing, doesn't it? Even worse, I now have to accept that not every tuple is true, I now have to interpret the relation itself as having <or> between the tuples, so to speak. Which actually works with the "normal" interpretation of the tuples above, so maybe that should have been the interpretation all along, i.e. at least one of these tuples is true? But again, how to distinguish that from the interpretation that all tuples are true and what does it imply for further operations?

For your purposes, I would recommend sticking to the 10 operators listed above. They can do everything in SQL as well as everything in Rel/TD and more besides. That will keep you busy for many months.

 

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on January 10, 2021, 10:41 pm
Quote from tobega on January 10, 2021, 8:54 pm

... Many years ago, close to the turn of the century, I read a short article on something called "rel" that was supposedly much more mathematicaly correct than SQL and filed it away in my head as something to follow up on at a future time.

Around 2004?

That was when I first released Rel.

Yes, that would be about right.

At about the same time I was hugely enjoying working with XSLT ...

You're the first person I've heard of to admit enjoying XSLT. :-)

Well, xml is a pain to work with any other way ;-)

Apologies for not (yet) addressing the core of your post, but it's late. I'll try to address it tomorrow, assuming someone else doesn't cover the same territory first.

 

From dandl:

"Algebra A has very low level operations intended to be combined into shorthands that constitute a language. IIRC, depending on headings, <and> is join, intersect or cross product and <or> is union."

To be clear, A is not intended for implementation.  It is not computationally "safe".  It was designed to make it easier to define relational operators in a computationally feasible language such as Tutorial D, and we did indeed use it for that purpose.  A study of A will help you to study Tutorial D.  It might also give you some insight into the "meaning of a relation".  That was my idea in using operator names taken from propositional calculus.

Hugh

Coauthor of The Third Manifesto and related books.

Anyway, at that time I finally got the impetus to start developing a syntax for my language, and the following year I had a working interpreter. There has always been an idea that I wanted to provide relations as a native construct, perhaps instead of Maps and Sets, and I have now started looking at some of the 2020 AdventOfCode problems and re-implementing them using relational algebra, adding syntax for it to Tailspin, as my language is called.

There is no single agreed relational algebra. My version of the (Extended) RA has 9 query operators: SPJRUN (select, project, rename, join, union, not), new, while and aggregate, plus assignment.

What's yours?

Well, I'm here, so I suppose my first approximation would aim at creating a "D", which I suppose leads me to a set-based, null-free RA, and probably "A", essentially, but I'm happy to consider other options if it helps make more semantic sense.

Thinking a bit more about this, I think it is certainly fine to have a relation of the kind "one of these tuples is true", as for example I would be perfectly comfortable with having a set of possible values for a sudoku square.

That in turn tells us that there are at least two types of relations with respect to logic, "all of" and "one of" and it might be interesting to see how that categorization plays out under different operations.

I am thinking of analogies to numbers having units, where it only makes sense to add like units and multiply by unitless scalars, while multiplying by other units changes the unit.

Quote from tobega on January 11, 2021, 12:05 pm

Anyway, at that time I finally got the impetus to start developing a syntax for my language, and the following year I had a working interpreter. There has always been an idea that I wanted to provide relations as a native construct, perhaps instead of Maps and Sets, and I have now started looking at some of the 2020 AdventOfCode problems and re-implementing them using relational algebra, adding syntax for it to Tailspin, as my language is called.

There is no single agreed relational algebra. My version of the (Extended) RA has 9 query operators: SPJRUN (select, project, rename, join, union, not), new, while and aggregate, plus assignment.

What's yours?

Well, I'm here, so I suppose my first approximation would aim at creating a "D", which I suppose leads me to a set-based, null-free RA, and probably "A", essentially, but I'm happy to consider other options if it helps make more semantic sense.

Yes, the aim is to create a 'D'. In doing that you have to decide on:

  • a type system: roughly as per TTM, although there is room for some variation.
  • a set of relational operators: not explicitly defined by TTM, 6-9 query operators, plus assignment.
  • a set of general language features: don't invent the wheel, copy something you know and like.
  • a set of database-specific features: roughly as per TTM, mainly to do with the catalog and operations on it.

Its a big job, probably 6-12 months. Are you up for it? There are easier roads to follow.

Andl - A New Database Language - andl.org
Quote from tobega on January 10, 2021, 8:54 pm

Welcome Tobe.

... Algebra A.

I see how the <or> operator functions in a way that would seem to comply with de Morgan's laws and it's quite elegant to be able to do everything with NAND (as when dabbling in electronics). But what does it really mean to take one relation <or> another?

(I don't see any other reply has answered that q so far. Apologies if I missed it.) First we need to know what each of the participating relations mean. In the true RA (not SQL), we give a predicate for each relation. That predicate must refer only to the attributes in the relation. We then take the tuples present in the relation, substitute each tuple's attribute values into the predicate, and get a load of statements that the relation is asserting to be true. No relation has meaning without knowing what that predicate is.

When I've used "union", it has always meant "Oh, btw, these things are also true in the same way but derived from other sources", which doesn't really seem like "or", so is <or> really a good name?

Yes. Indeed <OR> is just about the only possible name. SQL's UNION is not <OR>.

Considering the example in A on <or>, assuming we have departments [B,C,D] and projects [J,K,L], when taking {{employee: E, department: D}} <or> {{employee: E, project:J}} we would get, IIUC:

{{employee: E, department: D, project: J},{employee: E, department: D, project: K},{employee: E, department: D, project: L},{employee: E, department: B, project: J},{employee: E, department: C, project: J}}

where each tuple should mean "the employee works in the given department OR on the given project". But the tuple is indistinguishable from a tuple with the normal(?) interpretation "the employee works in the department on the given project". So what is missing?

There's a common misconception that heading alone is sufficient to determine meaning. There might be several relvars with heading [editted] {employee, department, project}. They do not necessarily have the same predicate. So yes <OR> of two relations has the same heading as <AND> of them. Same heading doesn't tell you anything. Addit: compare relation values 'Suppliers in London', 'Suppliers at Status 20', 'Suppliers (in London OR at Status 20)' all have the same heading, but don't mean the same thing.

How do we keep track of when indistinguishable things have different meanings and what do these different meanings imply for further relational operations?

You don't keep track merely by headings. In a program you might have two variables both type Int. How do you distinguish them? That's the same question.

If I now project on employee and department, that tells me pretty much nothing, doesn't it?

A projection of a <OR> means something different to a projection of an <AND> because the expressions you're projecting over have different meanings.

Even worse, I now have to accept that not every tuple is true, I now have to interpret the relation itself as having <or> between the tuples, so to speak.

No. Always the logical connective between tuples in the same relation is AND -- by which I don't mean the A algebra's <AND>. That is, AND between the assertions formed by substituting each tuple's attribute values into the relation's predicate.

Which actually works with the "normal" interpretation of the tuples above, so maybe that should have been the interpretation all along, i.e. at least one of these tuples is true? But again, how to distinguish that from the interpretation that all tuples are true and what does it imply for further operations?

Careful: the meaning of relation value (sets of tuples) produced from a query is not necessarily "true". The meaning is derived from the logical connectives we get from the operations in the query. If the query includes a <NOT> or a MINUS or EXCEPT or NOT MATCHING, etc, you have to be very careful to follow through the logic as to what "true" means.

Quote from tobega on January 11, 2021, 4:26 pm

Thinking a bit more about this, I think it is certainly fine to have a relation of the kind "one of these tuples is true",

No that's not a relation. Precisely because operations on the contents of the relation rapidly descend into logical contradictions.

as for example I would be perfectly comfortable with having a set of possible values for a sudoku square.

Then your predicate is "Cell at x, y has possible value v". (Or "For Cell at x, y, we can't yet eliminate v as a value.") What you can't do (under the RA) is interpret the set of tuples to mean exactly one of the tuples for cell x, y holds the correct v, in which case the others don't.

That in turn tells us that there are at least two types of relations with respect to logic, "all of" and "one of" and it might be interesting to see how that categorization plays out under different operations.

No that's not a relation -- or at least not in the TTM sense (nor the usual RA sense). There are other interpretations -- see "Open world assumption" vs "Closed" in D&D's texts, or any good text.

I am thinking of analogies to numbers having units, where it only makes sense to add like units and multiply by unitless scalars, while multiplying by other units changes the unit.

I don't see that as a useful analogy. What you're talking about there is a type system that is units-aware.