The Forum for Discussion about The Third Manifesto and Related Matters


Clarification re local relvar and KEY

Quote from dandl on June 6, 2020, 1:27 am
Quote from Dave Voorhis on June 5, 2020, 3:45 pm
Quote from dandl on June 5, 2020, 2:33 pm
Quote from Dave Voorhis on June 5, 2020, 8:00 am
Quote from dandl on June 5, 2020, 6:03 am

Which sounds like a good general principle, but again I'm just focused on headings. When you drop the focus on 'declared type' you get something like:

  • A heading is a set of attribute names.
  • A tuple value has a heading; each (named) attribute has a value of some type.
  • A relation value has a tuple heading and a body, a set of tuples all matching the heading
  • Every relational operator has zero, one or two relation value arguments, zero or one heading arguments, and returns a relation value.
  • Each operator has its own rules for argument compatibility and returned heading inference.
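
To make those bullet points concrete, here is a minimal sketch in C# (hypothetical Heading, Tup and Rel classes of my own devising, not any library discussed in this thread). A heading is just a set of attribute names; tuple and relation values carry a heading alongside their strongly typed attribute values, and attributes are reached by name rather than by position.

using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Heading
{
    public IReadOnlyList<string> Names { get; }
    public Heading(params string[] names) => Names = names.Distinct().ToArray();
    public bool Matches(Heading other) => new HashSet<string>(Names).SetEquals(other.Names);
    public override string ToString() => "{" + string.Join(",", Names) + "}";
}

public sealed class Tup
{
    public Heading Heading { get; }
    readonly Dictionary<string, object> _values;
    public Tup(Heading heading, params object[] values)
    {
        if (values.Length != heading.Names.Count)
            throw new ArgumentException("value count must match the heading");
        Heading = heading;
        _values = heading.Names.Zip(values, (n, v) => (n, v)).ToDictionary(p => p.n, p => p.v);
    }
    public object this[string name] => _values[name];   // attribute access is by name only
}

public sealed class Rel
{
    public Heading Heading { get; }
    public IReadOnlyCollection<Tup> Body { get; }
    public Rel(Heading heading, IEnumerable<Tup> body)
    {
        var tuples = body.ToList();
        if (tuples.Any(t => !t.Heading.Matches(heading)))
            throw new ArgumentException("every tuple must match the relation heading");
        Heading = heading;
        Body = tuples;   // a full implementation would also enforce set semantics (no duplicate tuples)
    }
}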

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

(a) this is not about the syntax of one language; (b) we haven't discussed this approach before; (c) actually, they're all much the same if you do it this way.

Project takes a heading argument of attributes you want to keep; its parallel REMOVE takes a heading argument of those you want to discard.

S { S#, SNAME, STATUS }

PROJECT(S, {S#,SNAME,STATUS})

REMOVE(S, {CITY})

RENAME takes a heading argument of old names and new names, and some way of telling which is which, or two headings (old and new). Say:

a RENAME ( X1 AS Y1 , X2 AS Y2)

RENAME(a, {X1,Y1}, {X2,Y2})

EXTEND takes a heading argument of the arguments and return value of some function (or relcon if you prefer). It might look like this:

EXTEND S ADD ( 3 * STATUS AS TRIPLE ) // old syntax? (taken from p40 of DTATRM)

EXTEND(S, {STATUS,TRIPLE}, s => 3 * s)

EXTEND(S, {STATUS,TRIPLE}, triple) // triple is a unary function defined elsewhere

WHERE likewise:

S := S WHERE NOT ( CITY = 'Athens' ) ;

S := WHERE(S, {CITY}, c => c != 'Athens')

S := S.WHERE({CITY}, c => c != 'Athens') // dotted syntax stacks better

var S = S.WHERE("CITY", c => c != "Athens") // C#

Instead of having a hotch-potch of different forms and reserved words (which I for one can never remember), every operator in the RA has exactly the same form, with  relation value and/or heading and/or function args.

I'm not planning on rewriting Andl (or Rel) to use this syntax. Rather, it shows how similar this can be to a conventional programming language such as C#/Java. In this syntax a D can be implemented in any such language, using strings as headings and runtime checking. With a simple pre-processor it can do the compile-time checking and inference too.
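
As a rough illustration of what "strings as headings and runtime checking" could look like (my own sketch, building on the hypothetical Heading/Tup/Rel classes above, not the author's actual library), PROJECT and REMOVE become ordinary methods that parse and validate their heading argument at run time:

using System;
using System.Linq;

public static class RelOps
{
    // PROJECT(S, {S#,SNAME,STATUS})  ~  s.Project("S#,SNAME,STATUS")
    public static Rel Project(this Rel r, string heading)
    {
        var keep = heading.Split(',').Select(n => n.Trim()).ToArray();
        var unknown = keep.Except(r.Heading.Names).ToArray();
        if (unknown.Length > 0)                                // runtime heading check
            throw new ArgumentException("unknown attributes: " + string.Join(",", unknown));
        var newHeading = new Heading(keep);
        var body = r.Body.Select(t =>
            new Tup(newHeading, newHeading.Names.Select(n => t[n]).ToArray()));
        return new Rel(newHeading, body);                      // a real version would also remove duplicates
    }

    // REMOVE(S, {CITY})  ~  s.Remove("CITY")
    public static Rel Remove(this Rel r, string heading) =>
        r.Project(string.Join(",", r.Heading.Names.Except(
            heading.Split(',').Select(n => n.Trim()))));
}

A pre-processor could check exactly the same heading strings at compile time instead.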

I must be missing some essential point, because (a) increasingly I realise I don't understand what you're suggesting or why it's a good thing, and (b) each of your examples seems to have gone from simple to peculiar. E.g., why does the C# 'WHERE' example need a string argument of "CITY"? Etc.

Also, RENAME should support wildcards. Don't forget the wildcards.

I can only show you regular simplified syntax and how it maps to C#. I can't make you see it's a good thing.

The C# example expresses headings as strings because they're a convenient data type for the purpose. The example shows an anonymous function in lambda form acting as a relcon; the heading names the attributes. Here is a longer one.

var S = S.WHERE("CITY,STATUS", (a1,a2) => a1 != "Athens" && a2 >= 20 && a2 <= 30); // C#
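
For what it's worth, here is one plausible way (an assumption on my part, not the actual implementation) that a heading-plus-lambda WHERE like the one above could be evaluated: split the heading string, look those attributes up in each tuple, and hand just their values to the supplied function. A two-argument (a1,a2) form would simply be a per-arity overload forwarding to this one.

using System;
using System.Linq;

public static class RelRestrict
{
    public static Rel Where(this Rel r, string heading, Func<object[], bool> pred)
    {
        var names = heading.Split(',').Select(n => n.Trim()).ToArray();
        if (names.Except(r.Heading.Names).Any())               // runtime heading check
            throw new ArgumentException("heading does not match the relation");
        var body = r.Body.Where(t => pred(names.Select(n => t[n]).ToArray()));
        return new Rel(r.Heading, body);
    }
}

// e.g. var result = s.Where("CITY,STATUS",
//          v => (string)v[0] != "Athens" && (int)v[1] >= 20 && (int)v[1] <= 30);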

I still don't get it.

Can't you make it something like this, assuming S has a static tuple structure?

var result = S.WHERE(t => t.City != "Athens" && t.Status >= 20 && t.Status <= 30)

And something like this if S has a dynamic (i.e., runtime loaded) tuple structure?

var result = S.WHERE(t => t.getString("City") != "Athens" && t.getInt("Status") >= 20 && t.getInt("Status") <= 30)

No. The whole point is: this implementation relies on headings and not types. The first implementation I did (RelValue) was strongly typed and relied on declared tuple and relation types. You had to declare all the tuple types with attribute names and types, and then, provided the heading matched the declaration, it was fully type safe. However, it turns out that (a) it's a real pain to have to declare all those types and (b) you have to declare extra 'phantom' types (and headings) if you want to do things like REMOVE. Yes, with this implementation you can write something exactly like your suggestions.
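
A hypothetical illustration of the 'phantom type' problem just described (these record names are mine, not the RelValue code): with declared tuple types, every distinct heading needs its own declared type, so even a simple REMOVE forces an extra declaration that exists only to give its result a type.

using System.Collections.Generic;
using System.Linq;

public record TupS(string SNo, string SName, int Status, string City);
public record TupSNoCity(string SNo, string SName, int Status);   // "phantom" type needed only for REMOVE's result

public static class StronglyTypedOps
{
    // REMOVE(S, {CITY}) with declared tuple types: both types must be spelled out in advance
    public static IEnumerable<TupSNoCity> RemoveCity(IEnumerable<TupS> s) =>
        s.Select(t => new TupSNoCity(t.SNo, t.SName, t.Status)).Distinct();
}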

This second implementation (RelNode) is weakly typed. Every tuple value is the same type; you don't have to declare anything (for the RA). All the attribute values are strongly typed of course, but there are no individual tuple types. Heading inference is done and checked at runtime. There are no 'open expressions' in restrict, extend and aggregate; they are replaced by App-A relcons defined by headings.

Yes, I'm still working on how best to implement the anonymous functions that represent relcons. Your suggestions are not possible, even in principle. The code you see in the example below uses casts, but please note: this is defining a relcon, not an 'open expression'. The supplied function sees the heading tuple, not the raw tuple from the outer scope. It's all about the headings.

There is a near-perfect mapping from TD as shown above, based on using headings as operator arguments. This is a full D (as per the Pre requirements), but is poor at picking up errors at compile time. For that, it needs a pre-processor. The nice thing is that it would output high quality debuggable C# (or Java, or anything you like).

I don't have a good sample right now, but this example of GTC gives the flavour. Please note that this is a complete sample. It will compile and run exactly as shown here, with the RelNode library. It is type safe, but it will throw runtime errors if the headings are wrong. You can see the direct correspondence with the code in DTATRM.

var mmqi = RelNode.Import(SourceKind.Csv, ".", "MMQ", "MajorPNo:text,MinorPNo:text,Qty:number");
var mmexp = mmqi
  .Rename("Qty,ExpQty")
  .While(TupWhile.F(tw => tw
    .Rename("MinorPNo,zmatch")
    .Compose(mmqi.Rename("MajorPNo,zmatch"))
    .Extend("Qty,ExpQty,ExpQty", TupExtend.F(v => (decimal)v[0] * (decimal)v[1]))
    .Remove("Qty")));
var mmagg = mmexp
  .Aggregate("ExpQty,TotQty", TupAggregate.F((v, a) => (decimal)v + (decimal)a));

I'm afraid I still don't get it. What would you use it for?

How does it differ from the internal relational engine used in Andl or Rel or SIRA_PRISE or RAQUEL or Duro, etc.?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on June 6, 2020, 1:27 am

 

No. The whole point is: this implementation relies on headings and not types.

To quote Master Yoda once again: that is why you fail.

The "whole point" is : HEADINGS CANNOT EXIST WITHOUT TYPES.  That is HOW THEY ARE DEFINED.  As the INT X DATE X CHAR part of the specification of a cartesian product ***OF TYPES***.  Types are NEEDED because they are ***INPUT*** to the type generator.  Generators cannot do their job if the input isn't available.   Meaning the output of those generators cannot come into existence if the input isn't available.  And since nonscalar types, which are dependent on [the existence of] the Heading defining them, are the output of the generators we are talking about here, it means that headings (or the more useful construct called "noscalar types" that is associated with them on a rather 1-1 basis) cannot come into existence without the input existing a priori.  So no matter what you do or say, everything depends (or "relies", as you call it) on the existence of types.

If you think you have a different model that works, ***SHOW IT*** (either as a formalized model or else as a working implementation of it). But please be aware that handwavy superficialities like "this particular thing I do it like C# does it, and that other particular thing, I do it like Haskell does it, and this nifty little other thing, I do it like LISP does it" are not going to cut it here.

Author of SIRA_PRISE
Quote from Erwin on June 6, 2020, 9:28 pm
Quote from dandl on June 6, 2020, 1:27 am

 

No. The whole point is: this implementation relies on headings and not types.

To quote Master Yoda once again: that is why you fail.

The "whole point" is : HEADINGS CANNOT EXIST WITHOUT TYPES.  That is HOW THEY ARE DEFINED.  As the INT X DATE X CHAR part of the specification of a cartesian product ***OF TYPES***.  Types are NEEDED because they are ***INPUT*** to the type generator.  Generators cannot do their job if the input isn't available.   Meaning the output of those generators cannot come into existence if the input isn't available.  And since nonscalar types, which are dependent on [the existence of] the Heading defining them, are the output of the generators we are talking about here, it means that headings (or the more useful construct called "noscalar types" that is associated with them on a rather 1-1 basis) cannot come into existence without the input existing a priori.  So no matter what you do or say, everything depends (or "relies", as you call it) on the existence of types.

And this is where we differ. I'm presenting a model that differs from TTM in that the heading is promoted to exist as a stand-alone entity, to act as an argument to operators including the RA and selectors. In this role a heading does not need to specify types; the types are inferred from the other arguments. Please note carefully: at the point where a tuple value is returned by an operator, the heading and the attribute values and their types are all available. If your model is to generate tuple types, they are generated at this point. Ditto for relations.

My preferred model is to treat tuple and relation types more as generics than individual stand-alone types but that's a separate argument.
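
As a small sketch of the claim that the type is available at the point where a tuple value is returned (a hypothetical helper of my own, reusing the Heading/Tup sketch from earlier in the thread), the attribute types can simply be read off the values when the tuple is constructed:

using System.Linq;

public static class TupleTypes
{
    public static string InferredType(Tup t) =>
        "TUPLE {" + string.Join(", ",
            t.Heading.Names.Select(n => $"{n} {t[n]?.GetType().Name ?? "?"}")) + "}";
}

// e.g. InferredType(new Tup(new Heading("CITY", "STATUS"), "Athens", 30))
//      returns "TUPLE {CITY String, STATUS Int32}"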

If you think you have a different model that works, ***SHOW IT*** (either as a formalized model or else as a working implementation of it). But please be aware that handwavy superficialities like "this particular thing I do it like C# does it, and that other particular thing, I do it like Haskell does it, and this nifty little other thing, I do it like LISP does it" are not going to cut it here.

Yes, I have a model that works. This is my 'C# as D', which includes a full implementation of the RA based on headings, as above. It is fully compliant with TTM Pre requirements, other than multiple POSSREPs and the IM, as well as including some VSS. It evaluates RA expressions that are a line for line match with TD, plus GTC, iterated aggregation and while.

Currently it does not do some RM Pro, or heading inference at compile time; for that it would need a pre-processor and a language extension, but the generated code would look like the code shown above.

I would argue that it is a major step towards an Industrial D, right now the only kid on the block.

Andl - A New Database Language - andl.org

I'm afraid I still don't get it. What would you use it for?

How does it differ from the internal relational engine used in Andl or Rel or SIRA_PRISE or RAQUEL or Duro, etc.?

I thought the 'it' you didn't get was how it works. This is a different 'it'.

Quite obviously, the internal engine is much the same. The difference is that the D is a modern industrial strength general purpose programming language. If you want an Industrial D, this is how you get it.

It's not ready to be used yet, and I don't have anything I want to use it for. Maybe later. At present this is an investigative project but even at this stage I will say this: the first check-in on this project was 3rd May. In just over a month of part-time fiddling I have a D language that runs queries at least on a par with Andl (or Rel). The big gaps are a pre-processor (to enforce the TTM Pro requirements) and a storage engine (to implement the remainder of TTM). I have yet to decide whether to spend the time on doing those.

Andl - A New Database Language - andl.org
Quote from dandl on June 5, 2020, 6:03 am

...

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

Then this sounds something like Tropashko's approach: all operators take only relations as operands. If your operator needs to manipulate attribute names, give it an operand with the needed attribute names that is an empty relation (or an operand for which the operator will ignore the tuple content). The primitive operators include enough to give the union of two headings; their intersection; their difference; oh, and relation literals if you want an attribute name constant.
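
A rough sketch of that style (my reading of the approach, hypothetical code reusing the Heading/Tup/Rel sketch from earlier in the thread): REMOVE takes two relations and consults only the heading of the second, so an empty relation serves purely as an attribute-name carrier.

using System.Linq;

public static class RelationOnlyOps
{
    public static Rel Remove(Rel r, Rel attributeCarrier)
    {
        // only the carrier's heading matters; its body (typically empty) is ignored
        var keep = r.Heading.Names.Except(attributeCarrier.Heading.Names).ToArray();
        var newHeading = new Heading(keep);
        var body = r.Body.Select(t =>
            new Tup(newHeading, newHeading.Names.Select(n => t[n]).ToArray()));
        return new Rel(newHeading, body);
    }
}

// e.g. Remove(s, new Rel(new Heading("CITY"), Enumerable.Empty<Tup>()))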

The syntax could be simplified a lot if all the operators were put into the same consistent form

Giving expressions in 'consistent form' would be something like using only the A primitive operators. Some people like (what we might call) reduced instruction set languages; some don't. I suspect it would defeat the Tutorial purpose of Tutorial D.

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

(a) this is not about the syntax of one language; (b) we haven't discussed this approach before; (c) actually, they're all much the same if you do it this way.

Project takes a heading argument of attributes you want to keep; its parallel REMOVE takes a heading argument of those you want to discard.

S { S#, SNAME, STATUS }

PROJECT(S, {S#,SNAME,STATUS})

REMOVE(S, {CITY})

Yes I dislike the 'invisible operator' between S and juxtaposed { ... }. I particularly dislike the even more invisible operator when you discover {ALL BUT ...}.

Pointing this out, though, seems very much "about the syntax of one language". So you've failed to make your point a) above.

RENAME takes a heading argument of old names and new names, and some way of telling which is which, or two headings (old and new). Say:

a RENAME ( X1 AS Y1 , X2 AS Y2)

RENAME(a, {X1,Y1}, {X2,Y2})

EXTEND takes a heading argument of the arguments and return value of some function (or relcon if you prefer). It might look like this:

EXTEND S ADD ( 3 * STATUS AS TRIPLE ) // old syntax? (taken from p40 of DTATRM)

EXTEND(S, {STATUS,TRIPLE}, s => 3 * s)

EXTEND(S, {STATUS,TRIPLE}, triple) // triple is a unary function defined elsewhere

WHERE likewise:

S := S WHERE NOT ( CITY = 'Athens' ) ;

S := WHERE(S, {CITY}, c => c != 'Athens')

S := S.WHERE({CITY}, c => c != 'Athens') // dotted syntax stacks better

var S = S.WHERE("CITY", c => c != "Athens") // C#

Instead of having a hotch-potch of different forms and reserved words (which I for one can never remember), every operator in the RA has exactly the same form, with  relation value and/or heading and/or function args.

All you seem to be saying is you prefer C# style over Tutorial D style. Since I'm not familiar with C# (I can see it doesn't look like Haskell), I'm pretty meh on the whole topic. De gustibus non disputandum. I'm still only seeing bikeshedding about syntax.

I'm not planning on rewriting Andl (or Rel) to use this syntax. Rather, it shows how similar this can be to a conventional programming language such as C#/Java. In this syntax a D can be implemented in any such language, using strings as headings and runtime checking. With a simple pre-processor it can do the compile-time checking and inference too.

 

If you're trying to say something about how TTM typing/semantics fits into an existing programming language like C#/Java, I'd expect this thread to be talking about types/semantics. I see only syntax. Elsewhere in the thread you've given examples in C#; I'm not going to learn a language just in case you're saying something radical/different in it (highly unlikely, on your record so far). So you've failed to demonstrate your point b) above.

Since you don't give the semantics that makes anything "all much the same", you've failed to demonstrate your point c).

Quote from dandl on June 7, 2020, 3:31 am

I'm afraid I still don't get it. What would you use it for?

How does it differ from the internal relational engine used in Andl or Rel or SIRA_PRISE or RAQUEL or Duro, etc.?

I thought the 'it' you didn't get was how it works. This is a different 'it'.

I wrote unclearly. I should have written, "I'm afraid I still don't get it, and what would you use it for, anyway?"

I presume the "heading" manipulations are kludge to get around having to define static tuple types using classes?

Quite obviously, the internal engine is much the same. The difference is that the D is a modern industrial strength general purpose programming language. If you want an Industrial D, this is how you get it.

In Rel -- and I assume SIRA_PRISE, RAQUEL, Duro, and probably others, and I'll get to Andl below -- exactly what you've done already exists. In some cases, it has existed for decades. Every one of these D languages is implemented using a popular general-purpose language. In each, the general-purpose language has been used to construct an implementation of the relational model notionally akin to yours.

As such, you're not the first to make an Industrial D, you're simply the latest to make a relational model implementation in a general-purpose language. Unsurprisingly, it exhibits typical quirks imposed by limitations of the host language, like certain sanity/type checking that can only occur at runtime, or code that is awkward and requires careful (and/or tedious) manual construction or a separate code generator.

It's not ready to be used yet, and I don't have anything I want to use it for. Maybe later. At present this is an investigative project but even at this stage I will say this: the first check-in on this project was 3rd May. In just over a month of part-time fiddling I have a D language that runs queries at least on a par with Andl (or Rel). The big gaps are a pre-processor (to enforce the TTM Pro requirements) and a storage engine (to implement the remainder of TTM).

Another thing that isn't clear to me is why you don't have a core relational model implementation already as a result of writing Andl.

Don't you?

Or is Andl essentially an Andl-to-SQL transpiler, in which case there wouldn't necessarily be an implementation of the relational model per se?

If so, then I can see why this is new territory for you. Welcome to the club of those who have implemented the relational model in a general purpose programming language.

You're not the first to do this in a general-purpose language; some of us have been doing this for decades. We may not claim the combination of a general-purpose programming language plus our core relational libraries are Industrial Ds (because they're not) but they belong to exactly the same family of implementation that you're working on now. I bet every one of them could be used stand-alone (Rel's core certainly can), but -- as you've discovered -- you wouldn't really want to, at least not without a pre-processor and a lot of coding care.

It looks like the only difference between what you've done and what we've done (aside from us having storage engines) is that we've already written the "pre-processor" you mention above, and our "pre-processors" take input in the form of SIRA_PRISE, Rel's dialect of Tutorial D, RAQUEL, Duro's dialect of Tutorial D, and so on, and are sophisticated enough to not have to use the host language. But then some do expose and use the host language, like Dan Muller's CsiDB (C++), DEE (Python), and TCLRAL (TCL).

Thus, what will make your implementation interesting is not the core library -- which seems quite typically abominable, but no more or less so than implemented-in-popular-general-purpose-language relational model implementations ever are.

What will make your implementation interesting is the input to your code generator, and how clean and elegant the resulting C# can be when you use it.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from AntC on June 7, 2020, 7:11 am
Quote from dandl on June 5, 2020, 6:03 am

...

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

Then this sounds something like Tropashko's approach: all operators take only relations as operands. If your operator needs to manipulate attribute names, give it an operand with the needed attribute names that is an empty relation (or an operand for which the operator will ignore the tuple content). The primitive operators include enough to give the union of two headings; their intersection; their difference; oh, and relation literals if you want an attribute name constant.

You can do it that way, but it seems wasteful to call up a relation when all you need is the heading. You can do REMOVE, but you'll need a relation that serves no purpose other than to provide a set of attribute names to leave out. Seems kind of silly. For me, it's enough just to call up the heading without the relation behind it.

The syntax could be simplified a lot if all the operators were put into the same consistent form

Giving expressions in 'consistent form' would be something like using only the A primitive operators. Some people like (what we might call) reduced instruction set languages; some don't. I suspect it would defeat the Tutorial purpose of Tutorial D.

No, that's what you get if the aim is a minimal operator set. My aim is to include redundant operators and even shorthands, but present them all in the same form. TD is a pea soup of syntactical forms; it offends me.

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

(a) this is not about the syntax of one language; (b) we haven't discussed this approach before; (c) actually, they're all much the same if you do it this way.

Project takes a heading argument of attributes you want to keep; its parallel REMOVE takes a heading argument of those you want to discard.

S { S#, SNAME, STATUS }

PROJECT(S, {S#,SNAME,STATUS})

REMOVE(S, {CITY})

Yes I dislike the 'invisible operator' between S and juxtaposed { ... }. I particularly dislike the even more invisible operator when you discover {ALL BUT ...}.

Pointing this out, though, seems very much "about the syntax of one language". So you've failed to make your point a) above.

RENAME takes a heading argument of old names and new names, and some way of telling which is which, or two headings (old and new). Say:

a RENAME ( X1 AS Y1 , X2 AS Y2)

RENAME(a, {X1,Y1}, {X2,Y2})

EXTEND takes a heading argument of the arguments and return value of some function (or relcon if you prefer). It might look like this:

EXTEND S ADD ( 3 * STATUS AS TRIPLE ) // old syntax? (taken from p40 of DTATRM)

EXTEND(S, {STATUS,TRIPLE}, s => 3 * s)

EXTEND(S, {STATUS,TRIPLE}, triple) // triple is a unary function defined elsewhere

WHERE likewise:

S := S WHERE NOT ( CITY = 'Athens' ) ;

S := WHERE(S, {CITY}, c => c != 'Athens')

S := S.WHERE({CITY}, c => c != 'Athens') // dotted syntax stacks better

var S = S.WHERE("CITY", c => c != "Athens") // C#

Instead of having a hotch-potch of different forms and reserved words (which I for one can never remember), every operator in the RA has exactly the same form, with  relation value and/or heading and/or function args.

All you seem to be saying is you prefer C# style over Tutorial D style. Since I'm not familiar with C# (I can see it doesn't look like Haskell), I'm pretty meh on the whole topic. De gustibus non disputandum. I'm still only seeing bikeshedding about syntax.

No, everything I presented down to here is written in a language-neutral style: function with parenthesised arguments. At this point I showed how to transform it into a 'pipeline' or 'fluent' interface style, which then flows into C# (or Java) but I could equally well have presented it in a Haskell style. I just didn't think of it. The idea is the same across all styles.

I'm not planning on rewriting Andl (or Rel) to use this syntax. Rather, it shows how similar this can be to a conventional programming language such as C#/Java. In this syntax a D can be implemented in any such language, using strings as headings and runtime checking. With a simple pre-processor it can do the compile-time checking and inference too.

 

If you're trying to say something about how TTM typing/semantics fits into an existing programming language like C#/Java, I'd expect this thread to be talking about types/semantics. I see only syntax. Elsewhere in the thread you've given examples in C#; I'm not going to learn a language just in case you're saying something radical/different in it (highly unlikely, on your record so far). So you've failed to demonstrate your point b) above.

Since you don't give the semantics that makes anything "all much the same", you've failed to demonstrate your point c).

I think I did all that. Scalar and non-scalar types are easily implemented as value types. Non-scalar values have a heading, and there are rules of heading inference and compatibility that are ideally enforced at compile time, but in any case are required at runtime. RA operators need to be able to access attribute values by heading and construct new values. What else is there to know?
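
To pin down what "rules of heading inference and compatibility" means in at least one case (a hypothetical sketch of my own, reusing the earlier Heading/Tup classes, and not a complete rule set): for JOIN the result heading is the union of the operand headings, and same-named attributes must carry values of the same type.

using System;
using System.Linq;

public static class HeadingInference
{
    public static Heading JoinHeading(Heading left, Heading right) =>
        new Heading(left.Names.Union(right.Names).ToArray());

    public static void CheckJoinCompatible(Tup l, Tup r)
    {
        foreach (var name in l.Heading.Names.Intersect(r.Heading.Names))
            if (l[name]?.GetType() != r[name]?.GetType())
                throw new InvalidOperationException($"attribute {name} has mismatched types");
    }
}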

 

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on June 7, 2020, 9:18 am
Quote from dandl on June 7, 2020, 3:31 am

I'm afraid I still don't get it. What would you use it for?

How does it differ from the internal relational engine used in Andl or Rel or SIRA_PRISE or RAQUEL or Duro, etc.?

I thought the 'it' you didn't get was how it works. This is a different 'it'.

I wrote unclearly. I should have written, "I'm afraid I still don't get it, and what would you use it for, anyway?"

I presume the "heading" manipulations are kludge to get around having to define static tuple types using classes?

No, not a kludge, a unifying principle across operators that need types and those that don't. What is the type for the attribute(s) removed by projection? Why should that be a type? I say no, it's a heading.

Quite obviously, the internal engine is much the same. The difference is that the D is a modern industrial strength general purpose programming language. If you want an Industrial D, this is how you get it.

In Rel -- and I assume SIRA_PRISE, RAQUEL, Duro, and probably others, and I'll get to Andl below -- exactly what you've done already exists. In some cases, it has existed for decades. Every one of these D languages is implemented using a popular general-purpose language. In each, the general-purpose language has been used to construct an implementation of the relational model notionally akin to yours.

There is a world of difference between (a) a novel language that targets a virtual machine, (b) a language extension that generates code in the base language, and (c) a language library. Andl is in category (a); my 'C# as D' project is in category (c). Andl is 'implemented using' C#; this project is C#. I know you know all this, so I see no point in trying to explain further. Why do you say they are the same?

As such, you're not the first to make an Industrial D, you're simply the latest to make a relational model implementation in a general-purpose language. Unsurprisingly, it exhibits typical quirks imposed by limitations of the host language, like certain sanity/type checking that can only occur at runtime, or code that is awkward and requires careful (and/or tedious) manual construction or a separate code generator.

No, there is no Industrial D because none of the candidates are good enough as languages. C++, Java, C# and a few other strongly typed languages have proved their worth as languages and have the robust supporting infrastructure. An Industrial D has to be at least as good as any of those. Do you claim that Rel is or ever could qualify?

It's not ready to be used yet, and I don't have anything I want to use it for. Maybe later. At present this is an investigative project but even at this stage I will say this: the first check-in on this project was 3rd May. In just over a month of part-time fiddling I have a D language that runs queries at least on a par with Andl (or Rel). The big gaps are a pre-processor (to enforce the TTM Pro requirements) and a storage engine (to implement the remainder of TTM).

Another thing that isn't clear to me is why you don't have a core relational model implementation already as a result of writing Andl.

I told you: I do, but it rests on a custom-built type system. The whole point of this exercise is to show that this is not necessary; the type system of a regular GP language is quite good enough.

Don't you?

Or is Andl essentially an Andl-to-SQL transpiler, in which case there wouldn't necessarily be an implementation of the relational model per se?

You know that is not so. You inspected the source code of an early release of Andl, and from the specific critical comments you made, you knew the answer then even if you've forgotten it now.

If so, then I can see why this is new territory for you. Welcome to the club of those who have implemented the relational model in a general purpose programming language.

You're not the first to do this in a general-purpose language; some of us have been doing this for decades. We may not claim the combination of a general-purpose programming language plus our core relational libraries are Industrial Ds (because they're not) but they belong to exactly the same family of implementation that you're working on now. I bet every one of them could be used stand-alone (Rel's core certainly can), but -- as you've discovered -- you wouldn't really want to, at least not without a pre-processor and a lot of coding care.

It looks like the only difference between what you've done and what we've done (aside from us having storage engines) is that we've already written the "pre-processor" you mention above, and our "pre-processors" take input in the form of SIRA_PRISE, Rel's dialect of Tutorial D, RAQUEL, Duro's dialect of Tutorial D, and so on, and are sophisticated enough to not have to use the host language. But then some do expose and use the host language, like Dan Muller's CsiDB (C++), DEE (Python), and TCLRAL (TCL).

Thus, what will make your implementation interesting is not the core library -- which seems quite typically abominable, but no more or less so than implemented-in-popular-general-purpose-language relational model implementations ever are.

What will make your implementation interesting is the input to your code generator, and how clean and elegant the resulting C# can be when you use it.

You've already seen the generated C#, near enough. It's plain, simple, readable, debuggable C# code, exactly as it was written. Full access to all the libraries, language features, tools and documentation exactly as per any other C# program. You may not choose to write programs in this language, but thousands (millions?) of others do. How many choose to write programs in any of the other languages you mentioned? How many native libraries, native debuggers, native tools and native documentation sources are there for any of those languages?

The only language extension is: headings. They work fine as strings,  but they would work better if they were written differently and compiled into strings. Perhaps they could help with I/O and function calls, but that's about the extent of it. The rest of the language is fine just as is.

Andl - A New Database Language - andl.org
Quote from dandl on June 7, 2020, 11:31 am
Quote from Dave Voorhis on June 7, 2020, 9:18 am
Quote from dandl on June 7, 2020, 3:31 am

 

 

I wrote unclearly. I should have written, "I'm afraid I still don't get it, and what would you use it for, anyway?"

I presume the "heading" manipulations are kludge to get around having to define static tuple types using classes?

No, not a kludge, a unifying principle across operators that need types and those that don't. What is the type for the attribute(s) removed by projection? Why should that be a type? I say no, it's a heading.

To answer your objection to using Tropashko-style operators, and to teach you something about type inference and polymorphism:

The type for attributes removed by projection is the type they're at in the relation they're getting removed from. Just as the type for attributes projected-in is taken from the source relation. Then what you're waffling about being a "heading" as opposed to a relation could be merely a relation with attribute names at polymorphic types.

  • In S REMOVE REL{CITY a}{}, in which REMOVE is a relational operator that takes two relations as operands and returns the first with the attributes of the second projected away; a is a polymorphic attribute type that gets unified with whatever type CITY is at in S (CHAR in this case).
  • In S ON REL{S# a, SNAME b, STATUS c}{}, in which ON is a relational operator that takes two relations as operands and returns the first projected on the attributes of the second; a, b, c are polymorphic attribute types that get unified with whatever types S#, SNAME, STATUS are at in S.
  • Writing out a relation literal like that merely to give some attribute names is clunky; I'd expect there to be a shorthand for that; the shorthand might even look like Tutorial D's yeuch.
  • The semantics, though, is that everything is a relation. We don't need some different gizmo with so-far incomprehensible typing or semantics.
  • More realistically, the r.h. operand for those operators would be a variable (relvar or WITH ... definition), could be passed as an argument, returned from a function, etc, etc.

If you don't understand my mention of type unification: it's what the programming language would need anyway to type-check that in a JOIN same-named attributes are at the same type in the operands.

...

I told you: I do, but it rests on a custom-built type system. The whole point of this exercise is to show that this is not necessary; the type system of a regular GP language is quite good enough.

You have utterly failed to demonstrate that. If your "headings" are not types, then you can't be using the GP language's type system/type inference to infer types of (say) the result from a JOIN. Then you can't be using the type system to statically type-check the result of a JOIN is correct for assigning the result to some pre-declared relvar. Then whatever you're doing can't be industrial strength.

 

The only language extension is: headings. They work fine as strings,  but they would work better if they were written differently and compiled into strings. Perhaps they could help with I/O and function calls, but that's about the extent of it. The rest of the language is fine just as is.

If anything works fine (we've only your word for it), then as strings they are not reaching type inference. This is the classic problem with passing dynamic SQL as strings from a program to the SQL engine: no type-checking of the request; not even syntax-checking until run-time; no type-checking of the result against the types expected by the calling program. The result is production applications failing in production with ghastly, incomprehensible errors in front of the users.

That's why industrial-strength SQL applications use stored procedures as far as possible, statically compiled and type-checked against the schema. So does your approach support a stored-procedure mechanism?

Quote from dandl on June 7, 2020, 11:31 am
Quote from Dave Voorhis on June 7, 2020, 9:18 am
Quote from dandl on June 7, 2020, 3:31 am

I'm afraid I still don't get it. What would you use it for?

How does it differ from the internal relational engine used in Andl or Rel or SIRA_PRISE or RAQUEL or Duro, etc.?

I thought the 'it' you didn't get was how it works. This is a different 'it'.

I wrote unclearly. I should have written, "I'm afraid I still don't get it, and what would you use it for, anyway?"

I presume the "heading" manipulations are kludge to get around having to define static tuple types using classes?

No, not a kludge, a unifying principle across operators that need types and those that don't. What is the type for the attribute(s) removed by projection? Why should that be a type? I say no, it's a heading.

It's an attribute name list, not a heading. A heading is a set of name/type pairs. Projection (or its REMOVE inverse) only requires an attribute name list.

You'd pass a different construct to RENAME: a set of renaming pairs.

You'd pass a different construct to WHERE: a lambda expression of type boolean.

And so on. There may be good reasons to have a first-class Heading -- to declare it once and define multiple tuples and/or relations from the same Heading, for example -- but what you're passing to RENAME, WHERE, Projection, REMOVE, etc., isn't a Heading.
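
A sketch of the distinction being drawn here (hypothetical signatures of my own, not Rel's API, reusing the Rel and Tup sketch from earlier in the thread): each operator takes the construct it actually needs, and only a full heading would pair names with types.

using System;
using System.Collections.Generic;

public interface IRelationOps
{
    Rel Project(params string[] attributeNames);                   // an attribute name list
    Rel Rename(IDictionary<string, string> oldToNewPairs);         // renaming pairs (wildcards aside)
    Rel Where(Func<Tup, bool> predicate);                          // a boolean lambda over the tuple
    Rel Extend(string newAttribute, Func<Tup, object> expression); // a name / expression pair
}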

Quite obviously, the internal engine is much the same. The difference is that the D is a modern industrial strength general purpose programming language. If you want an Industrial D, this is how you get it.

In Rel -- and I assume SIRA_PRISE, RAQUEL, Duro, and probably others, and I'll get to Andl below -- exactly what you've done already exists. In some cases, it has existed for decades. Every one of these D languages is implemented using a popular general-purpose language. In each, the general-purpose language has been used to construct an implementation of the relational model notionally akin to yours.

There is a world of difference between (a) a novel language that targets a virtual machine, (b) a language extension that generates code in the base language, and (c) a language library.

Why?

I think a relational model library should be universal, not application-specific. It should be the same whether it's used in novel language, or a language extension, or stand-alone.

In other words, it's no different from creating, say, an encryption library or a matrix math library -- you'd use the same encryption library or matrix math library no matter what the application. It's not like matrix multiplication or encryption somehow differs in a language implementation vs, say, a video game.

Likewise, it's the same relational model whether it's a novel language, a language extension, or a stand-alone library.

Andl is in category (a); my 'C# as D' project is in category (c). Andl is 'implemented using' C#; this project is C#. I know you know all this, so I see no point in trying to explain further. Why do you say they are the same?

I'm not sure why they wouldn't be the same.

As such, you're not the first to make an Industrial D, you're simply the latest to make a relational model implementation in a general-purpose language. Unsurprisingly, it exhibits typical quirks imposed by limitations of the host language, like certain sanity/type checking that can only occur at runtime, or code that is awkward and requires careful (and/or tedious) manual construction or a separate code generator.

No, there is no Industrial D because none of the candidates are good enough as languages. C++, Java, C# and a few other strongly typed languages have proved their worth as languages and have the robust supporting infrastructure. An Industrial D has to be at least as good as any of those. Do you claim that Rel is or ever could qualify?

Absolutely. I mean, it is used for production purposes so in that sense it is an Industrial D. It's certainly more usable and robust than some in-house production systems I've used, and arguably a lot better than some historical horrors that were sold commercially.

But that's not the point. This really depends on your definition of Industrial D, which to my recollection hasn't been formally defined other than to suggest features it should include, like exception handling, authentication and connection management, any of which -- along with other desired features -- could easily be added to any implementation. You appear to be defining Industrial D around the popularity and richness of a particular syntax and ancillary toolset, which is not really about the language at all but the ecosystem in which the language resides.

It's not ready to be used yet, and I don't have anything I want to use it for. Maybe later. At present this is an investigative project but even at this stage I will say this: the first check-in on this project was 3rd May. In just over a month of part-time fiddling I have a D language that runs queries at least on a par with Andl (or Rel). The big gaps are a pre-processor (to enforce the TTM Pro requirements) and a storage engine (to implement the remainder of TTM).

Another thing that isn't clear to me is why you don't have a core relational model implementation already as a result of writing Andl.

I told you: I do, but it rests on a custom-built type system. The whole point of this exercise is to show that this is not necessary; the type system of a regular GP language is quite good enough.

Can't you just unplug the custom type system?

Don't you?

Or is Andl essentially an Andl-to-SQL transpiler, in which case there wouldn't necessarily be an implementation of the relational model per se?

You know that is not so. You inspected the source code of an early release of Andl, and from the specific critical comments you made, you knew the answer then even if you've forgotten it now.

Really?

Now I'm even more baffled.

If so, then I can see why this is new territory for you. Welcome to the club of those who have implemented the relational model in a general purpose programming language.

You're not the first to do this in a general-purpose language; some of us have been doing this for decades. We may not claim the combination of a general-purpose programming language plus our core relational libraries are Industrial Ds (because they're not) but they belong to exactly the same family of implementation that you're working on now. I bet every one of them could be used stand-alone (Rel's core certainly can), but -- as you've discovered -- you wouldn't really want to, at least not without a pre-processor and a lot of coding care.

It looks like the only difference between what you've done and what we've done (aside from us having storage engines) is that we've already written the "pre-processor" you mention above, and our "pre-processors" take input in the form of SIRA_PRISE, Rel's dialect of Tutorial D, RAQUEL, Duro's dialect of Tutorial D, and so on, and are sophisticated enough to not have to use the host language. But then some do expose and use the host language, like Dan Muller's CsiDB (C++), DEE (Python), and TCLRAL (TCL).

Thus, what will make your implementation interesting is not the core library -- which seems quite typically abominable, but no more or less so than implemented-in-popular-general-purpose-language relational model implementations ever are.

What will make your implementation interesting is the input to your code generator, and how clean and elegant the resulting C# can be when you use it.

You've already seen the generated C#, near enough. It's plain, simple, readable, debuggable C# code, exactly as it was written. ...

You've got things like this:

S.WHERE("CITY,STATUS", (a1,a2) => a1 != "Athens" && a2 >= 20 && a2 <= 30);

Apparently, "CITY,STATUS" is some dynamic construct that has to (presumably) match some elements of S (I guess...) That's not readable, nor robust, nor easily debuggable, nor compile-time checkable.

This would be fine, because the C# compiler would presumably validate it:

S.WHERE((city, status) => city != "Athens" && status >= 20 && status <= 30);

Or this:

S.WHERE(tuple => tuple.city != "Athens" && tuple.status >= 20 && tuple.status <= 30);

But this...

S.WHERE("CITY,STATUS", (a1,a2) => a1 != "Athens" && a2 >= 20 && a2 <= 30);

...and this...

Extend("Qty,ExpQty,ExpQty", TupExtend.F(v => (decimal)v[0] * (decimal)v[1]))

...with its mysterious array references to 0 and 1 (which I presume would break at runtime if I put in the wrong numbers?) just highlights the usual limitations of popular general-purpose programming languages when implementing this kind of relational model.
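
For contrast, here is a hypothetical statically typed version of the same restriction (my sketch of what the earlier "static tuple structure" suggestion could look like, not Rel or RelNode code): the tuple shape is a declared type, so the attribute names and types used in the lambda are checked by the C# compiler.

using System.Collections.Generic;
using System.Linq;

public record STuple(string SNo, string SName, int Status, string City);

public static class StaticWhereExample
{
    public static IEnumerable<STuple> Restrict(IEnumerable<STuple> s) =>
        s.Where(t => t.City != "Athens" && t.Status >= 20 && t.Status <= 30);
}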

No matter how much it might in some fashion adhere to the TTM pre/pro-scriptions, I don't think this is what a D was meant to be.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org