Life after D
Quote from dandl on March 22, 2021, 12:47 am
Quote from Dave Voorhis on March 21, 2021, 6:04 pm
Quote from dandl on March 19, 2021, 1:24 am
Quote from Dave Voorhis on March 18, 2021, 2:39 pm
Quote from dandl on March 18, 2021, 5:52 am
Again, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.

But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
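For concreteness, here is a minimal sketch of annotations-as-metadata in Java (the @Column annotation and Row class are invented for illustration). The annotation itself is inert; separate code gives it meaning by reading it reflectively at runtime, which is exactly why it makes such an awkward meta-language.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

// Invented metadata annotation: maps a field to a database column name.
@Retention(RetentionPolicy.RUNTIME)
@interface Column {
    String name();
}

class Row {
    @Column(name = "CUST_ID")
    int custId;
}

public class AnnotationDemo {
    public static void main(String[] args) {
        // The annotation does nothing by itself; this reflective loop
        // is the code that interacts with the metadata at runtime.
        for (Field f : Row.class.getDeclaredFields()) {
            Column c = f.getAnnotation(Column.class);
            if (c != null) {
                System.out.println(f.getName() + " -> column " + c.name());
            }
        }
    }
}
```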
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
The step beyond that is template metaprogramming, which has its own issues as we know from its use in C++.
C++ has templates for a very specific purpose: STL.
C++ templates predated it by a couple of years. Templates appeared in 1990 and STL in 1992, though to be fair it may be a cross-fertilised history, as Stepanov had apparently been working on notions of generic programming since 1979.
Stroustrup wrote on the philosophy of why templates were added to the language, in terms of the needs that led to STL. Templates came first, so that STL or something like it would be possible.
They allow an algorithm to be specialised across types, but they fall well short of what I had in mind.
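In Java terms rather than C++, a minimal illustration of "specialising an algorithm across types" follows: one max algorithm reused for any ordered type. (Java generics only type-check a single erased version, whereas C++ templates stamp out specialised code per instantiation, but the surface idea is the same.)

```java
import java.util.List;

public class GenericMax {
    // One algorithm, reusable for any type that supplies its own ordering.
    static <T extends Comparable<T>> T max(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(List.of(3, 1, 4)));       // 4
        System.out.println(max(List.of("a", "c", "b"))); // c
    }
}
```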
And that's essentially a first step to full transpilation, which is needed if you're not just wrapping semantics but shifting to new semantics -- particularly if some of the new semantics are incompatible with the host language semantics. When much of the semantics are different, full compilation may be easier.
It's probably best to start with the ideal language, then work backward from it to a host language to see whether libraries, language-specific facilities like annotations, templates, transpilation, or compilation are most appropriate.
I rewrote the Andl compiler using Pegasus, a PEG parser that emits C# code. It's nice to be able to debug directly into the underlying C#, but the tech required to create a compiler like that is way too hard for most. New languages are tough, and need to be created by degrees, experimentally. Transpilers are tough to write and hard to maintain/modify. We need a better way.
But why not take the core idea of macro expansion and use it to front end Java code? Or has someone already done that?
Simple macro expansion is deprecated -- for reasons we've both mentioned elsewhere (see https://forum.thethirdmanifesto.com/forum/topic/life-after-d-m-for-metaprogramming/) -- but Java code generation is somewhat accepted, to the point that it's an explicit optional build step in the most popular build systems, Maven and Gradle. But it's considered best to avoid it, as it results in brittle, unmaintainable code and it forces two maintenance targets on every project: the code input to the generator, and the code generator itself. (Or three maintenance targets, if you count the rest of the application code separately.)
For example, there is Jakarta XML Binding -- aka JAXB -- that generates Java code to automate reading XML files as class instances. It's largely deprecated now.
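For illustration, JAXB usage at its simplest looked roughly like this (the Customer class and customer.xml file are invented; in real-world use, such classes were typically generated from an XML schema, which is the code-generation step in question):

```java
import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.annotation.XmlRootElement;
import java.io.File;

// Maps <customer><name>...</name></customer> onto a Java object.
@XmlRootElement
class Customer {
    public String name;
}

public class JaxbDemo {
    public static void main(String[] args) throws Exception {
        // Unmarshal the XML file directly into a class instance.
        Customer c = (Customer) JAXBContext.newInstance(Customer.class)
                .createUnmarshaller()
                .unmarshal(new File("customer.xml"));
        System.out.println(c.name);
    }
}
```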
I'm using a bit of code generation in my work-in-progress Wrapd "SQL amplifier" library that makes it easier to integrate SQL DBMS querying into Java in general and Java Streams in particular.
With both JAXB and Wrapd, the goal is to make it easier to use Java with existing external data format definitions. The goal is not to replace Java programming with writing some simplified non-Java.
I agree. Code generation is OK if it's a complete solution (like Pegasus or cfront) but code generation front-ending a compiler is truly bad. It means the macros cannot have even the power of C++ templates, because they cannot see the types as the compiler does, so they cannot do type-sensitive expansion.
Indeed, I think needing pure macro expansion to replace/generate Java either indicates a serious limitation of Java (which, in terms of implementing TTM type-system semantics, it has), and/or it indicates some serious limitation of the developer for whom it is intended. In either case, I'm not sure a macro system is the solution.
I disagree. Generics/templates solve the problem of specialising algorithms across types, which the native language cannot do. I want to expand code, not just specialise, and that needs something of quite a different quality from a mere language feature.
Just to be clear: adding a TTM-like RA to Java should be an acid test of this approach. Java is well able to express the necessary logic, but at the cost of much boilerplate and runtime type checking. A good enough meta-programming front-end should fix both. But right now I don't know how to do that.
It's notionally straightforward to generate code, but I can't see how it solves the usual structural vs nominal typing issues for tuple and relation types.
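To make the structural-vs-nominal point concrete, a small sketch (the record types are invented): in TTM, two tuple types with the same heading are the same type; in Java, type identity hangs on the name, so structurally identical types are unrelated.

```java
// Two "tuple types" with the identical heading {s, p}.
record Shipment(String s, String p) {}
record Assignment(String s, String p) {}

public class NominalDemo {
    public static void main(String[] args) {
        Shipment sh = new Shipment("S1", "P1");
        // Assignment a = sh;  // won't compile: same structure, different name.
        // The only way across is an explicit, attribute-by-attribute conversion:
        Assignment a = new Assignment(sh.s(), sh.p());
        System.out.println(a);
    }
}
```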
That's not how to see the problem. Instead, follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voilà: M! (See the sketch after this list.)
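Purely as a sketch of the intended shape (all syntax and names invented, since M doesn't yet exist): a shorthand declaration and one plausible longhand expansion that a textual rewrite rule could emit.

```java
// Hypothetical M shorthand:
//
//   rel S(sno: String, city: String)
//
// One plausible longhand expansion in plain Java:
import java.util.Objects;

final class S {
    static final String[] HEADING = {"sno", "city"};

    final String sno;
    final String city;

    S(String sno, String city) {
        this.sno = sno;
        this.city = city;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof S)) return false;
        S other = (S) o;
        return Objects.equals(sno, other.sno) && Objects.equals(city, other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(sno, city);
    }
}
```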
For that, I think a transpiler or even a compiler may be more appropriate. Of course, that brings up all the usual issues of wanting/needing to create a hefty toolchain including debugger support, IDE plugins, etc., to make it of any production use.
The toolchain concern applies. The critical parts are:
- editor and debugger: the user needs to be able to see the expanded text as and when needed
- debugger should see both levels
- expansion error messages should be highly informative.
As an aside, a few years ago I wrote (for use as examples in a class I was teaching) a to-Java transpiler for a simple JavaScript-like language, with semantics not found in Java (dynamically typed with nestable functions) to prevent it simply being syntax translation. The effort was rather non-trivial, at least for a small, over-several-evenings project. Transpiling to languages of notionally-equivalent "level" -- 3GL to 3GL -- but different semantics is hard, fraught with gotchas, and is somewhat toilsome. Compiling down a "level" -- 3GL to 1 or 2GL -- is arguably easier. Aside over.
Perhaps a more reasonable alternative, at least in terms of the TTM type-system semantics, is to dispense with the TTM type-system and work within the semantic limitations of Java (or whatever host language.) That's essentially what I've done / am doing with my Wrapd library.
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
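A minimal reconstruction of that representation, as I read the description (names invented): the heading is carried as data, and each tuple is a bare array of values indexed through the heading.

```java
import java.util.Arrays;

// Heading as data; tuple as an array of values rather than a record.
class Tup {
    final String[] heading;  // attribute names, e.g. {"sno", "city"}
    final Object[] values;   // one value per attribute, same order

    Tup(String[] heading, Object[] values) {
        this.heading = heading;
        this.values = values;
    }

    Object get(String attr) {
        int i = Arrays.asList(heading).indexOf(attr);
        if (i < 0) throw new IllegalArgumentException("no attribute " + attr);
        return values[i];
    }
}
```

The verbosity shows up in client code: every attribute access is a string lookup plus a cast, which is the boilerplate M is meant to eliminate.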
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose, and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher, by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
What is missing is
Quote from Dave Voorhis on March 22, 2021, 11:40 am
Quote from dandl on March 22, 2021, 12:47 am
Quote from Dave Voorhis on March 21, 2021, 6:04 pm
Quote from dandl on March 19, 2021, 1:24 am
Quote from Dave Voorhis on March 18, 2021, 2:39 pm
Quote from dandl on March 18, 2021, 5:52 am
Again, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.

But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
Ah! But how do you choose which one to run first, in a project with multiple source files?
Or is it not a separate-compilation -> single-linked-executable sort of thing?
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Abuse of Java annotations to create a macro-like facility is just as bad as macros.
What it should be is an in-language feature. Java records are a good example of something that started out as boilerplate, got "fixed" with Lombok using annotations, and got fixed properly with Java 14+'s record construct.
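Concretely, the progression described above (Point is an invented example name):

```java
// Stage 1: the classic boilerplate (what Lombok's annotations generated for you).
final class PointClass {
    private final int x, y;
    PointClass(int x, int y) { this.x = x; this.y = y; }
    int x() { return x; }
    int y() { return y; }
    // ...plus hand-written equals(), hashCode(), toString()
}

// Stage 2: the proper in-language fix (preview in Java 14, final in Java 16).
record Point(int x, int y) {}
```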
Annotations are still being (ab)used heavily in (for example) Java Spring -- which is a Marmite-like love-it-or-hate-it thing -- but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Macros can write native code, can't easily read native code, and native code can't see macros at all. That's an essential problem with macros (and not the only one).
Except in Lisp, where they are part of the language and can write, read, and be manipulated by (any, including other macro) code.
The step beyond that is template metaprogramming, which has its own issues as we know from its use in C++.
C++ has templates for a very specific purpose: STL.
C++ templates predated it by a couple of years. Templates appeared in 1990 and STL in 1992, though to be fair it may be a cross-fertilised history, as Stepanov had apparently been working on notions of generic programming since 1979.
Stroustrup wrote on the philosophy of why templates were added to the language, in terms of the needs that led to STL. Templates came first, so that STL or something like it would be possible.
They allow an algorithm to be specialised across types, but they fall well short of what I had in mind.
And that's essentially a first step to full transpilation, which is needed if you're not just wrapping semantics but shifting to new semantics -- particularly if some of the new semantics are incompatible with the host language semantics. When much of the semantics are different, full compilation may be easier.
It's probably best to start with the ideal language, then work backward from it to a host language to see whether libraries, language-specific facilities like annotations, templates, transpilation, or compilation are most appropriate.
I rewrote the Andl compiler using Pegasus, a PEG parser that emits C# code. It's nice to be able to debug directly into the underlying C#, but the tech required to create a compiler like that is way too hard for most. New languages are tough, and need to be created by degrees, experimentally. Transpilers are tough to write and hard to maintain/modify. We need a better way.
But why not take the core idea of macro expansion and use it to front end Java code? Or has someone already done that?
Simple macro expansion is deprecated -- for reasons we've both mentioned elsewhere (see https://forum.thethirdmanifesto.com/forum/topic/life-after-d-m-for-metaprogramming/) -- but Java code generation is somewhat accepted, to the point that it's an explicit optional build step in the most popular build systems, Maven and Gradle. But it's considered best to avoid it, as it results in brittle, unmaintainable code and it forces two maintenance targets on every project: the code input to the generator, and the code generator itself. (Or three maintenance targets, if you count the rest of the application code separately.)
For example, there is Jakarta XML Binding -- aka JAXB -- that generates Java code to automate reading XML files as class instances. It's largely deprecated now.
I'm using a bit of code generation in my work-in-progress Wrapd "SQL amplifier" library that makes it easier to integrate SQL DBMS querying into Java in general and Java Streams in particular.
With both JAXB and Wrapd, the goal is to make it easier to use Java with existing external data format definitions. The goal is not to replace Java programming with writing some simplified non-Java.
I agree. Code generation is OK if it's a complete solution (like Pegasus or cfront) but code generation front-ending a compiler is truly bad. It means the macros cannot have even the power of C++ templates, because they cannot see the types as the compiler does, so they cannot do type-sensitive expansion.
Yes.
Indeed, I think needing pure macro expansion to replace/generate Java either indicates a serious limitation of Java (which, in terms of implementing TTM type-system semantics, it has), and/or it indicates some serious limitation of the developer for whom it is intended. In either case, I'm not sure a macro system is the solution.
I disagree. Generics/templates solve the problem of specialising algorithms across types, which the native language cannot do. I want to expand code, not just specialise, and that needs something of quite a different quality from a mere language feature.
Except... It should be a "mere" language feature. If it's macros, as usual we can easily write native code, we can't easily read native code, and native code can't see macros at all.
Just to be clear: adding a TTM-like RA to Java should be an acid test of this approach. Java is well able to express the necessary logic, but at the cost of much boilerplate and runtime type checking. A good enough meta-programming front-end should fix both. But right now I don't know how to do that.
It's notionally straightforward to generate code, but I can't see how it solves the usual structural vs nominal typing issues for tuple and relation types.
That's not how to see the problem. Instead, follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voilà: M!
How does that solve the structural vs nominal typing issue?
That really is the main issue, and merely generating some code from a macro/annotation/whatever doesn't help with all the native code that needs to obey the same semantics, but doesn't.
For that, I think a transpiler or even a compiler may be more appropriate. Of course, that brings up all the usual issues of wanting/needing to create a hefty toolchain including debugger support, IDE plugins, etc., to make it of any production use.
The toolchain concern applies. The critical parts are:
- editor and debugger: the user needs to be able to see the expanded text as and when needed
- debugger should see both levels
- expansion error messages should be highly informative.
As an aside, a few years ago I wrote (for use as examples in a class I was teaching) a to-Java transpiler for a simple JavaScript-like language, with semantics not found in Java (dynamically typed with nestable functions) to prevent it simply being syntax translation. The effort was rather non-trivial, at least for a small, over-several-evenings project. Transpiling to languages of notionally-equivalent "level" -- 3GL to 3GL -- but different semantics is hard, fraught with gotchas, and is somewhat toilsome. Compiling down a "level" -- 3GL to 1 or 2GL -- is arguably easier. Aside over.
Perhaps a more reasonable alternative, at least in terms of the TTM type-system semantics, is to dispense with the TTM type-system and work within the semantic limitations of Java (or whatever host language.) That's essentially what I've done / am doing with my Wrapd library.
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose, and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher, by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
What is missing is
?
Quote from Erwin on March 22, 2021, 5:18 pm
Quote from Dave Voorhis on March 22, 2021, 11:40 am
but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Hahah. I have long lost every form of connection with what is being discussed here (if anything significant at all), but this one is something I feel I can still connect to: Quis custodiet ipsos custodes? Who orchestrates the orchestrator? In recollection, being no longer that many years before my final retirement, one of the things I remember having occasionally observed is this fight between programmers (derogatory sense of the word, so read "codeshitters") and the kind of "true" software architects who regard the user, not the programmer, as the prime stakeholder of any software system, over who should be in control of how the machine behaves. Windows was invented so the operator (i.e. the end user) could be put in control of what happens and what does not. And from the very moment it started getting traction, programmers (the codeshitters) started to fight for being able to exert control over what the user was allowed to request from the program and what not. And the codeshitters in Redmond ultimately gave in.
This one's a gem, actually. I mean the part I quoted, not my response.
Quote from dandl on March 23, 2021, 3:22 am
Quote from Dave Voorhis on March 22, 2021, 11:40 am
Quote from dandl on March 22, 2021, 12:47 am
Quote from Dave Voorhis on March 21, 2021, 6:04 pm
Quote from dandl on March 19, 2021, 1:24 am
Quote from Dave Voorhis on March 18, 2021, 2:39 pm
Quote from dandl on March 18, 2021, 5:52 am
Again, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.

But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
Ah! But how do you choose which one to run first, in a project with multiple source files?
Or is it not a separate-compilation -> single-linked-executable sort of thing?
That's the whole point: every program is separate but they share common includes, modules, files, etc. No linking, just compile and run.
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros from the past 20 years, and explain why 'nothing is good about them'. [Obviously excluding bastardised Java annotations.]
Abuse of Java annotations to create a macro-like facility is just as bad as macros.
What it should be is an in-language feature. Java records are a good example of something that started out as boilerplate, got "fixed" with Lombok using annotations, and got fixed properly with Java 14+'s record construct.
Annotations are still being (ab)used heavily in (for example) Java Spring -- which is a Marmite-like love-it-or-hate-it thing -- but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Macros can write native code, can't easily read native code, and native code can't see macros at all. That's an essential problem with macros (and not the only one).
Except in Lisp, where they are part of the language and can write, read, and be manipulated by (any, including other macro) code.
There is no reason for compiled code to 'see' code at all, other than by reflection. And as I said elsewhere, macros can easily be made to see compiled code via reflection.
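For what it's worth, a minimal sketch of the kind of visibility meant here: standard Java reflection letting a (hypothetical) macro processor enumerate the members of an already-compiled class.

```java
import java.lang.reflect.Method;

public class Inspect {
    public static void main(String[] args) throws Exception {
        // A macro expander could load compiled classes and read their shape,
        // then use that information to drive type-sensitive expansion.
        Class<?> c = Class.forName("java.util.ArrayList");
        for (Method m : c.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName());
        }
    }
}
```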
That's not how to see the problem. Instead, follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voilà: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual operations (join, where, union, etc.), which you can use in 'normal' code. The M processor generates boilerplate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
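A sketch of how that proposal might read in Java (all names invented; the real design would be M's to define): the heading type H is one of the generated types, and the heading-preserving operators are ordinary generic methods over it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// H would be a generated heading/tuple type, e.g. the S class sketched earlier.
interface Relation<H> {
    Set<H> tuples();

    // where keeps the heading, so plain generics suffice.
    default Relation<H> where(Predicate<H> p) {
        Set<H> result = tuples().stream().filter(p).collect(Collectors.toSet());
        return () -> result;
    }

    // union also keeps the heading.
    default Relation<H> union(Relation<H> other) {
        Set<H> result = new HashSet<>(tuples());
        result.addAll(other.tuples());
        return () -> result;
    }

    // join is the hard one: its result heading is a *new* type, which is
    // exactly the boilerplate and heading inference the M processor would supply.
}
```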
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
Quote from Dave Voorhis on March 22, 2021, 11:40 amQuote from dandl on March 22, 2021, 12:47 amQuote from Dave Voorhis on March 21, 2021, 6:04 pmQuote from dandl on March 19, 2021, 1:24 amQuote from Dave Voorhis on March 18, 2021, 2:39 pmQuote from dandl on March 18, 2021, 5:52 amAgain, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.
But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
Ah! But how do you choose which one to run first, in a project with multiple source files?
Or is it not a separate-compilation -> single-linked-executable sort of thing?
That's the whole point: every program is separate but they share common includes, modules, files, etc. No linking, just compile and run.
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
Abuse of Java annotations to create a macro-like facility is just as bad as macros.
What it should be is an in-language feature. Java records are a good example of something that started out as boilerplate, got "fixed" with Lombok using annotations, and got fixed properly with Java 14+'s record construct.
But annotations are still being (ab)used heavily in (for example) Java Spring -- which is a marmite-like love-it-or-hate-it thing -- but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Macros can write native code, can't easily read native code, and native code can't see macros at all. That's always an essential problem with macros (and not the only one).
Except in Lisp, where they are part of the language and can write, read, and be manipulated by (any, including other macro) code.
There is no reason for compiled code to 'see' code at all, other than by reflection. And as I said elsewhere, macros can easily be made to see compiled code via reflection.
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voila: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
Quote from Dave Voorhis on March 23, 2021, 9:48 amQuote from dandl on March 23, 2021, 3:22 amQuote from Dave Voorhis on March 22, 2021, 11:40 amQuote from dandl on March 22, 2021, 12:47 amQuote from Dave Voorhis on March 21, 2021, 6:04 pmQuote from dandl on March 19, 2021, 1:24 amQuote from Dave Voorhis on March 18, 2021, 2:39 pmQuote from dandl on March 18, 2021, 5:52 amAgain, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.
But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
Ah! But how do you choose which one to run first, in a project with multiple source files?
Or is it not a separate-compilation -> single-linked-executable sort of thing?
That's the whole point: every program is separate but they share common includes, modules, files, etc. No linking, just compile and run.
Ok. The original point of this threadlet was the semantics of entrypoint.
A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.
In Powerflex, the specification of entrypoint in the code is per the user's choice.
So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.
In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
There are plenty of standalone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)
In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.
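A contrived sketch of that transpiler-style use, under the assumption that the declarative language x is a line of configuration settings and the target y is Java (names and format invented):

    // Expand one declarative setting into imperative host-language source --
    // the "poor man's compiler compiler" pattern in miniature.
    class ConfigExpander {
        static String expand(String name, String value) {
            return "public static final int " + name.toUpperCase() + " = " + value + ";";
        }
        public static void main(String[] args) {
            System.out.println(expand("timeout", "30"));
            // -> public static final int TIMEOUT = 30;
        }
    }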
What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely mentioned as the macro language that it is / they are) UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.
An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.
You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.
Abuse of Java annotations to create a macro-like facility is just as bad as macros.
What it should be is an in-language feature. Java records are a good example of something that started out as boilerplate, got "fixed" with Lombok using annotations, and got fixed properly with Java 14+'s record construct.
But annotations are still being (ab)used heavily in (for example) Java Spring -- which is a marmite-like love-it-or-hate-it thing -- but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Macros can write native code, can't easily read native code, and native code can't see macros at all. That's always an essential problem with macros (and not the only one).
Except in Lisp, where they are part of the language and can write, read, and be manipulated by (any, including other macro) code.
There is no reason for compiled code to 'see' code at all, other than by reflection. And as I said elsewhere, macros can easily be made to see compiled code via reflection.
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voila: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record. That means you're either forced to:
- Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
- Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.
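The nominal-typing horn of that dilemma is easy to demonstrate in Java (a minimal sketch; the X/Z names echo the examples above):

    // Two record types with identical attribute names and types...
    record X(int p, char q) {}
    record Z(int p, char q) {}

    class Demo {
        static void takeX(X x) {}
        public static void main(String[] args) {
            takeX(new X(1, 'a'));     // accepted: nominally the right type
            // takeX(new Z(1, 'a')); // rejected by javac: Z is not X, despite identical structure
        }
    }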
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
All of our D implementations have the usual set of operations you can use in normal code.
But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to fail to compile because p.union(q) is invalid because p and q have headings that aren't union-compatible.
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.
Quote from dandl on March 23, 2021, 3:22 amQuote from Dave Voorhis on March 22, 2021, 11:40 amQuote from dandl on March 22, 2021, 12:47 amQuote from Dave Voorhis on March 21, 2021, 6:04 pmQuote from dandl on March 19, 2021, 1:24 amQuote from Dave Voorhis on March 18, 2021, 2:39 pmQuote from dandl on March 18, 2021, 5:52 amAgain, the distinction doesn't matter. Is a developer who makes a code generator for himself any different from a developer who makes a code generator for his colleagues? Or for his colleagues in other companies?
Not really.
Totally. It's about specialisation. A tool made to solve an immediate problem (a Bash script to fix text) provides a productivity boost over editing by hand. A tool made to solve a class of problems (Bash itself) provides an enormous boost for many users. Most people can't and don't build tools for repeated use; they depend on those who do.
But per that consideration, the Powerflex example isn't particularly clear, either. It doesn't show where the entry point is at all, so is it specified elsewhere?
What entry point? It just executes the code, top to bottom. Everything else is accidental complexity. Run the program, get the answer. Period.
So is every Powerflex program only a single file, like a bash script?
No separate compilation, libraries or modules?
Powerflex has programs, modules and functions, but limited types. Like many 4GLs, forms and database files exist as types within the program. It can call Windows DLLs and COM objects, much like Python.
It has better separate compilation than most languages, as programs can easily be recompiled and replaced in a running system, because compiled programs are separate files.
Cool. So back to the point: how do you specify the entry point to a Powerflex program?
There is an implied outer block comprising all the lines of code not inside a function/procedure.
If there are multiple such files, which one does it run first?
Whichever you choose. The first program is started from a config item, a command line or an icon, then it runs the next and so on. The config item defaults to a supplied menu, but none of th
Ah! But how do you choose which one to run first, in a project with multiple source files?
Or is it not a separate-compilation -> single-linked-executable sort of thing?
That's the whole point: every program is separate but they share common includes, modules, files, etc. No linking, just compile and run.
Ok. The original point of this threadlet was the semantics of entrypoint.
A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.
In Powerflex, the specification of entrypoint in the code is per the user's choice.
So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.
In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
There are plenty of standalone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)
In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.
What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely mentioned as the macro language that it is / they are) UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.
An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.
You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.
Abuse of Java annotations to create a macro-like facility is just as bad as macros.
What it should be is an in-language feature. Java records are a good example of something that started out as boilerplate, got "fixed" with Lombok using annotations, and got fixed properly with Java 14+'s record construct.
But annotations are still being (ab)used heavily in (for example) Java Spring -- which is a marmite-like love-it-or-hate-it thing -- but even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Macros can write native code, can't easily read native code, and native code can't see macros at all. That's always an essential problem with macros (and not the only one).
Except in Lisp, where they are part of the language and can write, read, and be manipulated by (any, including other macro) code.
There is no reason for compiled code to 'see' code at all, other than by reflection. And as I said elsewhere, macros can easily be made to see compiled code via reflection.
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voila: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record. That means you're either forced to:
- Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
- Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
All of our D implementations have the usual set of operations you can use in normal code.
But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to fail to compile because p.union(q) is invalid because p and q have headings that aren't union-compatible.
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.
Quote from dandl on March 23, 2021, 10:24 amQuote from Erwin on March 22, 2021, 5:18 pmQuote from Dave Voorhis on March 22, 2021, 11:40 ambut even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Hahah. I have long lost every form of connect with what is being discussed here (if anything significant at all), but this one is something I feel like I can still connect to : Quis custodiet ipsos custodes ? Who orchestrates the orchestrator ? In recollection, being no longer that many years before my final retirement, one of the things I remember to have occasionally observed is this fight between programmers (derogatory sense of the word, so read "codeshitters") and the kind of "true" software architects who regard the user, not the programmer, as the prime stakeholder of any software system, over who should be in control over how the machine behaves. Windows was invented so the operator (i.e. the end user) could be put in control of what happens and what happens not. And from the very moment it started getting traction, programmers (the codeshitters) started to fight for being able to exert control over what the user was allowed to request from the program and what not. And the codeshitters in Redmond ultimately gave in.
This one's a gem, actually. I mean the part I quoted, not my response.
It's unusual to hear positive things about Microsoft. I've been working with them since before Windows, met BG 4 times, partnered with them in a start-up and been to Redmond twice. I have a high respect for their engineers and their genuine desire to make better products (when marketing doesn't get in the way).
There are some very bad companies out there (Google and Facebook up near the top), but I'm not sure we can blame them for languages that don't work the way we want. I don't know who we should blame exactly, but Richard Stallman is responsible for at least a share of the worst of the worst.
Quote from Erwin on March 22, 2021, 5:18 pmQuote from Dave Voorhis on March 22, 2021, 11:40 ambut even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Hahah. I have long lost every form of connect with what is being discussed here (if anything significant at all), but this one is something I feel like I can still connect to : Quis custodiet ipsos custodes ? Who orchestrates the orchestrator ? In recollection, being no longer that many years before my final retirement, one of the things I remember to have occasionally observed is this fight between programmers (derogatory sense of the word, so read "codeshitters") and the kind of "true" software architects who regard the user, not the programmer, as the prime stakeholder of any software system, over who should be in control over how the machine behaves. Windows was invented so the operator (i.e. the end user) could be put in control of what happens and what happens not. And from the very moment it started getting traction, programmers (the codeshitters) started to fight for being able to exert control over what the user was allowed to request from the program and what not. And the codeshitters in Redmond ultimately gave in.
This one's a gem, actually. I mean the part I quoted, not my response.
It's unusual to hear positive things about Microsoft. I've been working with them since before Windows, met BG 4 times, partnered with them in a start-up and been to Redmond twice. I have a high respect for their engineers and their genuine desire to make better products (when marketing doesn't get in the way).
There are some very bad companies out there (Google and Facebook up near the top), but I'm not sure we can blame them for languages that don't work the way we want. I don't know who we should blame exactly, but Richard Stallman is responsible for at least a share of the worst of the worst.
Quote from Dave Voorhis on March 23, 2021, 10:32 amQuote from dandl on March 23, 2021, 10:24 amQuote from Erwin on March 22, 2021, 5:18 pmQuote from Dave Voorhis on March 22, 2021, 11:40 ambut even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Hahah. I have long lost every form of connect with what is being discussed here (if anything significant at all), but this one is something I feel like I can still connect to : Quis custodiet ipsos custodes ? Who orchestrates the orchestrator ? In recollection, being no longer that many years before my final retirement, one of the things I remember to have occasionally observed is this fight between programmers (derogatory sense of the word, so read "codeshitters") and the kind of "true" software architects who regard the user, not the programmer, as the prime stakeholder of any software system, over who should be in control over how the machine behaves. Windows was invented so the operator (i.e. the end user) could be put in control of what happens and what happens not. And from the very moment it started getting traction, programmers (the codeshitters) started to fight for being able to exert control over what the user was allowed to request from the program and what not. And the codeshitters in Redmond ultimately gave in.
This one's a gem, actually. I mean the part I quoted, not my response.
It's unusual to hear positive things about Microsoft. I've been working with them since before Windows, met BG 4 times, partnered with them in a start-up and been to Redmond twice. I have a high respect for their engineers and their genuine desire to make better products (when marketing doesn't get in the way).
There are some very bad companies out there (Google and Facebook up near the top), but I'm not sure we can blame them for languages that don't work the way we want. I don't know who we should blame exactly, but Richard Stallman is responsible for at least a share of the worst of the worst.
He's responsible for gcc and Emacs. I'd hardly call either of those "the worst of the worst."
You can't blame the GPL for bad languages.
Google and Facebook may be castigated for their moral failings -- and Microsoft too, throughout the 1990s and early 2000s -- but all three of these companies have good engineers and sound processes to weed out bad engineering. All three have allowed bad engineering into the field, but I don't know any software engineering company that hasn't. We all try not to, but it happens -- often we only realise it in retrospect.
So much seems like a good idea at the time, but isn't.
Quote from dandl on March 23, 2021, 10:24 amQuote from Erwin on March 22, 2021, 5:18 pmQuote from Dave Voorhis on March 22, 2021, 11:40 ambut even the Spring / annotation fanbois will freely admit that one of the big limitations of Java annotations is that they can't be orchestrated with normal language code.
Hahah. I have long lost every form of connect with what is being discussed here (if anything significant at all), but this one is something I feel like I can still connect to : Quis custodiet ipsos custodes ? Who orchestrates the orchestrator ? In recollection, being no longer that many years before my final retirement, one of the things I remember to have occasionally observed is this fight between programmers (derogatory sense of the word, so read "codeshitters") and the kind of "true" software architects who regard the user, not the programmer, as the prime stakeholder of any software system, over who should be in control over how the machine behaves. Windows was invented so the operator (i.e. the end user) could be put in control of what happens and what happens not. And from the very moment it started getting traction, programmers (the codeshitters) started to fight for being able to exert control over what the user was allowed to request from the program and what not. And the codeshitters in Redmond ultimately gave in.
This one's a gem, actually. I mean the part I quoted, not my response.
It's unusual to hear positive things about Microsoft. I've been working with them since before Windows, met BG 4 times, partnered with them in a start-up and been to Redmond twice. I have a high respect for their engineers and their genuine desire to make better products (when marketing doesn't get in the way).
There are some very bad companies out there (Google and Facebook up near the top), but I'm not sure we can blame them for languages that don't work the way we want. I don't know who we should blame exactly, but Richard Stallman is responsible for at least a share of the worst of the worst.
He's responsible for gcc and Emacs. I'd hardly call either of those "the worst of the worst."
You can't blame the GPL for bad languages.
Google and Facebook may be castigated for their moral failings -- and Microsoft too, throughout the 1990s and early 2000s -- but all three of these companies have good engineers and sound processes to weed out bad engineering. All three have allowed bad engineering into the field, but I don't know any software engineering company that hasn't. We all try not to, but it happens -- often we only realise it in retrospect.
So much seems like a good idea at the time, but isn't.
Quote from Erwin on March 23, 2021, 7:38 pmQuote from Dave Voorhis on March 23, 2021, 10:32 ambut all three of these companies have good engineers and sound processes to weed out bad engineering.
"have good engineers" is a form of existential quantification and not in any way a proposition to the effect that those "good engineers" are either the majority or else at any rate still a heavyweight force to be seriously considered by those who ***are*** the majority within such companies. Ditto for "have sound processes". To have a process is not a statement to the effect that the results of carrying out that process are ultimately effectively considered when things really get decided. All that duly noted, I'll concede you are right.
Quote from Dave Voorhis on March 23, 2021, 10:32 ambut all three of these companies have good engineers and sound processes to weed out bad engineering.
"have good engineers" is a form of existential quantification and not in any way a proposition to the effect that those "good engineers" are either the majority or else at any rate still a heavyweight force to be seriously considered by those who ***are*** the majority within such companies. Ditto for "have sound processes". To have a process is not a statement to the effect that the results of carrying out that process are ultimately effectively considered when things really get decided. All that duly noted, I'll concede you are right.
Quote from dandl on March 23, 2021, 11:25 pmOk. The original point of this threadlet was the semantics of entrypoint.
Which I answered.
A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.
In Powerflex, the specification of entrypoint in the code is per the user's choice.
Correct.
So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.
In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.
It's not arbitrary, it's whatever the system designer chooses it to be. But what's your point?
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
There are plenty of standalone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)
In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.
M4 (1977) and ML/1 (1966) are of their time. I don't agree with those descriptions, but I don't care either. They have nothing to tell us.
What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely mentioned as the macro language that it is / they are) UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.
Deprecated by who? Those are all 30+ years old, a product of a different era, no lessons for new work.
An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.
Agreed, except that half the things you can do in C with macros you still can't do in Java. They are things a language should do, but via meta-programming (aka modern macros).
You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.
No, that's not why. It's because they were a product of their time, they worked well for what they did and we didn't know how to do it any better. They can still do things you can't do in any modern language.
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voila: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record. That means you're either forced to:
- Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
- Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.
This is wrong. The third option is:
- Represent heading or heading types as generated types and use M compile-time logic to ensure correctness and do heading inference.
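A hypothetical sketch of that third option (all names invented): the M processor emits one distinct Java type per inferred heading, so an ill-typed union is rejected by the host compiler rather than discovered at run-time.

    // Hypothetically generated by the M processor from headings inferred in the source:
    final class Heading_P_Q { /* p int, q char -- canonical attribute order */ }
    final class Heading_R   { /* r String */ }

    class Rel<H> {
        Rel<H> union(Rel<H> other) { return this; } // body handling elided
    }

    class Generated {
        static void demo(Rel<Heading_P_Q> a, Rel<Heading_P_Q> b, Rel<Heading_R> c) {
            a.union(b);    // compiles: same generated heading type
            // a.union(c); // rejected by javac: headings differ
        }
    }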
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
All of our D implementations have the usual set of operations you can use in normal code.
But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to fail to compile because p.union(q) is invalid because p and q have headings that aren't union-compatible.
Using M, safety is guaranteed at compile time.
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.
I think you probably want to keep the JVM, IDE and Java interoperability, but not existing code generation tools. IMO a good enough M with editor and debugger support would do everything you ask for.
Ok. The original point of this threadlet was the semantics of entrypoint.
Which I answered.
A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.
In Powerflex, the specification of entrypoint in the code is per the user's choice.
Correct.
So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint a matter of ad hoc convention in Powerflex, but it's still there.
In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.
It's not arbitrary, it's whatever the system designer chooses it to be. But what's your point?
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
There are plenty of stand alone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)
In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.
M4 (1977) and ML/1 (1966) are of their time. I don't agree with those descriptions, but I don't care either. They have nothing to tell us.
What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely mentioned as the macro language that it is / they are) UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.
Deprecated by who? Those are all 30+ years old, a product of a different era, no lessons for new work.
An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.
Agreed, except that half the things you can do in C with macros you still can't do in Java. They are things a language should do, but via meta-programming (aka modern macros).
You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.
No, that's not why. It's because they were a product of their time, they worked well for what they did and we didn't know how to do it any better. They can still do things you can't do in any modern language.
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to that generally. Voila: M!
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record. That means you're either forced to:
- Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
- Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.
This is wrong. The third option is:
- Represent heading or heading types as generated types and use M compile-time logic to ensure correctness and do heading inference.
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
All of our D implementations have the usual set of operations you can use in normal code.
But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to fail to compile because p.union(q) is invalid because p and q have headings that aren't union-compatible.
Using M, safety is guaranteed at compile time.
But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating a D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that us experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks.They are good enough for any programming task you can throw at them. However they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.
I think you probably want to keep the JVM, IDE and Java interoperability, but not existing code generation tools. IMO a good enough M with editor and debugger support would do everything you ask for.
Quote from Dave Voorhis on March 24, 2021, 11:53 am
Quote from dandl on March 23, 2021, 11:25 pm
Ok. The original point of this threadlet was the semantics of entrypoint.
Which I answered.
A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.
In Powerflex, the specification of entrypoint in the code is per the user's choice.
Correct.
So if there is to be a specified standard application entrypoint ("you need to run this file to start the application"), then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.
In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.
It's not arbitrary, it's whatever the system designer chooses it to be. But what's your point?
My point is in response to your original example of Java and C# being bad (I guess...) because they have explicit entry points, while Powerflex is good because it doesn't (I guess...).
I'm pointing out that you have to express an entry point somewhere, either explicitly and formally in Java and C# -- or implicitly and informally in Powerflex (and languages like PHP or Python) -- so one way or another, the semantics must be expressed.
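To make the contrast concrete -- a minimal sketch, with invented names, not from either post:

    // Java: the entry point is explicit and formal. The JVM will only
    // start an application at a method with exactly this signature.
    public class App {
        public static void main(String[] args) {
            System.out.println("Execution starts here, by definition.");
        }
    }

    // By contrast, in script-style languages (Powerflex, PHP, Python),
    // execution starts at the top of whichever file you choose to run
    // first -- the entry point is a deployment convention, not a
    // language construct.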
As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.
Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.
They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)
It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?
As far as macros go, nothing is good about them or hundreds of lines of boilerplate.
Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]
There are plenty of standalone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- say, a file of unified configuration settings, or a local search engine setup specification -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)
In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.
M4 (1977) and ML/1 (1966) are of their time. I don't agree with those descriptions, but I don't care either. They have nothing to tell us.
They're examples where macro systems are successful, still in use, and appropriate.
What is almost universally deprecated is the in-language macro processor: the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known -- and rarely acknowledged as the macro languages they are -- the UNIX/Linux shell scripting languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, the problem would have been better solved in-language had a suitable language existed at the time.
Deprecated by who? Those are all 30+ years old, a product of a different era, no lessons for new work.
Deprecated almost universally by language users and language designers. Ask around.
An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.
Agreed, except that half the things you can do in C with macros you still can't do in Java. They are things a language should do, but via meta-programming (aka modern macros).
The things you can do in C with pre-processor macros are things you almost certainly don't want to do in Java.
Conditional compilation is a typical example: compile this code for this environment but not that one. The problems are that (a) encoding system dependencies is almost categorically bad in Java; and (b) serializable instances may be moved over the wire from system to system, so conditional checks must happen at run time.
In short, conditional compilation via macros appears to solve a problem but actually makes other issues much, much worse. Better not to have macros, and do it properly in code.
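For instance, where C selects platform-specific code with #ifdef at compile time, idiomatic Java makes the same decision at run time -- a sketch, with invented paths and names:

    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class ConfigLocator {
        // Instead of C-style conditional compilation (#ifdef _WIN32),
        // branch on the platform at run time, so the same compiled
        // class works wherever it is deployed or serialised to.
        static Path configDir() {
            String os = System.getProperty("os.name").toLowerCase();
            return os.contains("win")
                    ? Paths.get(System.getenv("APPDATA"), "myapp")
                    : Paths.get(System.getProperty("user.home"), ".config", "myapp");
        }
    }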
You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.
No, that's not why. It's because they were a product of their time, they worked well for what they did and we didn't know how to do it any better. They can still do things you can't do in any modern language.
Pure text expansion -- which is what macros are -- has been entirely replaced in modern languages by template metaprogramming, higher-order functions, stronger and more expressive type systems, and so on. There isn't anything you can do with in-language macros that doesn't result in more problems than it solves.
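A standard illustration of the point -- the classic C MAX macro versus its type-safe Java counterpart (a sketch, not from the original posts):

    // C: #define MAX(a,b) ((a) > (b) ? (a) : (b))
    // Textual expansion evaluates arguments twice: MAX(i++, j) misbehaves.
    // Java: a generic method does the same job with no expansion hazards,
    // and the compiler checks that both arguments share a comparable type.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }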
That's not how to see the problem. So follow these steps:
- Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
- Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
- Create a macro language to do that generally. Voilà: M! (A sketch of such an expansion appears below.)
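For illustration only -- the shorthand referred to above isn't reproduced here, so both the M syntax and the generated Java below are hypothetical:

    // Hypothetical M shorthand (invented syntax):
    //   rel S {sno: String, city: String}
    //
    // Longhand Java a textual rewrite rule might emit for it:
    record S(String sno, String city) {}
    // ...plus a typed relation over it, given some Relation<T> library type:
    // Relation<S> s = Relation.of(new S("S1", "London"), new S("S2", "Paris"));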
How does that solve the structural vs nominal typing issue?
Why would you want to? Why not just solve the problem that made you think that was part of the solution?
It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record. That means you're either forced to:
- Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
- Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.
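In Java terms, the two horns of the dilemma look roughly like this (a sketch; type and variable names invented):

    import java.util.Map;

    public class HeadingDemo {
        // Option 1: nominal typing. Records with identical attributes are
        // still distinct, incompatible types as far as javac is concerned.
        record X(int p, char q) {}
        record Z(int p, char q) {}
        // X x = new Z(1, 'a');   // does not compile: X and Z are unrelated

        public static void main(String[] args) {
            // Option 2: structural typing, simulated dynamically. Headings
            // become run-time values, so "same heading" is only checkable
            // at run time, and javac guarantees nothing about the attributes.
            Map<String, Object> t1 = Map.of("p", 1, "q", 'a');
            Map<String, Object> t2 = Map.of("q", 'a', "p", 1);
            System.out.println(t1.equals(t2));   // true: attribute order is irrelevant
        }
    }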
This is wrong. The third option is:
- Represent heading or heading types as generated types and use M compile-time logic to ensure correctness and do heading inference.
If M replaces all Java code, then sure. If M is generating some code and you use Java for the rest, then the unsafety remains fully exposed.
I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.
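A minimal sketch of what that might look like on the Java side, with invented names (this is not dandl's actual code): each heading becomes a generated marker type, tuples are stored internally as ordered value lists, and the generic parameter makes mixed-heading operations a compile-time error:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    interface Heading { List<String> attrs(); }

    // Generated by M for heading {p int, q char} (hypothetical output):
    final class H_PQ implements Heading {
        public List<String> attrs() { return List.of("p", "q"); }
    }

    final class Relation<H extends Heading> {
        // List<Object> rather than Object[] so tuples compare by value.
        private final Set<List<Object>> tuples = new HashSet<>();

        // Only relations over the same generated heading type unify;
        // a mixed-heading union fails to compile rather than at run time.
        Relation<H> union(Relation<H> other) {
            Relation<H> result = new Relation<>();
            result.tuples.addAll(this.tuples);
            result.tuples.addAll(other.tuples);
            return result;
        }
    }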
Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.
I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
All of our D implementations have the usual set of operations you can use in normal code.
But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to fail to compile because p.union(q) is invalid because p and q have headings that aren't union-compatible.
Using M, safety is guaranteed at compile time.
Safety is only guaranteed if M replaces the host language entirely. If the intent is that you code in Java and use M to write some of the code for you, then safety is only "guaranteed" for the M-generated code. Safety that is trivially bypassed isn't safety.
But then, this (part of the) discussion started from a deliberate decision not to focus on practical production, but on what the ideal -- perhaps purely fantasy -- general purpose language for creating a D should look like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners and the cruft that us experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever, a full-fledged compiler, or something else is to be determined once the language design is (more) in hand.
So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose, and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited meta-programming. So the goal is shorter-safer-higher: adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.
No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.
I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.
Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.
I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.
I think you probably want to keep the JVM, IDE and Java interoperability, but not existing code generation tools. IMO a good enough M with editor and debugger support would do everything you ask for.
Just JVM interoperability is plenty. IDE integration is nice, but not necessary and can always be added later.
It isn't wholly clear what 'M' is, but it looks to me like it generates some definitions and expressions, leaving the majority of programming to be done in Java (or whatever host language), yes?
If so, it leaves the fundamental unsafety exposed. Mere text expansion -- presumably saving a few keystrokes -- is not very helpful.
Indeed, it's tantamount to the arguments I sometimes see in favour of dynamic typing, which focus on how much time is saved not having to hit as many keyboard keys, but ignore the time lost to dealing with code safety, readability, and maintainability issues.