The Forum for Discussion about The Third Manifesto and Related Matters


Life after D

Perhaps somewhat relevant to this discussion, "You are a program synthesizer" http://www.pathsensitive.com/2018/12/my-strange-loop-talk-you-are-program.html

Quote from tobega on March 24, 2021, 5:01 pm

Perhaps somewhat relevant to this discussion, "You are a program synthesizer" http://www.pathsensitive.com/2018/12/my-strange-loop-talk-you-are-program.html

Cool. Thanks for posting this.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on March 24, 2021, 11:53 am
Quote from dandl on March 23, 2021, 11:25 pm

Ok. The original point of this threadlet was the semantics of entrypoint.

Which I answered.

A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.

In Powerflex, the specification of entrypoint in the code is per the user's choice.

Correct.

So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.

In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.

It's not arbitrary, it's whatever the system designer chooses it to be. But what's your point?

My point is in response to your original example of Java and C# being bad (I guess...) because they have explicit entry points, but Powerflex is good because it doesn't (I guess...)

I'm pointing out that you have to express an entry point somewhere, either explicitly and formally in Java and C# -- or implicitly and informally in Powerflex (and languages like PHP or Python) -- so one way or another, the semantics must be expressed somewhere.
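To make the contrast concrete, here's a minimal sketch of Java's explicit, in-code entry point, with the external half of the specification (which class to launch) noted in a comment. The class name is invented for the example.

```java
// Java's formalised entry point: the JVM looks for exactly this signature
// in whatever class it is told to launch.
public class Hello {
    static String greeting() {
        return "Hello from the designated entry point";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
// The *choice* of entry class still lives outside the code -- on the
// command line ("java Hello") or in a JAR manifest ("Main-Class: Hello")
// -- so part of the entry-point specification is external either way.
```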

Not bad, just short of choices. All programs run inside some kind of framework, and there is always somebody else's code that runs before yours. Java provides a single option for user console programs, inherited from C, and the Java framework forces you to follow that convention, at compile time. Powerflex provides multiple options for how programs get launched, mostly configurable at runtime because the framework is configurable and modifiable.

Event driven programs tend to have multiple entry points, with control returning to the framework in between. Java and Powerflex can do that too.

As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.

Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.

They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)

It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?

As far as macros go, nothing is good about them or hundreds of lines of boilerplate.

Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]

There are plenty of stand alone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)

In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.

M4 (1977) and ML/1 (1966) are of their time. I don't agree with those descriptions, but I don't care either. They have nothing to tell us.

They're examples where macro systems are successful, still in use, and appropriate.

They still have no lessons to guide us what to do today. They are history.

What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely acknowledged as macro languages): the UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.

Deprecated by who? Those are all 30+ years old, a product of a different era, no lessons for new work.

Deprecated almost universally by language users and language designers. Ask around.

Why would that be a good idea? Ask around, and you'll find anti-vaxxers and conspiracy theorists. I prefer science, facts and good engineering. You haven't provided any.

An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.

Agreed, except that half the things you can do in C with macros you still can't do in Java. They are things a language should do, but via meta-programming (aka modern macros).

The things you can do in C with pre-processor macros are things you almost certainly don't want to do in Java.

Conditional compilation is a typical example: compile this code for this environment but not that environment. The problems are that (a) encoding system dependencies is almost categorically bad in Java; and (b) serializable instances may be moved over the wire from system to system, so conditional checks must be made at run-time.

In short, the problem that conditional compilation via macros appears to solve actually makes other issues much, much worse. Better not to have macros, and do it properly in code.
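Here's a sketch of the run-time alternative being argued for: where C would use #ifdef to compile one branch or the other, the Java version compiles and type-checks both branches and decides when the code runs, so the check travels with a serialized instance. The example is illustrative, not from any particular codebase.

```java
// A run-time equivalent of what C does with "#ifdef _WIN32" conditional
// compilation. Both branches are compiled and type-checked; the choice is
// made when the code runs, so a serialized instance moved to another
// system still behaves correctly there.
public class PathSeparator {
    public static String separator() {
        String os = System.getProperty("os.name", "").toLowerCase();
        return os.contains("win") ? "\\" : "/";
    }

    public static void main(String[] args) {
        System.out.println("Separator here: " + separator());
    }
}
```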

You can't. Look at any large application system built in Java and you'll find it wrapped in external tools precisely because of what Java can't do. You admit to using ad hoc code generation and you deplore Swing and its use of annotations, but you provide no alternative. I do.

You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.

No, that's not why. It's because they were a product of their time, they worked well for what they did and we didn't know how to do it any better. They can still do things you can't do in any modern language.

Pure text expansion -- which is what macros are -- has been entirely replaced in modern languages by template metaprogramming, higher-order functions, stronger and more expressive type systems, and so on. There isn't anything you can do with in-language macros that doesn't result in more problems than it solves.

No, they really aren't. Did you read the Wikipedia article?

That's not how to see the problem. So follow these steps:

  1. Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
  2. Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
  3. Create a macro language to do that generally. Voila: M!

How does that solve the structural vs nominal typing issue?

Why would you want to? Why not just solve the problem that made you think that was part of the solution?

It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record.  That means you're either forced to:

  • Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
  • Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.
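The first bullet is easy to see in Java, assuming records as the static representation (the names follow the text's x, y, z): two records with identical components remain distinct nominal types, and nothing in the language makes {p int, q char} and {q char, p int} the same heading type.

```java
// Nominal typing in Java: identical structure, distinct types.
public class NominalDemo {
    record X(int p, char q) {}              // heading x {p int, q char}
    record Z(int p, char q) {}              // heading z {p int, q char}
    record Y(char q, int p) {}              // heading y {q char, p int}

    public static void main(String[] args) {
        X x = new X(1, 'a');
        // Z z = x;                  // does not compile: X and Z are unrelated
        Z z = new Z(x.p(), x.q());   // only explicit, component-wise conversion
        Y y = new Y(x.q(), x.p());   // attribute order is also significant
        System.out.println(x + " " + z + " " + y);
    }
}
```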

Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.

This is wrong. The third option is:

  • Represent heading or heading types as generated types and use M compile-time logic to ensure correctness and do heading inference.

If M replaces all Java code, then sure. If M is generating some code and you use Java for the rest, then the unsafety remains fully exposed.

No it isn't. The code that is generated is provably safe.

I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.

Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.

I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?
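For what the Relation&lt;heading&gt; proposal seems to describe, here is a hand-written sketch of the shape such generated code could take (all names are hypothetical, invented for the sketch): the heading becomes a type parameter, so union only accepts a relation with the same heading type, moving that check to compile time.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical shape of M-generated code: one marker type per heading,
// used as a phantom type parameter so the Java compiler enforces
// union-compatibility. All names here are invented for the sketch.
public class TypedRelation<H> {
    final Set<Object[]> tuples = new HashSet<>();    // values stored as arrays

    TypedRelation<H> union(TypedRelation<H> other) { // only same-heading relations
        TypedRelation<H> r = new TypedRelation<>();
        r.tuples.addAll(tuples);
        r.tuples.addAll(other.tuples);
        return r;
    }

    interface HeadingPQ {}   // would be generated for {p int, q char}
    interface HeadingR {}    // would be generated for {r int}

    public static void main(String[] args) {
        TypedRelation<HeadingPQ> a = new TypedRelation<>();
        TypedRelation<HeadingPQ> b = new TypedRelation<>();
        TypedRelation<HeadingR> c = new TypedRelation<>();
        a.union(b);          // compiles: same heading type
        // a.union(c);       // does not compile: headings differ
        System.out.println("union size: " + a.union(b).tuples.size());
    }
}
```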

All of our D implementations have the usual set of operations you can use in normal code.

But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to refuse to compile p.union(q) just because p and q have headings that aren't union-compatible.
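A minimal sketch of that run-time behaviour (class and method names are invented, not any actual D implementation's API): the compiler happily accepts any r.union(s), and a heading mismatch only surfaces as an exception when the program runs.

```java
import java.util.*;

// A minimal dynamically-headed relation. The compiler accepts any
// r.union(s); a heading mismatch only surfaces as an exception at
// run time.
public class DynRelation {
    final Set<String> heading;                     // attribute names only, for brevity
    final Set<List<Object>> tuples = new HashSet<>();

    DynRelation(String... attrs) {
        heading = new LinkedHashSet<>(Arrays.asList(attrs));
    }

    DynRelation union(DynRelation other) {
        if (!heading.equals(other.heading))        // checked at run time, not compile time
            throw new IllegalArgumentException(
                "not union-compatible: " + heading + " vs " + other.heading);
        DynRelation result = new DynRelation(heading.toArray(new String[0]));
        result.tuples.addAll(tuples);
        result.tuples.addAll(other.tuples);
        return result;
    }

    public static void main(String[] args) {
        DynRelation p = new DynRelation("p", "q");
        DynRelation q = new DynRelation("q", "p");  // same heading as a *set*
        System.out.println("compatible: " + p.union(q).heading);
        // p.union(new DynRelation("p")) compiles fine -- and throws at run time.
    }
}
```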

Using M, safety is guaranteed at compile time.

Safety is only guaranteed if M replaces using another language. If it's intended that you code in Java and use M to write some code for you, then safety is only "guaranteed" for the M generated code. Safety is still trivially bypassed, and thus isn't safety.

Safety is guaranteed if the code is provably safe.

But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.

So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However they are very complex, quite verbose and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.

No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and indeed they are arguably too complex for beginners and non-professionals to grasp.

I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.

Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.

I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.

I think you probably want to keep the JVM, IDE and Java interoperability, but not existing code generation tools. IMO a good enough M with editor and debugger support would do everything you ask for.

Just JVM interoperability is plenty. IDE integration is nice, but not necessary and can always be added later.

It isn't wholly clear what 'M' is, but it looks to me like it generates some definitions and expressions, leaving the majority of programming to be done in Java (or whatever host language), yes?

If so, it leaves the fundamental unsafety exposed. Mere text expansion -- presumably saving a few keystrokes -- is not very helpful.

Indeed, it's tantamount to the arguments I sometimes see in favour of dynamic typing, which focus on how much time is saved not having to hit as many keyboard keys, but ignore the time lost to dealing with code safety, readability, and maintainability issues.

So as long as M allows programs that are provably type safe, readable and maintainable, you're happy?

 

Andl - A New Database Language - andl.org
Quote from dandl on March 25, 2021, 2:56 am
Quote from Dave Voorhis on March 24, 2021, 11:53 am
Quote from dandl on March 23, 2021, 11:25 pm

Ok. The original point of this threadlet was the semantics of entrypoint.

Which I answered.

A language like C, Java, C#, etc., embeds the specification of entrypoint in the code.

In Powerflex, the specification of entrypoint in the code is per the user's choice.

Correct.

So if there is to be a specified standard application entrypoint ("you need to run this file to start the application") then describing the entrypoint is a matter of ad hoc convention in Powerflex, but it's still there.

In short, it's always there, in a sense. It's simply a question of whether it's explicit and formalised, or ad hoc and arbitrary.

It's not arbitrary, it's whatever the system designer chooses it to be. But what's your point?

My point is in response to your original example of Java and C# being bad (I guess...) because they have explicit entry points, but Powerflex is good because it doesn't (I guess...)

I'm pointing out that you have to express an entry point somewhere, either explicitly and formally in Java and C# -- or implicitly and informally in Powerflex (and languages like PHP or Python) -- so one way or another, the semantics must be expressed somewhere.

Not bad, just short of choices. All programs run inside some kind of framework, and there is always somebody else's code that runs before yours. Java provides a single option for user console programs, inherited from C, and the Java framework forces you to follow that convention, at compile time.

No, it doesn't.

The entry point to certain types of Java programs is a static method called main that accepts an array of string arguments. That's by convention, not by requirement. Other kinds of entry points use other mechanisms.

My point is that the common criticism of Java (and C# and C and C++) -- that they have a "too verbose" entry point specification for certain kinds of programs, whilst Python, Ruby, PHP, and apparently Powerflex are "better" because they don't specify a "verbose" entry point in the code -- is completely bogus.

The point is simply that an entry point is normally explicit somewhere. If it's not in the language source code, it's in a configuration file or database table or something outside the language source code, and it's always roughly the same general verbosity. Whether the source code effectively says, "this here is the console entry point" or something outside the code says "that there is the console entry point", it's the same thing.

Powerflex provides multiple options for how programs get launched, mostly configurable at runtime because the framework is configurable and modifiable.

Indeed, but it's specified somewhere in both Java and Powerflex, and thus your claim that Powerflex offers some improvement over Java and C# in that respect, is in fact as disingenuous as the usual bogus claims that Python is superior to Java (and C#, C, C++, etc.) because "Hello, World" is a one-liner in Python but a 5-liner in Java.

Event driven programs tend to have multiple entry points, with control returning to the framework in between. Java and Powerflex can do that too.

As I understand it, the essence of annotations (as with C# attributes) is to attach metadata to things for code to interact with at runtime. They're not well suited to what I have in mind.

Indeed, the essence of annotations is supposed to be to attach metadata to code, to help things happen at runtime or at other times. For better or worse (mainly the latter, I'd say) they've grown a bit beyond that.

They can be used to generate code, as a sort-of half-baked meta-language. (Elsewhere, saying so will usually earn me a ranty response from an annotation fanboi pointing out that I'm old, don't know what I'm talking about, annotations are declarative which makes them excellent, and I should retire already and leave proper Java programming to younger and more capable developers, etc. That will inevitably draw responses from the anti-annotation brigade, and before long the forum fire department will need to be called in.)

It's a common story -- nobody takes macros seriously, and everyone has horror stories. So what's so good about writing hundreds of lines of boilerplate, or getting the editor to do it for you?

As far as macros go, nothing is good about them or hundreds of lines of boilerplate.

Why do you keep saying that? Show me any serious attempt at meta-programming/macros in the past 20 years and why 'nothing is good about them'? [Obviously excluding bastardised Java annotations.]

There are plenty of stand alone macro systems -- like M4 and ML/1 -- and they have their uses. A typical good application is turning a purely declarative language -- such as, say, a file of unified configuration settings, or a local search engine setup specification, etc. -- into source code in one or more standard imperative programming languages. They're essentially transpiler generation tools -- what Tanenbaum described as a "poor man's compiler compiler." (See https://ieeexplore.ieee.org/document/1702350)

In other words, using some macro tool to define language x, whose macro expansion produces code in language y -- where x and y are disjunct -- has its uses.

M4 (1977) and ML/1 (1966) are of their time. I don't agree with those descriptions, but I don't care either. They have nothing to tell us.

They're examples where macro systems are successful, still in use, and appropriate.

They still have no lessons to guide us what to do today. They are history.

What we do today is either still use such system where appropriate -- mainly discrete language x to language y purely syntactic translation -- or we've replaced them with better in-language facilities. In-language macro text replacement is questionable, generally because there are much safer, cleaner, more regulated and better-integrated in-language mechanisms like templates, type systems, higher-order constructs like lambdas, etc., that accomplish the same goals without the classic gotchas of generating opaque code that needs to be debugged, or being difficult to reason about, or being able to generate utterly bogus code that emits baffling errors from the generated code rather than the generating code.
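For instance, what C idiomatically does with a text macro such as #define MAX(a,b) ((a)>(b)?(a):(b)) -- complete with its double-evaluation hazard -- modern languages do in-language, fully type-checked, with no textual rewriting involved. A Java sketch (names invented for the example):

```java
import java.util.function.UnaryOperator;

// In-language replacements for classic C text macros: a generic method
// instead of a MAX() macro, and a higher-order function instead of
// code-shaped text.
public class InLanguage {
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;   // arguments evaluated exactly once
    }

    // Behaviour as a value: no macro needed to "expand" repetition.
    static UnaryOperator<Integer> twice(UnaryOperator<Integer> f) {
        return x -> f.apply(f.apply(x));
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));                  // 7
        System.out.println(max("apple", "pear"));       // pear
        System.out.println(twice(x -> x + 1).apply(5)); // 7
    }
}
```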

What is almost universally deprecated are in-language macro processors, like the C macro processor, the Lisp macro processor (though, arguably, by far the best of a bad lot), PHP-in-HTML, and perhaps the best-known (and rarely acknowledged as macro languages): the UNIX/Linux shell script languages like bash, ksh, csh, etc. All of these have their practical applications -- their macro ability being effectively a shortcut -- but in every case, there are ways the problem would have been better solved in-language had a suitable language existed at the time.

Deprecated by who? Those are all 30+ years old, a product of a different era, no lessons for new work.

Deprecated almost universally by language users and language designers. Ask around.

Why would that be a good idea? Ask around, and you'll find anti-vaxxers and conspiracy theorists. I prefer science, facts and good engineering. You haven't provided any.

I assume professional language users and language designers. You won't find many medically-trained anti-vaxxers, or professional historians advocating a broad swathe of conspiratorial nonsense.

I've provided multiple reasons why text-replacement macros are deprecated in favour of in-language mechanisms (like better type systems, higher-order abstractions, etc.) or language-to-language compilation.

Again, text-replacement macros are poorly-integrated (except, arguably, in Lisp), poorly-regulated, and almost invariably do no semantic checking and minimal syntax checking of the source language, which means debugging is purely in the emitted target language. (C++ templates have this problem too, though to a lesser degree.)

Mere macro text expansion is essentially like having an assistant write some code for you, but it leaves all the unsafety intact in the target language. If the text expansion generates all the text for you such that the generated code is essentially inaccessible (or, at least, you never need to manipulate it), that's better, but macro expansion is generally insufficient for that, since it relies on simple rule-based string-search or AST-based expansion.

It typically has no recognition of source language semantics -- and thus no checking of source language semantics -- beyond whatever is possible for string-search or AST-based expansion. That means all semantic (and often most of the syntactic) error checking comes from compiling the target language.

That means you write in language x, but get almost all your error messages for (generated, and possibly invisible) language y.

Instead, transpilation -- with both syntactic and semantic checking of the source language code -- is preferred, but, being a subset of compilation, it is well beyond the scope of macro systems.

An embedded macro language is always a shortcut workaround for the things your language can't do but should do. C macros are better handled with a mix of language features. PHP-in-HTML is better handled with template languages or HTML-generating UI frameworks. Shell scripts that rely on macro replacement are often better replaced with Python. Lisp is its own special world of weird, where its macro capability is equal parts beautiful and abominable, but that's Lisp as usual.

Agreed, except that half the things you can do in C with macros you still can't do in Java. They are things a language should do, but via meta-programming (aka modern macros).

The things you can do in C with pre-processor macros are things you almost certainly don't want to do in Java.

Conditional compilation is a typical example: compile this code for this environment but not that environment. The problems are that (a) encoding system dependencies is almost categorically bad in Java; and (b) serializable instances may be moved over the wire from system to system, so conditional checks must be made at run-time.

In short, the problem that conditional compilation via macros appears to solve actually makes other issues much, much worse. Better not to have macros, and do it properly in code.

You can't. Look at any large application system built in Java and you'll find it wrapped in external tools precisely because of what Java can't do. You admit to using ad hoc code generation and you deplore Swing and its use of annotations, but you provide no alternative. I do.

Swing doesn't use annotations. Spring uses annotations.

I don't use ad hoc code generation or annotations, I use controlled code generation in a very specific part of the system: to represent static SQL query result set rows as classes or records. That's where limited (and controlled) code generation works fairly well: to create static Java (or other language) code to represent static external data. It's still wholly safe, not because it's generated for you (that isn't safe, actually) but because it still relies on all the static type checking performed by the compiler.
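As a sketch of that kind of controlled generation (the query, table, and column names are invented for the example): the generator emits one plain record per result-set row shape, and from then on the Java compiler does all the checking.

```java
import java.util.List;

// The kind of code a controlled generator might emit for a static SQL
// query's result rows. Once generated, everything is checked by the
// Java compiler: a renamed column breaks the build at regeneration
// time, not in production.
public class GeneratedRow {
    // e.g. generated from: SELECT id, name, balance FROM account
    record AccountRow(long id, String name, java.math.BigDecimal balance) {}

    public static void main(String[] args) {
        List<AccountRow> rows = List.of(
            new AccountRow(1, "alice", new java.math.BigDecimal("10.50")));
        for (AccountRow r : rows)
            System.out.println(r.id() + " " + r.name() + " " + r.balance());
        // rows.get(0).nam()  -- a typo is caught at compile time, unlike a
        // string-keyed ResultSet lookup.
    }
}
```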

Your alternative appears (so far -- we don't really know much about it) to preserve all the problems of annotations whilst not fixing any of them.

You've already mentioned why in-language macros are deprecated in your assessment of Powerflex. They create an in-language out-of-language half-assed language that's hard to reason about, hard to debug, often hard to use, and is poorly integrated into its own host language.

No, that's not why. It's because they were a product of their time, they worked well for what they did and we didn't know how to do it any better. They can still do things you can't do in any modern language.

Pure text expansion -- which is what macros are -- has been entirely replaced in modern languages by template metaprogramming, higher-order functions, stronger and more expressive type systems, and so on. There isn't anything you can do with in-language macros that doesn't result in more problems than it solves.

No, they really aren't. Did you read the Wikipedia article?

That's not how to see the problem. So follow these steps:

  1. Write out longhand the Java code that would implement the shorthand version I gave elsewhere. Use your own native types, whatever you like.
  2. Devise a set of macro transformations (textual rewrite rules) that would convert the shorthand into the longhand.
  3. Create a macro language to do that generally. Voila: M!

How does that solve the structural vs nominal typing issue?

Why would you want to? Why not just solve the problem that made you think that was part of the solution?

It's a fundamental problem in representing TTM semantics. It's in conflict with the semantics of static structures in every popular programming language, so it precludes sensibly representing a heading or heading type as a typical static struct, class, or record.  That means you're either forced to:

  • Represent heading or heading types as static (declared at compile-time, such as structs, classes or records) nominally-typed structures such that heading (type) x {p int, q char} is a different heading type from y {q char, p int} and z {p int, q char}; or
  • Represent heading or heading types as dynamic structurally-typed structures (declared at runtime, typically values in containers or some generic/parametric hybrid of dynamic and static structures) such that heading (type) x {p int, q char} is the same heading type as y {q char, p int} and z {p int, q char}.

Macro substitution doesn't escape that fundamental issue. At best, it might generate some code for you, but that code is still effectively dynamically typed and as subject as ever to unsafe use in the host language.

This is wrong. The third option is:

  • Represent heading or heading types as generated types and use M compile-time logic to ensure correctness and do heading inference.

If M replaces all Java code, then sure. If M is generating some code and you use Java for the rest, then the unsafety remains fully exposed.

No it isn't. The code that is generated is provably safe.

You mean the code that is generated is safe per the target compiler (including safe in terms of the desired semantics)?

Or safe because the generated code is nominally inaccessible?

Or safe per the code generator but not safe in the target language?

I've already shown how to do that, using headings, generic types and (internally) an array of values rather than a record. The Java code is simple, but verbose. M is intended to fix that.

Again, it only fixes generating some code, but doesn't fix using that code in native code contexts unless you replace all the code with macro expansions. That, of course, is transpiler/compiler territory, which is essentially how all of us D implementers have done it. We've all written something notionally equivalent to your RA implementation, with some run-time-in-the-host-language way to specify headings and relation/tuple types, which becomes compile-time in our D implementations.

I think you're making some unstated assumptions. In my proposal a relation is a Relation<heading>. It has all the usual set of operations (join, where, union, etc) you can use in 'normal' code. The M processor generates boiler-plate code and types, and does the heading inference. It can also import previous definitions via reflection. What do you think is missing?

All of our D implementations have the usual set of operations you can use in normal code.

But "heading" is ultimately dynamically-typed. Safety is determined at application run-time, rather than being guaranteed by the host language compiler. UNION, for example, determines union-compatibility (same headings) at run-time. The C# compiler isn't going to refuse to compile p.union(q) just because p and q have headings that aren't union-compatible.

Using M, safety is guaranteed at compile time.

Safety is only guaranteed if M replaces using another language. If it's intended that you code in Java and use M to write some code for you, then safety is only "guaranteed" for the M generated code. Safety is still trivially bypassed, and thus isn't safety.

Safety is guaranteed if the code is provably safe.

Safety is guaranteed if the generated code is safe per the target compiler, whether it's exposed or not.

Or is safety "guaranteed" when the generated code is not necessarily safe per the target compiler (and the desired semantics), merely because it's not exposed?

But then, this (part of the) discussion started as a result of specifically not wanting to focus on practical production but instead focus on what the ideal, perhaps purely fantasy general purpose language to support creating D should be like. Again, I think it would be interesting to focus on the language itself -- this being an opportunity to dispense with both the cruft that catches out beginners, and the cruft that we experienced developers don't like -- and then look at how to implement it. Whether that turns out to be simple macro expansion for Java/C#/whatever or a full-fledged compiler or something else, is something to be determined once the language design is (more) in hand.

So my proposition is that the GP languages are largely complete, give or take a few minor tweaks. They are good enough for any programming task you can throw at them. However, they are very complex, quite verbose, and easily allow serious bugs, in part because the verbiage gets in the way of understanding what the code really does. And they have very limited metaprogramming. So the goal is shorter-safer-higher by adding meta-programming to a known GP language. And that meets your goal: it can create a D, or something very like it.

No, that only emits some code we could have manually written -- a minimal gain. It doesn't change the semantics of the host language, and indeed I would argue that the most popular languages are nowhere near good enough. They have antiquated type systems, negligible safety, only nominal typing, only single dispatch, and they are arguably too complex for beginners and non-professionals to grasp.

I don't think mere macro expansion is enough to tackle all that. Indeed, I suspect it adds to the problem rather than offering a solution.

Yes indeed, if you really want those specific things, you need to create a new language (Go, Rust, Swift, Kotlin, Dart, Julia etc seem to have a head start). But if you just want to solve problems and keep using familiar tools, M is the way to go.

I really want those specific things. That's the point. I do not want to keep using "familiar tools", because they're crap.

I think you probably want to keep the JVM, IDE and Java interoperability, but not existing code generation tools. IMO a good enough M with editor and debugger support would do everything you ask for.

Just JVM interoperability is plenty. IDE integration is nice, but not necessary and can always be added later.

It isn't wholly clear what 'M' is, but it looks to me like it generates some definitions and expressions, leaving the majority of programming to be done in Java (or whatever host language), yes?

If so, it leaves the fundamental unsafety exposed. Mere text expansion -- presumably saving a few keystrokes -- is not very helpful.

Indeed, it's tantamount to the arguments I sometimes see in favour of dynamic typing, which focus on how much time is saved not having to hit as many keyboard keys, but ignore the time lost to dealing with code safety, readability, and maintainability issues.

So as long as M allows programs that are provably type safe, readable and maintainable, you're happy?

Sure, but you're not going to be able to achieve those with macro substitution.

As a transpiler, certainly.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

My point is that the common criticism of Java (and C# and C and C++) for having a "too verbose" entry point specification for certain kinds of programs, whilst Python, Ruby, PHP, and apparently Powerflex are "better" because they don't specify a "verbose" entry point in the code is a completely bogus criticism.

The point is simply that an entry point is normally explicit somewhere. If it's not in the language source code, it's in a configuration file or database table or something outside the language source code, and it's always roughly the same general verbosity. Whether the source code effectively says, "this here is the console entry point" or something outside the code says "that there is the console entry point", it's the same thing.

Ok, now I get what you're on about. That wasn't my intention.

I started this out with one-liners intending to illustrate shorter-safer-higher. The whole entry point thing is a diversion.

  • shorter: my samples were 1 line of code instead of 4-5.
  • safer: fewer lines of code mean fewer places for bugs.
  • higher: leave out the cruft, just focus on the code that solves the problem.

You don't gain anything on the first use: as you point out, the entry point has to be specified somewhere. But you gain on the second use, if you can abstract away the detail. M has a DRY place: Don't Repeat Yourself (with syntactic cruft).

<snip> In-language macro text replacement is questionable, generally because there are much safer, cleaner, more regulated and better-integrated in-language mechanisms like templates, type systems, higher-order constructs like lambdas, etc., that accomplish the same goals without the classic gotchas of generating opaque code that needs to be debugged, or being difficult to reason about, or being able to generate utterly bogus code that emits baffling errors from the generated code rather than the generating code.

I agree: for many of the cases where macros were once used we now have something better. But not all. The main ones that come to mind are:

  • abstracting away syntactic repetition
  • conditional compilation to deal with platform dependencies
  • cross-cutting concerns such as asserts and logging
  • compile-time execution of code such as compile-time asserts, logging, checking/computing literal values, etc.

I've provided multiple reasons why text-replacement macros are deprecated in favour of in-language mechanisms (like better type systems, higher-order abstractions, etc.) or language-to-language compilation.

Actually, you haven't. You've made multiple assertions but given no basis for any of them. So how would those features tackle the list I gave above?

Again, text-replacement macros are poorly-integrated (except, arguably, in Lisp), poorly-regulated, and almost invariably do no semantic checking and minimal syntax checking of the source language, which means debugging is purely in the emitted target language. (C++ templates have this problem too, though to a lesser degree.)

Mere macro text expansion is essentially like having an assistant write some code for you, but it leaves all the unsafety intact in the target language. If the text expansion generates all the text for you such that the generated code is essentially inaccessible (or, at least you never need to manipulate it) that's better, but macro expansion is generally insufficient to express it as it uses simple rule-based string-search or AST-based expansion.

It typically has no recognition of source language semantics -- and thus no checking of source language semantics -- beyond whatever is possible for string-search or AST-based expansion. That means all semantic (and often most of the syntactic) error checking comes from compiling the target language.

So we won't do it like that. Easily done.

That means you write in language x, but get almost all your error messages for (generated, and possibly invisible) language y.

Instead, transpilation -- with both syntactic and semantic checking of the source language code -- is preferred, but as a subset of compilation is well beyond the scope of macro systems.

Actually, M is a transpiler, a source-to-source compiler, just as much as Babel or cfront.

I don't use ad hoc code generation or annotations, I use controlled code generation in a very specific part of the system: to represent static SQL query result set rows as classes or records. That's where limited (and controlled) code generation works fairly well: to create static Java (or other language) code to represent static external data. It's still wholly safe, not because it's generated for you (that isn't safe, actually) but because it still relies on all the static type checking performed by the compiler.

You use code generation because there is no language that can do the job. M can do that job, and is safe for the same reasons.

So as long as M allows programs that are provably type safe, readable and maintainable, you're happy?

Sure, but you're not going to be able to achieve those with macro substitution.

As a transpiler, certainly.

Well I guess that's progress. So, the aim of M is to use compile-time code execution to do the things macros can do and conventional syntactic language features cannot, but safely. One of those things is an implementation of my generic-based TTM-alike, but doing the heading checking and inference at compile time instead of at runtime. Another is to generate highly repetitive syntactic structures, like API definitions. Another is to do conditional and cross-cutting logic, but more safely. And overall to make application code shorter-safer-higher.

I need to do some research on newer languages that claim meta-programming features. It's hard to believe no-one has been down this path in the past 20 years, and it will be interesting to see how far they got.


Andl - A New Database Language - andl.org
Quote from dandl on March 26, 2021, 12:52 am

My point is that the common criticism of Java (and C# and C and C++) for having a "too verbose" entry point specification for certain kinds of programs, whilst Python, Ruby, PHP, and apparently Powerflex are "better" because they don't specify a "verbose" entry point in the code is a completely bogus criticism.

The point is simply that an entry point is normally explicit somewhere. If it's not in the language source code, it's in a configuration file or database table or something outside the language source code, and it's always roughly the same general verbosity. Whether the source code effectively says, "this here is the console entry point" or something outside the code says "that there is the console entry point", it's the same thing.

Ok, now I get what you're on about. That wasn't my intention.

I started this out with one-liners intending to illustrate shorter-safer-higher. The whole entry point thing is a diversion.

  • shorter: my samples were 1 line of code instead of 4-5.
  • safer: fewer lines of code mean fewer places for bugs.
  • higher: leave out the cruft, just focus on the code that solves the problem.

You don't gain anything on the first use: as you point out, the entry point has to be specified somewhere. But you gain on the second use, if you can abstract away the detail. M has a DRY place: Don't Repeat Yourself (with syntactic cruft).

<snip> In-language macro text replacement is questionable, generally because there are much safer, cleaner, more regulated and better-integrated in-language mechanisms like templates, type systems, higher-order constructs like lambdas, etc., that accomplish the same goals without the classic gotchas of generating opaque code that needs to be debugged, or being difficult to reason about, or being able to generate utterly bogus code that emits baffling errors from the generated code rather than the generating code.

I agree: for many of the cases where macros were once used we now have something better. But not all. The main ones that come to mind are:

  • abstracting away syntactic repetition

Functions and procedures, ideally.

Templates, less ideally.

If you can't elide syntactic repetition either with (primarily) functions and procedures and (secondarily) concise and clear syntax, then the language itself has limitations. "Fixing" it with macro expansion only adds technical debt; now you have two problems -- the original broken language and a new language inextricably bound to the broken language.
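
As a tiny illustration of the point, repetition that C programmers once handled with a macro is routinely captured by an ordinary higher-order function in Java (a hypothetical example -- names are mine):

```java
import java.util.function.Supplier;

public class Retry {
    // The repeated "try it N times" boilerplate is captured once,
    // in-language and type-checked, instead of being stamped out
    // by a macro at each call site.
    public static <T> T withRetries(int attempts, Supplier<T> action) {
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure, try again
            }
        }
        throw last; // all attempts failed; rethrow the last error
    }
}
```

A caller writes `Retry.withRetries(3, () -> fetchPage(url))` and the compiler checks the whole thing, including the result type.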

  • conditional compilation to deal with platform dependencies

Generally considered weak, particularly because compiled objects (e.g., Java .class or .jar files) are meant to be platform-neutral static objects which are often exchanged between platforms at runtime.

Thus, conditional compilation is dispensed with in favour of runtime checks. Where these have a breaking performance impact, you extract the platform differences into their own compilable units and handle the differences in the build system, not in the source code.
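
For instance, the usual Java idiom replaces #ifdef-style conditional compilation with an ordinary run-time check (a sketch, using the standard os.name system property; the class name and heuristic are mine):

```java
// Sketch: platform differences handled at run time rather than by
// conditional compilation, so the same .class file serves all platforms.
public class PlatformPaths {
    public static String pathSeparatorFor(String osName) {
        // Crude heuristic for illustration only: Windows gets backslash,
        // everything else gets forward slash.
        return osName.toLowerCase().contains("win") ? "\\" : "/";
    }

    public static String currentSeparator() {
        // The real decision consults the JVM's own property at run time.
        return pathSeparatorFor(System.getProperty("os.name"));
    }
}
```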

  • cross-cutting concerns such as asserts and logging

That's aspect-oriented programming. Macros are a crude solution. See https://en.wikipedia.org/wiki/Aspect-oriented_programming
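
For a flavour of an in-language approach in Java, the standard java.lang.reflect.Proxy can inject a cross-cutting concern like logging around every call (a minimal sketch with hypothetical names; real AOP frameworks such as AspectJ go much further):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class LoggingProxy {
    public static final List<String> LOG = new ArrayList<>();

    public interface Greeter {
        String greet(String name);
    }

    // Wrap any Greeter so every call is logged: the cross-cutting
    // concern lives in one place instead of being pasted into each method.
    public static Greeter logged(Greeter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("calling " + method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
            Greeter.class.getClassLoader(),
            new Class<?>[]{Greeter.class},
            handler);
    }

    public static String demo() {
        Greeter g = logged(name -> "hello, " + name);
        return g.greet("world"); // logged transparently
    }
}
```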

  • compile-time execution of code such as compile-time asserts, logging, checking/computing literal values, etc.

Should be in-language, to avoid the gotchas inherent in macro expansion.
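
Java's own small example of this: constant expressions on static finals are computed by the compiler itself, in-language (a minimal sketch; heavier compile-time checking in Java typically goes through annotation processors rather than macros):

```java
public class CompileTimeConstants {
    // javac folds this constant expression at compile time (a constant
    // expression per the JLS); an ill-typed expression is rejected when
    // the class is compiled, not when it runs.
    public static final int BUFFER_SIZE = 1 << 16;

    // Constant string concatenation is folded too.
    public static final String BANNER = "buf=" + BUFFER_SIZE;
}
```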

I've provided multiple reasons why text-replacement macros are deprecated in favour of in-language mechanisms (like better type systems, higher-order abstractions, etc.) or language-to-language compilation.

Actually, you haven't. You've made multiple assertions but given no basis for any of them. So how would those features tackle the list I gave above?

See the answers I gave above.

Again, text-replacement macros are poorly-integrated (except, arguably, in Lisp), poorly-regulated, and almost invariably do no semantic checking and minimal syntax checking of the source language, which means debugging is purely in the emitted target language. (C++ templates have this problem too, though to a lesser degree.)

Mere macro text expansion is essentially like having an assistant write some code for you, but it leaves all the unsafety intact in the target language. If the text expansion generates all the text for you such that the generated code is essentially inaccessible (or, at least you never need to manipulate it) that's better, but macro expansion is generally insufficient to express it as it uses simple rule-based string-search or AST-based expansion.

It typically has no recognition of source language semantics -- and thus no checking of source language semantics -- beyond whatever is possible for string-search or AST-based expansion. That means all semantic (and often most of the syntactic) error checking comes from compiling the target language.

So we won't do it like that. Easily done.

Indeed, "we won't do it like that."

What we do is design languages with desirable features that compile (of which transpilation is a subset) to an object language target.

That means you write in language x, but get almost all your error messages for (generated, and possibly invisible) language y.

Instead, transpilation -- with both syntactic and semantic checking of the source language code -- is preferred, but as a subset of compilation is well beyond the scope of macro systems.

Actually, M is a transpiler, a source-to-source compiler, just as much as Babel or cfront.

Is it?

All along, you appear to have been suggesting macro expansion.

Now it's a transpiler like CoffeeScript or TypeScript or LLVM C targets (see https://github.com/JuliaComputing/llvm-cbe)?

That's quite different from a macro system.

I don't use ad hoc code generation or annotations, I use controlled code generation in a very specific part of the system: to represent static SQL query result set rows as classes or records. That's where limited (and controlled) code generation works fairly well: to create static Java (or other language) code to represent static external data. It's still wholly safe, not because it's generated for you (that isn't safe, actually) but because it still relies on all the static type checking performed by the compiler.

You use code generation because there is no language that can do the job. M can do that job, and is safe for the same reasons.

I use code generation because it works neatly, and is wholly integrated into Java whilst nicely integrating and "amplifying" SQL. No new language is needed.

So as long as M allows programs that are provably type safe, readable and maintainable, you're happy?

Sure, but you're not going to be able to achieve those with macro substitution.

As a transpiler, certainly.

Well I guess that's progress. So, the aim of M is to use compile-time code execution to do the things macros can do and conventional syntactic language features cannot, but safely. One of those things is an implementation of my generic-based TTM-alike, but doing the heading checking and inference at compile time instead of at runtime. Another is to generate highly repetitive syntactic structures, like API definitions. Another is to do conditional and cross-cutting logic, but more safely. And overall to make application code shorter-safer-higher.

I need to do some research on newer languages that claim meta-programming features. It's hard to believe no-one has been down this path in the past 20 years, and it will be interesting to see how far they got.

Macro-based metaprogramming was pretty much dead-ended in academia about 40 years ago, though perhaps it took a bit longer for industry to catch up. Modern language design is categorically not focused superficially on "shorter-safer-higher", which trivialises real and ongoing efforts to achieve better language expressivity, maintainability, concurrency and safety.

See Kotlin, Coq, Groovy, Julia, Go, Agda, Scala, Rust, Swift, Haskell, Nim, Haxe.

The last two might be distantly close to both what you have in mind, and to what I have in mind. Maybe.

Every one of them explicitly tackles problems of programming in different (and sometimes contradictory) ways with different emphases, but they are all tackling the same fundamental issues.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

M is a meta-programming language, for writing programs that read and write programs. Macros are a subset of M.

MJ is M for Java. I have just learned about the Java annotation processor, which seems to be a clunky subset of MJ. Java has nothing else.

I am highly confident that you cannot carry out the 4 tasks I outlined (and many others) with non-M "languages with desirable language features that compile (of which transpilation is a subset) to an object language target". If you are hostile to M/MJ, you will find workarounds and partial solutions, but I am confident that no matter what you propose, I can produce an example that M/MJ can solve and that what you propose cannot. If you are using any tools of any kind to generate or modify Java source code you are already acknowledging the need for MJ, but those are just the tip of the iceberg. Building serious software needs M.

I intend to follow up by finding others who are successfully using meta-programming and languages that show how it can be done.

Andl - A New Database Language - andl.org
Quote from tobega on March 24, 2021, 5:01 pm

Perhaps somewhat relevant to this discussion, "You are a program synthesizer" http://www.pathsensitive.com/2018/12/my-strange-loop-talk-you-are-program.html

I found this unwatchable so I'll ask here: did he ever say exactly what his day job is? From his web site, it appears he is an academic engaged in research on what others might refer to as code generation and macros. Yes?

Andl - A New Database Language - andl.org
Quote from dandl on March 27, 2021, 9:52 am

M is a meta-programming language, for writing programs that read and write programs. Macros are a subset of M.

MJ is M for Java. I have just learned about the Java annotation processor, which seems to be a clunky subset of MJ. Java has nothing else.

I am highly confident that you cannot carry out the 4 tasks I outlined (and many others) with non-M "languages with desirable language features that compile (of which transpilation is a subset) to an object language target". If you are hostile to M/MJ, you will find workarounds and partial solutions, but I am confident that no matter what you propose, I can produce an example that M/MJ can solve and that what you propose cannot. If you are using any tools of any kind to generate or modify Java source code you are already acknowledging the need for MJ, but those are just the tip of the iceberg. Building serious software needs M.

I intend to follow up by finding others who are successfully using meta-programming and languages that show how it can be done.

I don't think M exists yet, let alone MJ, so it's a bit difficult to evaluate it (except from your arm-waving about it, I guess.)

Using tools to generate (not modify) Java source code is not a need for MJ, but simply a recognition that pre-existing static structures in external definitions -- like XML DTDs or SQL schemas or RESTful API specifications -- may benefit from having static Java class or record analogues, and creating them can be automated.

That isn't a need for a distinct language of any kind; that's simply integration of pre-existing external data definitions into Java. You typically aren't writing the XML DTDs or SQL schemas or RESTful API specifications because they exist already (and exposing Java as a RESTful API is a good and proper use of Java annotations.)

You're merely running this Java program that's pointed to them, and Java code appears over there as if you'd written it by hand. You're still working entirely -- and desirably -- in a Java domain.
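
The generator being described can be extremely small. A hypothetical sketch (names and schema are mine): a field-to-type description in, Java record source out, which the build then compiles like any hand-written file:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class RecordGenerator {
    // Given a name and an ordered field->type map (as might be read from
    // an SQL schema, XML DTD, or RESTful API spec), emit Java record
    // source text. The generated file gets full static type checking
    // from the ordinary Java compiler.
    public static String generate(String name, Map<String, String> fields) {
        String components = fields.entrySet().stream()
            .map(e -> e.getValue() + " " + e.getKey())
            .collect(Collectors.joining(", "));
        return "public record " + name + "(" + components + ") {}";
    }

    public static String demo() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("sno", "String");  // hypothetical supplier-table columns
        fields.put("status", "int");
        return generate("Supplier", fields);
    }
}
```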

Re "successfully using meta-programming and languages that show how it can be done," take a look at Nim (it even has macros!) and Haxe (with macros! https://haxe.org/manual/macro.html)

Though their macros are not text-replacement macros.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on March 27, 2021, 10:21 am
Quote from tobega on March 24, 2021, 5:01 pm

Perhaps somewhat relevant to this discussion, "You are a program synthesizer" http://www.pathsensitive.com/2018/12/my-strange-loop-talk-you-are-program.html

I found this unwatchable

Then scroll down and read the transcript.

so I'll ask here: did he ever say exactly what his day job is? From his web site, it appears he is an academic engaged in research on what others might refer to as code generation and macros. Yes?

Click the link in the bio sidebar to find http://www.jameskoppel.com/

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org