The Forum for Discussion about The Third Manifesto and Related Matters


Language safety missing when D meets Database

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

This situation, of a program being tightly controlled to satisfy a compiler but then free to fail for trivial reasons related to the runtime environment, is very familiar to all of us. That doesn't mean it's right. The most fundamental question is: you can write the program so it compiles but will it run? TTM has no answer, but then neither does any other language including C/C++, Java/C#, Python, JS, etc.

Contrast this with Smalltalk, Forth, MathCAD, and a very few others where the program you write is immediately part of the running environment and whole classes of failure reasons are simply avoided. The only modern programming language I know of that aspires to something like this is Erlang. Perhaps TTM/D should be one of them.

Andl - A New Database Language - andl.org
Quote from dandl on June 9, 2021, 12:50 am

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

This situation, of a program being tightly controlled to satisfy a compiler but then free to fail for trivial reasons related to the runtime environment, is very familiar to all of us. That doesn't mean it's right. The most fundamental question is: you can write the program so it compiles but will it run? TTM has no answer, but then neither does any other language including C/C++, Java/C#, Python, JS, etc.

Contrast this with Smalltalk, Forth, MathCAD, and a very few others where the program you write is immediately part of the running environment and whole classes of failure reasons are simply avoided. The only modern programming language I know of that aspires to something like this is Erlang. Perhaps TTM/D should be one of them.

Perhaps, though it's arguably out of scope for TTM, which intentionally focuses on certain language fundamentals and excludes anything about the environment in which the language runs; it makes no prescriptions for how database connections, exception handling, etc. should work.

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.
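As a rough illustration of an obligatory failure path, here is a minimal Python sketch. It is not how Rel works, and Python has no compiler to enforce anything; the obligation is approximated by the shape of the API, where both outcome handlers are required keyword arguments. The names `with_dependency`, `on_available`, `on_unavailable` and `flaky_connect` are all hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def with_dependency(
    acquire: Callable[[], T],
    *,
    on_available: Callable[[T], R],
    on_unavailable: Callable[[Exception], R],
) -> R:
    """Try to acquire an external resource; the caller must supply both handlers."""
    try:
        resource = acquire()
    except OSError as err:  # ConnectionError and friends are subclasses of OSError
        return on_unavailable(err)
    return on_available(resource)


def flaky_connect() -> str:
    # Hypothetical stand-in for a connection that worked yesterday but not today.
    raise ConnectionError("SQL DBMS unreachable")


result = with_dependency(
    flaky_connect,
    on_available=lambda conn: f"connected: {conn}",
    on_unavailable=lambda err: f"degraded: {err}",
)
print(result)  # degraded: SQL DBMS unreachable
```

Calling `with_dependency` without `on_unavailable` raises a `TypeError` before anything is attempted, which is the nearest a dynamic language gets to the compile-time obligation described above.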

But per TTM, that's all implementation-dependent.

Though TTM could perhaps be more rigorous about specifying obligatory handling of possible run-time failures in what it does specify. For example, under the IM, TREAT_AS_x(...) operators notionally blow up at runtime in some unspecified way if the cast fails. It should obligate that a code block be provided to handle the situation where the cast fails.
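A small sketch of what an obligatory failure block for a TREAT-style cast might look like, written in Python as an assumption since TTM prescribes no such syntax. The `treat_as` function and its `then`/`otherwise` parameters are hypothetical; `Ellipse` and `Circle` are just illustrative types.

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def treat_as(
    target: Type[T],
    value: object,
    *,
    then: Callable[[T], R],
    otherwise: Callable[[object], R],
) -> R:
    """Treat `value` as `target`; the failure branch is a required argument."""
    if isinstance(value, target):
        return then(value)
    return otherwise(value)


class Ellipse:
    pass


class Circle(Ellipse):
    def __init__(self, radius: float) -> None:
        self.radius = radius


def describe(shape: Ellipse) -> str:
    return treat_as(
        Circle,
        shape,
        then=lambda c: f"circle r={c.radius}",
        otherwise=lambda s: "not a circle",
    )


print(describe(Circle(2)))  # circle r=2
print(describe(Ellipse()))  # not a circle
```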

Or, UNION types should obligate that code blocks be provided to handle each option (or a default for any unhandled option) -- where relevant -- of a UNION type.
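The same idea for UNION types can be sketched as a handler table that is checked for coverage when it is built, rather than when a value arrives. This is a Python approximation under stated assumptions: the option types `Found`, `Missing` and `Unavailable` and the `make_handler` function are hypothetical, and a statically typed D would make this check at compile time instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Found:
    value: str


@dataclass(frozen=True)
class Missing:
    key: str


@dataclass(frozen=True)
class Unavailable:
    reason: str


# The options of the (hypothetical) union type.
LOOKUP_OPTIONS = (Found, Missing, Unavailable)


def make_handler(
    cases: Dict[type, Callable], default: Optional[Callable] = None
) -> Callable:
    """Build a dispatcher; refuse to build it if any option is unhandled."""
    unhandled = [opt.__name__ for opt in LOOKUP_OPTIONS if opt not in cases]
    if unhandled and default is None:
        raise TypeError("unhandled options: " + ", ".join(unhandled))

    def handle(result):
        if type(result) not in LOOKUP_OPTIONS:
            raise TypeError(f"{type(result).__name__} is not a lookup option")
        return cases.get(type(result), default)(result)

    return handle


report = make_handler({
    Found: lambda r: f"value is {r.value}",
    Missing: lambda r: f"no entry for {r.key}",
    Unavailable: lambda r: f"try later ({r.reason})",
})
print(report(Missing("S1")))  # no entry for S1
```

Leaving out an option without supplying `default` fails at `make_handler`, so the gap is found where the handlers are written, not where the data turns up.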

Of course, a lazy developer can simply leave the obligatory code blocks empty, thus effectively not handling errors, options, you-name-it. But the empty code blocks are explicit and easily identified, and (at least) the program won't crash at run-time.

I think we've discussed all this before, here. Haven't we?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 9, 2021, 8:25 am
Quote from dandl on June 9, 2021, 12:50 am

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

This situation, of a program being tightly controlled to satisfy a compiler but then free to fail for trivial reasons related to the runtime environment, is very familiar to all of us. That doesn't mean it's right. The most fundamental question is: you can write the program so it compiles but will it run? TTM has no answer, but then neither does any other language including C/C++, Java/C#, Python, JS, etc.

Contrast this with Smalltalk, Forth, MathCAD, and a very few others where the program you write is immediately part of the running environment and whole classes of failure reasons are simply avoided. The only modern programming language I know of that aspires to something like this is Erlang. Perhaps TTM/D should be one of them.

Perhaps, though it's arguably out of scope for TTM, which intentionally focuses on certain language fundamentals and excludes anything about the environment in which the language runs; it makes no prescriptions for how database connections, exception handling, etc. should work.

My point is that it should not. There is little point banging on about type inference to avoid runtime errors while being silent on catastrophic program failures caused by simply changing the data in the database (or catalog).

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

It could do better than that. It could say: your program has a dependency on X and is no longer runnable. Rather than waiting to get to that part of the program and failing, it could capture dependencies up front. Yes, a programmer might make a positive decision to run and handle the consequences, but the default should be to explicitly manage and track dependencies.
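Capturing dependencies up front might look something like the following Python sketch. It assumes each dependency can be probed cheaply before the program starts; the `Dependency` record, `preflight` and `run` are hypothetical names, and `required=False` stands in for the programmer's positive decision to run anyway and handle the consequences.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Dependency:
    name: str
    probe: Callable[[], bool]  # True if the dependency is usable right now
    required: bool = True      # False: run anyway and handle failure later


def preflight(deps: Iterable[Dependency]) -> List[str]:
    """Return the names of required dependencies that are not usable."""
    failed = []
    for dep in deps:
        try:
            ok = dep.probe()
        except Exception:
            ok = False  # a probe that blows up counts as unavailable
        if not ok and dep.required:
            failed.append(dep.name)
    return failed


def run(deps: Iterable[Dependency], main: Callable[[], str]) -> str:
    failed = preflight(deps)
    if failed:
        return "not runnable: missing " + ", ".join(failed)
    return main()


def broken_probe() -> bool:
    raise ConnectionError("host unreachable")


deps = [
    Dependency("orders_db", broken_probe),
    Dependency("weather_api", lambda: False, required=False),
]
print(run(deps, lambda: "ran"))  # not runnable: missing orders_db
```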

But per TTM, that's all implementation-dependent.

Though TTM could perhaps be more rigorous about specifying obligatory handling of possible run-time failures in what it does specify. For example, under the IM, TREAT_AS_x(...) operators notionally blow up at runtime in some unspecified way if the cast fails. It should obligate that a code block be provided to handle the situation where the cast fails.

Or, UNION types should obligate that code blocks be provided to handle each option (or a default for any unhandled option) -- where relevant -- of a UNION type.

Of course, a lazy developer can simply leave the obligatory code blocks empty, thus effectively not handling errors, options, you-name-it. But the empty code blocks are explicit and easily identified, and (at least) the program won't crash at run-time.

I think we've discussed all this before, here. Haven't we?

Not that I recall. This has all been triggered by (a) my recent pondering of the idea that the next big thing in languages ought to be safer (from which higher-shorter) and (b) the demoralising experience of trying to get a program to build and run without having to learn deeply about a new environment (an API server in .NET Core, in case you were wondering). In both cases the underlying problem is dependencies on stuff that I can't see and that can only be navigated by reading impenetrable documentation or finding someone else with the same problem on Stack Overflow. And then I realised that TTM/D suffers the same problems.

We scoff at C and its 'undefined behaviour' but in truth, what I am experiencing is the epitome of undefined behaviour and I'm over it. Seriously.

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on June 9, 2021, 8:25 am
Quote from dandl on June 9, 2021, 12:50 am

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

This situation, of a program being tightly controlled to satisfy a compiler but then free to fail for trivial reasons related to the runtime environment, is very familiar to all of us. That doesn't mean it's right. The most fundamental question is: you can write the program so it compiles but will it run? TTM has no answer, but then neither does any other language including C/C++, Java/C#, Python, JS, etc.

Contrast this with Smalltalk, Forth, MathCAD, and a very few others where the program you write is immediately part of the running environment and whole classes of failure reasons are simply avoided. The only modern programming language I know of that aspires to something like this is Erlang. Perhaps TTM/D should be one of them.

Perhaps, though it's arguably out of scope for TTM, which intentionally focuses on certain language fundamentals and excludes anything about the environment in which the language runs; it makes no prescriptions for how database connections, exception handling, etc. should work.

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

But per TTM, that's all implementation-dependent.

Though TTM could perhaps be more rigorous about specifying obligatory handling of possible run-time failures in what it does specify. For example, under the IM, TREAT_AS_x(...) operators notionally blow up at runtime in some unspecified way if the cast fails. It should obligate that a code block be provided to handle the situation where the cast fails.

Or, UNION types should obligate that code blocks be provided to handle each option (or a default for any unhandled option) -- where relevant -- of a UNION type.

Of course, a lazy developer can simply leave the obligatory code blocks empty, thus effectively not handling errors, options, you-name-it. But the empty code blocks are explicit and easily identified, and (at least) the program won't crash at run-time.

I think we've discussed all this before, here. Haven't we?

My first thought was that Rel probably satisfies the requirement, and Dave appears to agree, but it's way out of scope for TTM.  It could have been in scope for Tutorial D, if we had been a little more confident that implementations would appear, but even then I don't think we would have included it.  It's not needed for the tutorial and illustration purposes that were our primary motivation.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from dandl on June 9, 2021, 1:50 pm
Quote from Dave Voorhis on June 9, 2021, 8:25 am
Quote from dandl on June 9, 2021, 12:50 am

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

This situation, of a program being tightly controlled to satisfy a compiler but then free to fail for trivial reasons related to the runtime environment, is very familiar to all of us. That doesn't mean it's right. The most fundamental question is: you can write the program so it compiles but will it run? TTM has no answer, but then neither does any other language including C/C++, Java/C#, Python, JS, etc.

Contrast this with Smalltalk, Forth, MathCAD, and a very few others where the program you write is immediately part of the running environment and whole classes of failure reasons are simply avoided. The only modern programming language I know of that aspires to something like this is Erlang. Perhaps TTM/D should be one of them.

Perhaps, though it's arguably out of scope for TTM, which intentionally focuses on certain language fundamentals and excludes anything about the environment in which the language runs; it makes no prescriptions for how database connections, exception handling, etc. should work.

My point is that it should not. There is little point banging on about type inference to avoid runtime errors while being silent on catastrophic program failures caused by simply changing the data in the database (or catalog).

Perhaps true, but I'm not sure anyone has a definitive solution -- or at least one solution that applies to all cases. Current solutions depend on specific situations.

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

It could do better than that. It could say: your program has a dependency on X and is no longer runnable. Rather than waiting to get to that part of the program and failing, it could capture dependencies up front. Yes, a programmer might make a positive decision to run and handle the consequences, but the default should be to explicitly manage and track dependencies.

Rel explicitly manages and tracks dependencies, but its priority is to prevent you from creating a non-runnable state in the first place -- but only for internal dependencies.
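To make "prevent you from creating a non-runnable state" concrete, here is a toy Python catalog that refuses to drop a relvar while any operator still references it. This is an illustrative sketch only and makes no claim about how Rel actually implements its dependency tracking; the `Catalog` class and its method names are hypothetical.

```python
from typing import Dict, Iterable, Set


class Catalog:
    """Toy catalog: relvars, plus operators that reference relvars by name."""

    def __init__(self) -> None:
        self._relvars: Set[str] = set()
        self._operators: Dict[str, Set[str]] = {}

    def create_relvar(self, name: str) -> None:
        self._relvars.add(name)

    def create_operator(self, name: str, references: Iterable[str]) -> None:
        refs = set(references)
        missing = sorted(refs - self._relvars)
        if missing:
            # Analogous to a compile error: the operator can never be stored broken.
            raise LookupError("unknown relvars: " + ", ".join(missing))
        self._operators[name] = refs

    def drop_operator(self, name: str) -> None:
        del self._operators[name]

    def drop_relvar(self, name: str) -> None:
        users = sorted(op for op, refs in self._operators.items() if name in refs)
        if users:
            raise ValueError(f"cannot drop {name}: referenced by {', '.join(users)}")
        self._relvars.remove(name)


catalog = Catalog()
catalog.create_relvar("S")
catalog.create_operator("london_suppliers", ["S"])
try:
    catalog.drop_relvar("S")
except ValueError as err:
    print(err)  # cannot drop S: referenced by london_suppliers
```

Once `london_suppliers` is dropped, `drop_relvar("S")` succeeds, so every state the catalog can reach is one in which stored code still resolves.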

For external dependencies, I wouldn't want a single solution. Some external dependency failures are minor and temporary and might only prompt (what becomes) a response message like "The current weather forecast is unavailable. Here's the most recent forecast from <timestamp>."

Some dependency failures might disable specific functionality until code is recompiled/rewritten/whatever, but the rest of the application continues running.

Some dependency failures might warrant immediately shutting down and logging failure after sending a text/email to the senior administrator, etc.

I would want the flexibility to handle each external dependency failure as a (possibly) custom solution, hence my suggestion to obligate 'failure' code paths, via checked exceptions or some similar mechanism.
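The three kinds of response described above could be sketched as a per-dependency policy, chosen explicitly by the developer. This is a hedged Python illustration rather than a checked-exception mechanism; `OnFailure`, `Shutdown` and `call_dependency` are hypothetical names.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Set


class OnFailure(Enum):
    SERVE_STALE = "serve stale"
    DISABLE_FEATURE = "disable feature"
    SHUT_DOWN = "shut down"


class Shutdown(Exception):
    """Raised when a failure warrants stopping the application."""


def call_dependency(
    name: str,
    fetch: Callable[[], str],
    policy: OnFailure,
    cache: Dict[str, str],
    disabled: Set[str],
) -> Optional[str]:
    """Call an external dependency; on failure, apply the policy chosen for it."""
    try:
        value = fetch()
    except OSError:
        if policy is OnFailure.SERVE_STALE:
            stale = cache.get(name, "none")
            return f"{name} is unavailable. Most recent value: {stale}"
        if policy is OnFailure.DISABLE_FEATURE:
            disabled.add(name)  # the rest of the application keeps running
            return None
        raise Shutdown(f"{name} failed; notify the administrator")
    cache[name] = value
    return value


def down() -> str:
    raise ConnectionError("no route to host")


cache: Dict[str, str] = {"forecast": "sunny"}
disabled: Set[str] = set()
print(call_dependency("forecast", down, OnFailure.SERVE_STALE, cache, disabled))
# forecast is unavailable. Most recent value: sunny
```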

But per TTM, that's all implementation-dependent.

Though TTM could perhaps be more rigorous about specifying obligatory handling of possible run-time failures in what it does specify. For example, under the IM, TREAT_AS_x(...) operators notionally blow up at runtime in some unspecified way if the cast fails. It should obligate that a code block be provided to handle the situation where the cast fails.

Or, UNION types should obligate that code blocks be provided to handle each option (or a default for any unhandled option) -- where relevant -- of a UNION type.

Of course, a lazy developer can simply leave the obligatory code blocks empty, thus effectively not handling errors, options, you-name-it. But the empty code blocks are explicit and easily identified, and (at least) the program won't crash at run-time.

I think we've discussed all this before, here. Haven't we?

Not that I recall.

I recall Antc and I having a discussion about obligatory code paths for handling options in a UNION, with (if I recall correctly) side discussions around obligatory code paths for errors, etc. But that's going back a few years, and it's quite possible the main discussion and the side discussions were years apart.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Hugh on June 9, 2021, 1:55 pm
Quote from Dave Voorhis on June 9, 2021, 8:25 am
Quote from dandl on June 9, 2021, 12:50 am

The language D described by TTM is reasonably safe. Although it was not an explicit design goal, the value types and requirement for type inference largely guarantee that if a program compiles it will run. There are still a few gaps, such as conversion and arithmetic errors, infinite loops and infinite recursion, which might be worth addressing.

On the other hand the operating environment envisaged by TTM is completely unsafe. There is no guarantee that a program that depends on any database feature will run at all, or without error, or indeed perform any useful function. RM Pre 16,17,23,24,25 presume the existence of one or more external databases, each containing database relvars and a catalog, but set down no guiding principles about how they relate to each other. A database might be changed (somehow) so that a program that used to run no longer does so. Two databases might be incompatible as to entity names or types. A database constraint might be changed so that a database no longer satisfies it. And so on.

My first thought was that Rel probably satisfies the requirement, and Dave appears to agree, but it's way out of scope for TTM.  It could have been in scope for Tutorial D, if we had been a little more confident that implementations would appear, but even then I don't think we would have included it.  It's not needed for the tutorial and illustration purposes that were our primary motivation.

I don't think so. Arguably, one of the strengths of SQL is that it is part of the database. TTM could make it clear that

  • "defining and destroying types, operators, variables, constraints" always leaves those things in a consistent state
  • a D program will expect to use types, operators, etc residing in a database, both at compile time and run time
  • if D programs can have a dependency on a database in which they do not reside as a defined operator, set out some requirements as to how a connection is made and dependencies are resolved

There is not much point requiring type inference in RM Pre 18 at compile time, if some other program can redefine that type. These are high level system integrity requirements, similar to those already in TTM. Implementation is another matter.

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on June 9, 2021, 2:30 pm
Quote from dandl on June 9, 2021, 1:50 pm
Quote from Dave Voorhis on June 9, 2021, 8:25 am

My point is that it should not. There is little point banging on about type inference to avoid runtime errors while being silent on catastrophic program failures caused by simply changing the data in the database (or catalog).

Perhaps true, but I'm not sure anyone has a definitive solution -- or at least one solution that applies to all cases. Current solutions depend on specific situations.

And that indeed is the problem. We were speculating on future languages, but here is a serious and current problem and no-one seems to have a solution.

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

It could do better than that. It could say: your program has a dependency on X and is no longer runnable. Rather than waiting to get to that part of the program and failing, it could capture dependencies up front. Yes, a programmer might make a positive decision to run and handle the consequences, but the default should be to explicitly manage and track dependencies.

Rel explicitly manages and tracks dependencies, but its priority is to prevent you from creating a non-runnable state in the first place -- but only for internal dependencies.

For external dependencies, I wouldn't want a single solution. Some external dependency failures are minor and temporary and might only prompt (what becomes) a response message like "The current weather forecast is unavailable. Here's the most recent forecast from <timestamp>."

Some dependency failures might disable specific functionality until code is recompiled/rewritten/whatever, but the rest of the application continues running.

Some dependency failures might warrant immediately shutting down and logging failure after sending a text/email to the senior administrator, etc.

I would want the flexibility to handle each external dependency failure as a (possibly) custom solution, hence my suggestion to obligate 'failure' code paths, via checked exceptions or some similar mechanism.

It's dead easy to build in enough features that a skilled programmer can find a solution. What is hard is to make that not happen. The current almost universal paradigm is for the compiler to control the written code more or less tightly, but leave all the responsibility for external dependencies to be resolved by having the programmer add extra code. Knowing what code to write can be extremely challenging.

My question is simply: can we reverse that onus? Can we have languages so designed that the compiler tracks external dependencies and ensures they are handled as we choose, without the program ever failing at runtime? What would it take to achieve that?

You were hostile to the idea of C macros and quite happy to use reflection. Both are equally unsafe, but at least C macros are transparent and visible: you can see what they claim to do. I would ban reflection, and insist that the relevant information is made available at compile time, but safely.

Andl - A New Database Language - andl.org
Quote from dandl on June 10, 2021, 4:16 am
Quote from Dave Voorhis on June 9, 2021, 2:30 pm
Quote from dandl on June 9, 2021, 1:50 pm
Quote from Dave Voorhis on June 9, 2021, 8:25 am

My point is that it should not. There is little point banging on about type inference to avoid runtime errors while being silent on catastrophic program failures caused by simply changing the data in the database (or catalog).

Perhaps true, but I'm not sure anyone has a definitive solution -- or at least one solution that applies to all cases. Current solutions depend on specific situations.

And that indeed is the problem. We were speculating on future languages, but here is a serious and current problem and no-one seems to have a solution.

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

It could do better than that. It could say: your program has a dependency on X and is no longer runnable. Rather than waiting to get to that part of the program and failing, it could capture dependencies up front. Yes, a programmer might make a positive decision to run and handle the consequences, but the default should be to explicitly manage and track dependencies.

Rel explicitly manages and tracks dependencies, but its priority is to prevent you from creating a non-runnable state in the first place -- but only for internal dependencies.

For external dependencies, I wouldn't want a single solution. Some external dependency failures are minor and temporary and might only prompt (what becomes) a response message like "The current weather forecast is unavailable. Here's the most recent forecast from <timestamp>."

Some dependency failures might disable specific functionality until code is recompiled/rewritten/whatever, but the rest of the application continues running.

Some dependency failures might warrant immediately shutting down and logging failure after sending a text/email to the senior administrator, etc.

I would want the flexibility to handle each external dependency failure as a (possibly) custom solution, hence my suggestion to obligate 'failure' code paths, via checked exceptions or some similar mechanism.

It's dead easy to build in enough features that a skilled programmer can find a solution. What is hard is to make that not happen. The current almost universal paradigm is for the compiler to control the written code more or less tightly, but leave all the responsibility for external dependencies to be resolved by having the programmer add extra code. Knowing what code to write can be extremely challenging.

My question is simply: can we reverse that onus? Can we have languages so designed that the compiler tracks external dependencies and ensures they are handled as we choose, without the program ever failing at runtime? What would it take to achieve that?

It's almost certainly not the responsibility of the compiler, unless we redefine "compiler" to be something that runs periodically or generates code to run periodically (for an undefined value of "periodically", which could be anywhere between years and a kilohertz rate) to check dependency validity. Or, if the compiler obligates code paths to handle exceptions, which is what I advocate.

It is sometimes the responsibility of modern CI/CD chains, in which dependencies are appropriately wired and/or detected and trigger project/subproject rebuilds as appropriate. That works well when all dependencies are under the control of the CI/CD chain; less well when external dependencies are outside the organisation's control. Only periodic checking and/or in-code exception handling can, for example, identify and gracefully degrade when a vendor's API has today dropped an API call that worked yesterday, or dropped a column (which our code uses) from some SQL table.

You were hostile to the idea of C macros and quite happy to use reflection. Both are equally unsafe, but at least C macros are transparent and visible: you can see what they claim to do. I would ban reflection, and insist that the relevant information is made available at compile time, but safely.

The problem with C macros is they're a half-baked semi-language outside of the control of the language, which can all too easily generate subtle and difficult-to-find errors. They have no semantic awareness, and rely on crude (and often awkward) text replacement.

Reflection is via conventional library routines, and thus entirely under the control of the language. No problem there.

I have no objection to safe compile-time information or compile-time in-language mechanisms. They are the underlying basis for Java annotations. Java annotations aren't perfect by any means and are subject to their own kinds of abuses, but at least they overcome most of the flaws of C macros and their text-replacement, semantically-unaware, outside-of-language ilk.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

My question is simply: can we reverse that onus? Can we have languages so designed that the compiler tracks external dependencies and ensures they are handled as we choose, without the program ever failing at runtime? What would it take to achieve that?

It's almost certainly not the responsibility of the compiler, unless we redefine "compiler" to be something that runs periodically or generates code to run periodically (for an undefined value of "periodically", which could be anywhere between years and a kilohertz rate) to check dependency validity. Or, if the compiler obligates code paths to handle exceptions, which is what I advocate.

The guiding principle is: if it compiles, it runs. But really it's about safety, and in practice it comes down to dependencies.

  • A program with no dependencies written in most languages can still fail, but the possibilities are limited: casts, arithmetic, null pointers. Small changes to the language could avoid those and make the languages safe (or very nearly so).
  • Every dependency is a point of failure, but languages provide little help. You have to code for it, and to test that code you have to find a way to cause that dependency failure. This is not safe.

It is sometimes the responsibility of modern CI/CD chains, in which dependencies are appropriately wired and/or detected and trigger project/subproject rebuilds as appropriate. That works well when all dependencies are under the control of the CI/CD chain; less well when external dependencies are outside the organisation's control. Only periodic checking and/or in-code exception handling can, for example, identify and gracefully degrade when a vendor's API has today dropped an API call that worked yesterday, or dropped a column (which our code uses) from some SQL table.

Exactly, but that's a lot of work and it's not a general solution to the problem. My idea is:

  • language features that require you to declare dependencies
  • explicit system-provided features supporting various methods of 'graceful degradation'
  • any compilation unit can be queried as to the state of its dependencies.

You can write code to check for missing dependencies if you choose, or rely on default behaviour: eg the system might provide a stub API or a dummy column. If you choose to write logic to handle this failure, the system can also trigger that same behaviour for testing purposes.
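As a rough illustration of the three points above (declared dependencies, system-provided degradation, queryable state), here is a library-level sketch in Java. It is only an approximation: the proposal is for a language feature, whereas `DependencyRegistry` and all of its methods are hypothetical names invented here, and probing a dependency by calling it is a simplifying assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical registry: every external dependency is declared up front with a stub. */
class DependencyRegistry {
    enum State { AVAILABLE, MISSING }

    private static final class Declared {
        final Supplier<String> real;
        final Supplier<String> stub;
        boolean forcedFailure; // set by tests to simulate an outage

        Declared(Supplier<String> real, Supplier<String> stub) {
            this.real = real;
            this.stub = stub;
        }
    }

    private final Map<String, Declared> declared = new LinkedHashMap<>();

    /** Declare a dependency together with the default used when it is missing. */
    void declare(String name, Supplier<String> real, Supplier<String> stub) {
        declared.put(name, new Declared(real, stub));
    }

    /** Lets a test trigger the same degraded path that a real outage would. */
    void simulateFailure(String name, boolean fail) {
        lookup(name).forcedFailure = fail;
    }

    /** Use a dependency, degrading to its stub instead of failing. */
    String resolve(String name) {
        Declared d = lookup(name);
        if (!d.forcedFailure) {
            try {
                return d.real.get();
            } catch (RuntimeException unavailable) {
                // Fall through to the stub below.
            }
        }
        return d.stub.get();
    }

    /** Query the state of every declared dependency. */
    Map<String, State> status() {
        Map<String, State> result = new LinkedHashMap<>();
        for (String name : declared.keySet()) {
            result.put(name, probe(declared.get(name)));
        }
        return result;
    }

    private static State probe(Declared d) {
        if (d.forcedFailure) return State.MISSING;
        try {
            d.real.get(); // simplification: probing means calling the dependency
            return State.AVAILABLE;
        } catch (RuntimeException unavailable) {
            return State.MISSING;
        }
    }

    private Declared lookup(String name) {
        Declared d = declared.get(name);
        if (d == null) throw new IllegalArgumentException("undeclared dependency: " + name);
        return d;
    }
}
```

Because `resolve` rejects any name that was never declared, using an undeclared dependency fails immediately rather than at some later, harder-to-test moment.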

You were hostile to the idea of C macros and quite happy to use reflection. Both are equally unsafe, but at least C macros are transparent and visible: you can see what they claim to do. I would ban reflection, and insist that the relevant information is made available at compile time, but safely.

The problem with C macros is they're a half-baked semi-language outside of the control of the language, which can all too easily generate subtle and difficult-to-find errors. They have no semantic awareness, and rely on crude (and often awkward) text replacement.

Reflection is via conventional library routines, and thus entirely under the control of the language. No problem there.

Reflection is deeply unsafe, at least as bad as C macros. The library calls are easy enough, but the underlying purpose and intent are to do things that should be expressed in a language and checked by a compiler. Reflection introduces dependencies that are invisible to the compiler, impenetrable to the reader and basically an accident waiting to happen.
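A small Java sketch of the "invisible to the compiler" point, using the standard `java.lang.reflect` API. The `Account` and `ReflectionDemo` classes are made up for illustration. Both methods compile, but only the direct call is checked against the actual definition of `Account`.

```java
import java.lang.reflect.Method;

/** Hypothetical class standing in for any dependency a program calls into. */
class Account {
    public long balanceCents() {
        return 1250;
    }
}

class ReflectionDemo {
    /** Direct call: renaming balanceCents() breaks this at compile time. */
    static long direct(Account account) {
        return account.balanceCents();
    }

    /** Reflective call: the method name is just a string the compiler never checks. */
    static long reflective(Account account, String methodName)
            throws ReflectiveOperationException {
        Method method = Account.class.getMethod(methodName);
        return (Long) method.invoke(account);
    }
}
```

Calling `reflective(account, "balanceCent")` (a one-letter typo) still compiles and only fails when that line executes, with a `NoSuchMethodException`.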

I have no objection to safe compile-time information or compile-time in-language mechanisms. They are the underlying basis for Java annotations. Java annotations aren't perfect by any means and are subject to their own kinds of abuses, but at least they overcome most of the flaws of C macros and their text-replacement, semantically-unaware, outside-of-language ilk.

I have never proposed going back to C macros; the point here is that reflection is just as bad in its own way. I suspect a key part of the problem is that compilers are so hard to write and mostly impossible to modify, so we look for other ways to solve problems no matter how bad.

Andl - A New Database Language - andl.org
Quote from dandl on June 10, 2021, 4:16 am
Quote from Dave Voorhis on June 9, 2021, 2:30 pm

Rel is notionally (or at least started out as) a closed environment akin to Smalltalk, where the code is stored in the database and all dependencies are managed to ensure that if it compiles, it runs.

But then to make it practically useful, I extended it to reference external dependencies like SQL DBMSs via JDBC, CSV files and the like. Utility has obviously increased, but I can no longer guarantee that what compiles, runs. I can guarantee that a native Rel relvar can't be dropped until it has no code references, but I can't guarantee that a JDBC connection to an intermittently-available SQL DBMS which worked yesterday will work today. That currently means a run-time failure. What it should do -- but doesn't do, yet -- is have the compiler obligate the developer to provide some code path(s) to safely handle the situation where the external dependency has failed.

It could do better than that. It could say: your program has a dependency on X and is no longer runnable. Rather than waiting to reach that part of the program and then failing, it could capture dependencies up front. Yes, a programmer might make a positive decision to run and handle the consequences, but the default should be to explicitly manage and track dependencies.

... leave all the responsibility for external dependencies to be resolved by having the programmer add extra code. Knowing what code to write can be extremely challenging.

My question is simply: can we reverse that onus? Can we have languages so designed that the compiler tracks external dependencies and ensures they are handled as we choose, without the program ever failing at runtime? What would it take to achieve that?

The System/38, later re-badged as the AS/400, was a database machine. The Operating System knew, for each object-database, which source files its tables were compiled from, and which source/which schema any application was compiled against. When an application tried to open a table, the O.S. intervened and cross-checked schemas for all tables the application was compiled against. Any mismatched version anywhere just stopped the application running. The application couldn't get halfway through a transaction and then discover its API out of kilter with some called routine. (Schemas for APIs also had to be declared as if database tables, under the same version-control.)

Then IBM brought in the iSeries, alleged to be 'super-400', with gosh! SQL. And they sneakily retired all that O.S. control -- because SQL, apparently. Also because internet and the need for interconnectivity to any heap of old iron. Sounds like the same story as Rel.
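The open-time cross-check described for the System/38 can be sketched in a few lines of Java. This is a loose illustration only, not how the real O.S. worked: `SchemaGate`, the map-of-version-strings representation, and the table names in the usage note are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical loader check: compare compiled-against schemas with installed ones. */
class SchemaGate {
    /** Return a description of every table or API whose schema version differs. */
    static List<String> mismatches(Map<String, String> compiledAgainst,
                                   Map<String, String> installed) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : compiledAgainst.entrySet()) {
            String current = installed.get(e.getKey());
            if (current == null) {
                problems.add(e.getKey() + ": missing");
            } else if (!current.equals(e.getValue())) {
                problems.add(e.getKey() + ": compiled against " + e.getValue()
                    + ", found " + current);
            }
        }
        return problems;
    }

    /** Refuse to start at all unless every schema matches; never fail mid-transaction. */
    static void start(Map<String, String> compiledAgainst,
                      Map<String, String> installed,
                      Runnable application) {
        List<String> problems = mismatches(compiledAgainst, installed);
        if (!problems.isEmpty()) {
            throw new IllegalStateException("not runnable: " + problems);
        }
        application.run();
    }
}
```

For example, an application compiled against `ORDERS` at version `v3` would be refused by `start` if the installed `ORDERS` schema is `v4`, before any of its own code runs.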

You were hostile to the idea of C macros and quite happy to use reflection. Both are equally unsafe, but at least C macros are transparent and visible: you can see what they claim to do. I would ban reflection, and insist that the relevant information is made available at compile time, but safely.

Neither macros nor reflection provides proper version-control integrity; both rely on the programmer thinking to check what is (in their view) significant for safety.