ChatGPT implementation of the relational model.

#11 · April 13, 2023, 12:02 pm

Quote from AntC on April 13, 2023, 11:28 am

I guess the code Dave shows was copied holus bolus from somewhere.

In a sense, yes, and in a sense, no. It might be copied as-is, or it might be a translation via word proximity probability, which actually produces pretty good results (for a rather limited view of "pretty good.")

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#12 · April 13, 2023, 2:59 pm

The point of my post is that this is the best arms-length description of how it works that I've found. Find me a better one, I'll read it. Wolfram is smart, in business, and likes the sound of his own voice. He's still worth a read, and often has really snazzy diagrams.

Once you understand what it's doing it's bleeding obvious it can't generate good code. It can act as a kind of smart snippets provider ('write me a DeQue in Rust') to get you started, but the code is almost guaranteed to be badly wrong on some level. You save some typing, but you may well have more debugging than if you wrote it yourself, especially if it has to fit into an existing codebase.

Andl - A New Database Language - andl.org

#13 · April 14, 2023, 7:45 am

Oh, the Haskell mailing list tried a similar experiment. I didn't pay much attention, but the discussion has continued into this month.

ChatGPT did actually produce a whole program, and it compiles, and produces the right output -- at least in straightforward tests. I guess the exercise would be at Undergrad level.

OTOH, binary search on a Haskell List is preposterous and pointless: you can only access the list from head stepping through each elem to its end. So a 'dumb' linear scan is much more efficient than the 'answer'. Calculating length xs has already stepped all the way through the list to count the elements. Accessing the middle element (xs !! mid) has already stepped all the way through from 0 to mid.
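To put rough numbers on that argument, here is a minimal sketch, written in Python rather than Haskell so that traversal steps can be counted explicitly. It models a cons list with a hypothetical `Node` class and compares a binary search built on `length`-style counting and `!!`-style indexing against an early-exit linear scan. All names are illustrative assumptions, and the model ignores laziness.

```python
from typing import Optional


class Node:
    """One cell of a singly linked list, standing in for a Haskell cons cell."""

    def __init__(self, value: int, tail: "Optional[Node]" = None) -> None:
        self.value = value
        self.tail = tail


def from_values(values):
    """Build a linked list holding the values in the given order."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def length(head, counter):
    """Walk the whole list to count it, as `length xs` must."""
    n = 0
    while head is not None:
        counter[0] += 1
        n += 1
        head = head.tail
    return n


def index(head, i, counter):
    """Follow i links from the head, as `xs !! i` must."""
    for _ in range(i):
        counter[0] += 1
        head = head.tail
    return head.value


def binary_search(head, target):
    """Binary search over a linked list; returns (found, steps)."""
    counter = [0]
    lo, hi = 0, length(head, counter) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        v = index(head, mid, counter)  # restarts from the head every time
        if v == target:
            return True, counter[0]
        if v < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, counter[0]


def linear_search(head, target):
    """Scan from the head, stopping early because the list is sorted."""
    steps = 0
    while head is not None:
        steps += 1
        if head.value == target:
            return True, steps
        if head.value > target:
            return False, steps
        head = head.tail
    return False, steps


xs = from_values(list(range(1000)))
print(binary_search(xs, 999))  # (found, steps) for the binary search
print(linear_search(xs, 999))  # (found, steps) for the linear scan
```

On a sorted list of 1000 elements the binary search spends 1000 steps in `length` before its first probe, so it cannot beat the linear scan, which never takes more than 1000 steps in total.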

Of course Haskell has much more efficient data structures for ordered (or orderable) sets. I'd expect any Undergrad to point that out.

The only time this code might be more efficient than the 'dumb' scan is if it's computationally very expensive to evaluate elements. (Haskell being lazy by default.) OTOH (again), how would you be sure your list is in sorted sequence unless you've evaluated each element as you inserted it?
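The expensive-element case can be sketched the same way. The Python below is only an illustration under stated assumptions: each element is a hypothetical zero-argument function standing in for an unevaluated Haskell thunk, and the counter records how many elements a search forces. It deliberately ignores the cost of walking the list spine, so it measures element evaluations only.

```python
def make_delayed(n, counter):
    """Return n delayed elements; element i evaluates to i and records that it was forced."""
    def delayed(i):
        def force():
            counter[0] += 1  # one (notionally expensive) evaluation
            return i
        return force
    return [delayed(i) for i in range(n)]


def binary_search_forcing(elems, target):
    """Binary search that forces only the elements it probes."""
    lo, hi = 0, len(elems) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        v = elems[mid]()
        if v == target:
            return True
        if v < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False


def linear_search_forcing(elems, target):
    """Linear scan that forces every element up to the stopping point."""
    for force in elems:
        v = force()
        if v == target:
            return True
        if v > target:
            return False
    return False


def evaluations(search, n, target):
    """Count how many elements `search` forces on a sorted sequence 0..n-1."""
    counter = [0]
    search(make_delayed(n, counter), target)
    return counter[0]


print(evaluations(binary_search_forcing, 1000, 999))
print(evaluations(linear_search_forcing, 1000, 999))
```

The binary search forces roughly log2(n) elements while the scan can force all n, which is the one situation described above where it could win. The sketch simply assumes the input is sorted, so the objection about how that order was established without evaluating the elements still stands.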

#14 · May 2, 2023, 12:49 pm

Quote from dandl on April 13, 2023, 2:59 pm

The point of my post is that this is the best arms-length description of how it works that I've found. Find me a better one, I'll read it. Wolfram is smart, in business, and likes the sound of his own voice. He's still worth a read, and often has really snazzy diagrams.

Once you understand what it's doing it's bleeding obvious it can't generate good code. It can act as a kind of smart snippets provider ('write me a DeQue in Rust') to get you started, but the code is almost guaranteed to be badly wrong on some level. You save some typing, but you may well have more debugging than if you wrote it yourself, especially if it has to fit into an existing codebase.

A description that's at least shorter https://simonwillison.net/2023/Apr/2/calculator-for-words/

#15 · May 2, 2023, 1:21 pm

Fascinating. It certainly doesn't cover the same ground, but it does add a perspective I haven't seen before. Thanks!

Andl - A New Database Language - andl.org

#16 · May 3, 2023, 1:38 am

Quote from AntC on May 3, 2023, 1:38 am

Quote from tobega on May 2, 2023, 12:49 pm

A description that's at least shorter https://simonwillison.net/2023/Apr/2/calculator-for-words/

Hmm, no that doesn't match how I think of these Chat things: with a calculator, there's a very specific 'syntax' for asking the question; the onus is on the asker to ask in the right way; there's only one correct answer. What I mean by 'onus on the asker' is it's very easy to mess up your operators or the parens. The calculator will happily 'calculate' whatever formula you entered.

Even with very simple requests to a Chat thing, there are typically many possible answers/many phrasings. We (humans) might say the set of 'right' answers amounts to the same thing; yet the nearly-right answers might each be wrong in different ways. There's no way to systematically 'transform' one right answer to another -- as there is with formulae in maths.

Want them to work with specific facts? Paste those into the language model as part of your original prompt!

I think most users of Chat things (and certainly their marketing) expect the thing to come already 'knowing' a lot of facts. Specifically, not only what amounts to 'general knowledge' in a pub quiz, but also deep knowledge of (in this thread's context) algorithms or industry common practice. The sort of things humans very quickly forget, but can easily look up in references to jog their memory.

The alarming behaviour that I've seen is that where the thing doesn't have facts (but somehow intuits it 'ought' to know), it quite readily makes stuff up. So for those examples the article gives 'Summarization', 'rewrites', etc: I'd be worried it smuggled in a few non-facts/claims that weren't in the original. So I'd end up reading/checking the whole 'answer'. In which case, I might as well have done it for myself anyway.

So that's how it goes with programming tasks: did the thing follow the algorithm I instructed? Is it aware of exceptions and 'that'll never happen' combinations that from my programming experience I know will happen one day? Did it produce clean, readable code with appropriate naming? Did it produce one big morass of code or break it down into sensible sub-components? etc. By the time I've checked its output, I've written the code anyway.

#17 · May 3, 2023, 2:23 am

Asking it to write code may not be the best way to see what it can do (particularly if it's a tiny niche). Why not try a question on the RA, on which there is a very large amount written?

Like: how does the relational algebra, as originally proposed by Codd but now greatly extended, compare with the capabilities of a modern dialect of SQL?

Andl - A New Database Language - andl.org

#18 · May 3, 2023, 2:24 pm

Quote from Dave Voorhis on May 3, 2023, 2:24 pm

Quote from dandl on May 3, 2023, 2:23 am

Asking it to write code may not be the best way to see what it can do (particularly if it's a tiny niche). Why not try a question on the RA, on which there is a very large amount written?

Like: how does the relational algebra, as originally proposed by Codd but now greatly extended, compare with the capabilities of a modern dialect of SQL?

Let's ask ChatGPT:

Relational algebra and SQL are two different approaches to working with relational databases.

Relational algebra is a formal language for describing and manipulating relational data. It consists of a set of mathematical operations on relations, such as projection, selection, union, intersection, and join. These operations can be combined to form more complex queries that extract, transform, and aggregate data from one or more tables.

SQL, on the other hand, is a programming language that is widely used for managing relational databases. It has evolved over time and has many features that are not found in the original relational algebra, such as support for null values, window functions, and stored procedures.

While SQL is a more powerful and expressive language than the original relational algebra, it is still based on the same relational model and many of its core features are derived from relational algebra. In fact, many SQL queries can be translated directly into relational algebra expressions.

Overall, while SQL has evolved beyond the original relational algebra, it remains a powerful tool for working with relational data and is the standard language for managing databases in the industry.

Let's ask Google Bard:

Relational algebra is a formal mathematical theory of relations that provides a foundation for the relational database model. It is a declarative language, which means that it describes what the result of a query should be, rather than how to compute it. Relational algebra is composed of a set of operators that can be used to construct queries.

SQL is a more practical language for expressing queries against relational databases. It is a procedural language, which means that it describes how to compute the result of a query. SQL provides a number of features that are not available in relational algebra, such as the ability to create and modify tables, the ability to insert, update, and delete rows, and the ability to join tables together.

The following table compares the capabilities of relational algebra and SQL:

Feature | Relational Algebra | SQL
Declarative vs. procedural | Declarative | Procedural
Operators | Projection, selection, union, intersection, difference, product, join | Projection, selection, union, intersection, difference, join, create table, insert, update, delete, views, triggers, stored procedures
Data types | Tuples (rows) and attributes (columns) | Tuples (rows), attributes (columns), primary keys, foreign keys, constraints
Indexes | Not supported | Supported
Transactions | Not supported | Supported
Security | Not supported | Supported

As you can see, SQL is a more powerful language than relational algebra. It provides a wider range of features and is more suitable for expressing complex queries. However, relational algebra is a more elegant language and is better suited for theoretical work.

In practice, most people use SQL to interact with relational databases. SQL is a well-established language with a wide range of support from database vendors. Relational algebra is less commonly used, but it can be useful for understanding the underlying theory of relational databases.

Rubbage.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#19 · May 3, 2023, 11:30 pm

The ChatGPT response is pretty impressive. As an answer to a suitable exam question, you'd get a clear pass, maybe a B-.

Bard is a fail. E-.

Andl - A New Database Language - andl.org

#20 · May 4, 2023, 5:46 am

Quote from dandl on May 3, 2023, 11:30 pm

The ChatGPT response is pretty impressive. As an answer to a suitable exam question, ...

I think you mean "as an answer parroting the misinformation in typical textbooks." So yeah, given most academics [present company excepted, of course] actually don't have a clue about Relational Algebra, the answer parrots back their misunderstandings, so a Pass.

[ChatGPT] In fact, many SQL queries can be translated directly into relational algebra expressions.

Yep, this claim surfaces all the time on StackOverflow. The poor dweebs go away weeping when told this is 'rubbage' [to quote Dave].

[ChatGPT] SQL has evolved beyond the original relational algebra, ...

As if they somehow started from the same place and diverged -- like C++ growing from C. Not even like Pascal "evolved beyond" Algol.

Key points I'd expect to find in an adequate answer are:

SQL includes a text-based command language interfacing to a DBMS.
A sub-part of SQL is an expression language for querying tables, possibly storing the result as a new table within the DBMS.
The capabilities of RA correspond most closely to SQL's expression language.
RA is an algebra formed from regular expressions; as such, an expression can be transformed by purely syntactical manipulation into an equivalent expression producing the same result for same database content. [**]
The expression sub-language within SQL is not an algebra; sub-expressions cannot reliably be transformed by purely syntactical manipulation; therefore it is not in general possible to 'translate' an SQL query into RA, nor v.v..
'Null' so-called 'values' ... sorry, I can't even ...
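As a hedged illustration of the "algebra" point in the list above, the following Python sketch models relations as sets of tuples and checks two familiar rewrites: natural join is commutative, and a restriction that mentions only one operand's attributes can be moved inside the join. The helper names (`relation`, `restrict`, `project`, `join`) and the sample data are assumptions made up for this example; they are not taken from Rel, Andl, or any SQL product.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def restrict(rel, predicate):
    """Selection: keep the tuples for which predicate(tuple_as_dict) is true."""
    return frozenset(t for t in rel if predicate(dict(t)))


def project(rel, attrs):
    """Projection: keep the named attributes; duplicates vanish because the result is a set."""
    return frozenset(frozenset((a, v) for a, v in t if a in attrs) for t in rel)


def join(r, s):
    """Natural join: combine tuples that agree on every shared attribute."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return frozenset(out)


# Hypothetical sample data.
suppliers = relation(
    {"sno": "S1", "city": "London"},
    {"sno": "S2", "city": "Paris"},
)
shipments = relation(
    {"sno": "S1", "pno": "P1"},
    {"sno": "S2", "pno": "P1"},
    {"sno": "S2", "pno": "P2"},
)


def in_paris(t):
    """Predicate that mentions only an attribute of `suppliers`."""
    return t["city"] == "Paris"


# Restricting after the join ...
late = restrict(join(suppliers, shipments), in_paris)
# ... gives the same relation as restricting first and joining afterwards.
early = join(restrict(suppliers, in_paris), shipments)

print(late == early)
print(join(suppliers, shipments) == join(shipments, suppliers))
```

Because every operator takes relations and returns a relation, such rewrites hold for any database content, which is what lets an expression be transformed purely syntactically.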

[**] Good grief! Wikipedia seems to lack an article on what the 'Algebra' means in 'Relational Algebra'. In Mathematica? Stanford Encyclopedia? I can't be bothered to go searching.

The Forum for Discussion about The Third Manifesto and Related Matters