Beyond The Four Fours (Spreadsheet vs. Database)

#1 · May 9, 2019, 2:01 pm

I have taken a slight liberty with the Projects page at the TTM web site. I've added "Beyond The Four Fours", an application of Rel giving the results of an investigation by myself and my friend Gerard Joseph that I mentioned a year or so ago in this forum. The four fours problem entails expressing every natural number from 1 to 100 using only an agreed set of operators and symbols. We tried to repeat the exercise for every combination of four digits. You'll need to have Rel installed if you would like to see how I used Tutorial D to record our results, with lots of database constraints to minimize the risk of mistakes, and lots of views to give various statistical analyses. Suggestions for improvements to my code or further analyses would of course be welcome. Gerard's Excel spreadsheet also summarizes the results and gives some of those statistics (the ones of which he feels Excel is capable, of course). Details of the project are given in Beyond The Four Fours.pdf in the zip folder provided for download.

Gerard is based in Australia and we continue to exchange emails daily as we are currently continuing the investigation to see how much further we can get, beyond 100, for each exercise that we managed to completely solve (407 out of the 495 combinations). The time difference between our two countries is convenient, meaning that each partner sends and receives just one set of emails per day such that over-hasty responses and email ping-pongs are kept to a minimum.

Hugh

Coauthor of The Third Manifesto and related books.

#2 · May 16, 2019, 7:14 am

Thank you Hugh.

BTW on the TTM Projects Page there seems to be a typo in the url for the .pdf. The .zip is fine, and the .pdf is inside it.

I had some questions, which I tried to answer for myself by looking through the `LatestBackup.Rel`. But that raised rather more questions ...

There seemed to be no relation with the formulas/solutions(?) Then the database is rather like 'Actors in Search of a Play'.

I would expect Recalcitrant to be a view = Studied NOT MATCHING Solutions. Also UncleanOnly and NonPowerClean to be views over Solutions. But there's no Solutions and those other relvars are REAL. Yet the categories Recalcitrant/UncleanOnly/NonPowerClean/PowerClean are mutually exclusive; if you're not recording solutions, wouldn't that just be a status attribute against Studied? Presumably with those dedicated REAL RELATIONs you had to key in the <i,j,k,l,n> tuples to each, and manually delete the corresponding tuple from its previous status(?) OK there's constraints declared to make sure the mutually exclusivity holds, but that seems rather a band-aid.

I also saw REAL RELATIONs for Odds, Evens, MultipleOf10, RecsWith1 and various other 'static'/constant characteristics of numbers or of tuples. Is Rel not very good at arithmetic?

Contrast that I built a Sudoku solver in SQL (well, MsAccess). I have a standard base table in my kit-bag which has a single column with the INTs up to some large value. Everything else I generated as views and Cartesian Products over that. Then it was easy to scale up the solver from 4 × 4 grids to 9 × 9 and beyond.

#3 · May 16, 2019, 11:05 am

Quote from Hugh on May 16, 2019, 11:05 am
[Text added later in this colour.]

Thanks, Antc. Typo fixed (why didn't I test that link?)

Gerard and I didn't need to record our solutions while the project was in process and neither of us was up to collecting them together retrospectively when we decided to publish. After all there are 49500 - 430 such solutions, spread over two-and-a-bit years-worth of emails, and we didn't write them all down. Some exercises were so easy that one of us would just send any numbers that had proved to be to difficult for him (or claim a clean sweep) and the other would confirm and perhaps fill in some gaps.

We are writing them all down for the current extension but usually in short form such as

[206:226] = f(1,3,4,5) = 216 ± x (= 6³ [PC] when x = 0)

227 = f(1,3,4,5) = 243 – 16

Those expressions rely on both of us remembering various two-digit solutions we (that is to say, Gerard) have collected: here, how to get 216 from 1 and 3 or 1 and 4, 243 from 1 and 3 (the square root of the one-tenth root of 3 = 3^5). [PC] is for "power-clean".

I don't like status attributes if they can be easily avoided when the cardinality of the possible values is low. I much prefer JOIN over WHERE. Added June 2nd, 2019: On reflection I find that reason (JOIN vs. WHERE) a bit mysterious. A much better reason is that my design involves the same number of rows but saves a column (excuse the vernacular), making data entry a little easier. The predicate expressions are easier to wrote and understand, too. I'm reminded of my aversion to the use of BOOLEAN attributes in base relvars. Also, perhaps I should have asked Antc why the use of a status attribute might be preferred. Something to do with performance, considering the need for my exclusion constraints? I'm not (much) influenced by that kind of consideration given the kinds of databases I build these days. I have thought of a title for a paper I probably won't get around to writing: "A Relational Model of Date for Small, Private Data Banks". End of added text, June 2nd, 2019.

Typing in the recalcitrants and other failures was easy using Rel's edit feature.

Yes, the real relvars RecsWith1, RecsWith2, etc., should really be views and I can't remember why I chose to make them by assigning a query result instead.

Rel is pretty good at arithmetic but I couldn't find its MOD operator (if it exists) and in any case again I prefer JOIN over EXTEND + WHERE where feasible and typing in the tuple values for Odds, Even, and MultipleOf10 was easy using copy-and-paste.

Congratulations on your Sudoku solver in SQL!

Hugh

[Text added later in this colour.]

Thanks, Antc. Typo fixed (why didn't I test that link?)

Gerard and I didn't need to record our solutions while the project was in process and neither of us was up to collecting them together retrospectively when we decided to publish. After all there are 49500 - 430 such solutions, spread over two-and-a-bit years-worth of emails, and we didn't write them all down. Some exercises were so easy that one of us would just send any numbers that had proved to be to difficult for him (or claim a clean sweep) and the other would confirm and perhaps fill in some gaps.

We are writing them all down for the current extension but usually in short form such as

[206:226] = f(1,3,4,5) = 216 ± x (= 6³ [PC] when x = 0)

227 = f(1,3,4,5) = 243 – 16

Those expressions rely on both of us remembering various two-digit solutions we (that is to say, Gerard) have collected: here, how to get 216 from 1 and 3 or 1 and 4, 243 from 1 and 3 (the square root of the one-tenth root of 3 = 3^5). [PC] is for "power-clean".

I don't like status attributes if they can be easily avoided when the cardinality of the possible values is low. I much prefer JOIN over WHERE. Added June 2nd, 2019: On reflection I find that reason (JOIN vs. WHERE) a bit mysterious. A much better reason is that my design involves the same number of rows but saves a column (excuse the vernacular), making data entry a little easier. The predicate expressions are easier to wrote and understand, too. I'm reminded of my aversion to the use of BOOLEAN attributes in base relvars. Also, perhaps I should have asked Antc why the use of a status attribute might be preferred. Something to do with performance, considering the need for my exclusion constraints? I'm not (much) influenced by that kind of consideration given the kinds of databases I build these days. I have thought of a title for a paper I probably won't get around to writing: "A Relational Model of Date for Small, Private Data Banks". End of added text, June 2nd, 2019.

Typing in the recalcitrants and other failures was easy using Rel's edit feature.

Yes, the real relvars RecsWith1, RecsWith2, etc., should really be views and I can't remember why I chose to make them by assigning a query result instead.

Rel is pretty good at arithmetic but I couldn't find its MOD operator (if it exists) and in any case again I prefer JOIN over EXTEND + WHERE where feasible and typing in the tuple values for Odds, Even, and MultipleOf10 was easy using copy-and-paste.

Congratulations on your Sudoku solver in SQL!

Hugh

Coauthor of The Third Manifesto and related books.

#4 · May 16, 2019, 11:36 am

Quote from Hugh on May 16, 2019, 11:05 am

Rel is pretty good at arithmetic but I couldn't find its MOD operator (if it exists)

It does:

6 % 3
0

6 % 2
0

6 % 4
2

25 % 7
4

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#5 · June 2, 2019, 12:17 pm

I hope I'm not out of order in using Edit to add some text to my response to Antc, dated May 16th, 2019. It seemed the easiest and clearest way of expressing myself for the purpose at hand. I don't know (yet) whether forum followers get an alert when an existing post is edited.

Hugh

Coauthor of The Third Manifesto and related books.

#6 · June 3, 2019, 11:07 am

Quote from Hugh on June 2, 2019, 12:17 pm

I don't know (yet) whether forum followers get an alert when an existing post is edited.

Hugh

I've switched off alerts (that's a merciful improvement with the Wordpress site, so I don't know.) The Forum front page does show activity on this thread in the past 24 hours, which is what I noticed. (Was that the editting or Hugh's additional comment? We'll never know.)

On reflection I find that reason (JOIN vs. WHERE) a bit mysterious.

Me too. But I didn't want to say.

perhaps I should have asked Antc why the use of a status attribute might be preferred. Something to do with performance, considering the need for my exclusion constraints?

Not to do with performance -- that's a matter of the utmost unimportance.

I first said not "status" but "... categories Recalcitrant/UncleanOnly/NonPowerClean/PowerClean are mutually exclusive." We'd know which category a Studied is in if the database recorded Solutions. And examining the Solutions would give the status -- i.e. that would be a derived attribute (and I share your distaste for recording statuses direct). I went on

> if you're not recording solutions, wouldn't that [category] just be a status attribute against Studied?

So I'm only contemplating a status attribute as proxy for (a query over) the Solutions wot we do not 'ave.

> Recalcitrant to be a view = Studied NOT MATCHING Solutions.

Would indeed be JOIN-like rather than WHERE-like.

"A Relational Model of Date for Small, Private Data Banks".

Is that a Freudian typo? Presumably your domestic meter readings databases are as much dates as data/Temporal databases ;-)

#7 · June 3, 2019, 11:17 am

Quote from AntC on June 3, 2019, 11:07 am

"A Relational Model of Date for Small, Private Data Banks".

Is that a Freudian typo? Presumably your domestic meter readings databases are as much dates as data/Temporal databases

Yes, a (possibly Freudian) typo. Thanks for your reply, Antc, and sorry for any misunderstanding on my part. Glad we agree on the performance issue, though I admit a solution such as mine would probably be impractical for a very large commercial database using a typical SQL implementation.

Hugh

Coauthor of The Third Manifesto and related books.

#8 · June 12, 2019, 10:55 pm

Quote from Hugh on June 3, 2019, 11:17 am

Quote from AntC on June 3, 2019, 11:07 am

"A Relational Model of Date for Small, Private Data Banks".

Is that a Freudian typo? Presumably your domestic meter readings databases are as much dates as data/Temporal databases

I actually read it as "A Relational Model of [Chris] Date" until I got a bit further along.

#9 · June 13, 2019, 1:39 am

Quote from AntC on May 16, 2019, 7:14 am

Contrast that I built a Sudoku solver in SQL (well, MsAccess). I have a standard base table in my kit-bag which has a single column with the INTs up to some large value. Everything else I generated as views and Cartesian Products over that. Then it was easy to scale up the solver from 4 × 4 grids to 9 × 9 and beyond.

I wrote a Sudoku solver in Andl, triggered by having found one in SQL that ran to nearly 1000 lines of intricate code. Mine is about 20 lines of code. It was just a proof of concept so I didn't try to make it generic, but I don't think that would be too hard.

At the core is WHILE, which roughly corresponds to RECUSRIVE CTE in SQL. I'm not sure if a similar approach could work in Rel, which has no WHILE but has recursion.

Andl - A New Database Language - andl.org

#10 · June 13, 2019, 9:57 am

Quote from dandl on June 13, 2019, 1:39 am

Quote from AntC on May 16, 2019, 7:14 am

Contrast that I built a Sudoku solver in SQL (well, MsAccess). I have a standard base table in my kit-bag which has a single column with the INTs up to some large value. Everything else I generated as views and Cartesian Products over that. Then it was easy to scale up the solver from 4 × 4 grids to 9 × 9 and beyond.

I wrote a Sudoku solver in Andl, triggered by having found one in SQL that ran to nearly 1000 lines of intricate code. Mine is about 20 lines of code. It was just a proof of concept so I didn't try to make it generic, but I don't think that would be too hard.

At the core is WHILE, which roughly corresponds to RECUSRIVE CTE in SQL. I'm not sure if a similar approach could work in Rel, which has no WHILE but has recursion.

RECURSIVE CTE and WHILE seem like convenient mechanisms where the language focuses on evaluating expressions à la SQL. With Tutorial D / Rel being a conventional imperative language with conventional recursive functions, it seems unnecessary. There are various recursive, backtracking Sudoku solvers written in various conventional imperative languages (e.g., see https://www.codesdope.com/blog/article/solving-sudoku-with-backtracking-c-java-and-python/) so there's no reason why Tutorial D / Rel couldn't be used to do the same.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The Forum for Discussion about The Third Manifesto and Related Matters

Beyond The Four Fours (Spreadsheet vs. Database)