The Forum for Discussion about The Third Manifesto and Related Matters


"And the moral is, don't use Excel as a scientific database ..."


Thanks for the link.

These are packages heavily oriented towards making sense of arrays of numbers. They don't readily lend themselves to processing either lines of text or tables of data values. They are very restricted on data types (mainly text, number, date, category).

Knime is Java and lives in Eclipse. It's clunky like Access to use and feels old, but I did get some good charts relatively easily. It presents a classic programming model: drag and drop, set params, run, view, repeat.

Orange is very much in Python land, with an idiosyncratic UI (presumably Python-ish). I found it very easy to get into and both the model and the visuals are nice, but the visual processing looks to make even quite simple things laborious. Orange is more interactive than Knime. You can easily set up a pipeline and watch the results. You can select points on a chart and feed them into further analysis.

In both, the thing you create and work on is the pipeline, not the data. It seems both set out to be visual programming languages for a data pipeline, and I think that's a hard row to hoe. They fall short of what I'm looking for.

Questions:

  • what were your students trying to do, for which Knime was better than Orange?
  • are there any other contenders?
  • why would someone building one of these use Eclipse rather than roll-your-own? For example, why not do Rel in Eclipse?
Andl - A New Database Language - andl.org
Quote from dandl on February 1, 2020, 7:58 am

Thanks for the link.

These are packages heavily oriented towards making sense of arrays of numbers. They don't readily lend themselves to processing either lines of text or tables of data values. They are very restricted on data types (mainly text, number, date, category).

Knime is Java and lives in Eclipse. It's clunky like Access to use and feels old, but I did get some good charts relatively easily. It presents a classic programming model: drag and drop, set params, run, view, repeat.

Orange is very much in Python land, with an idiosyncratic UI (presumably Python-ish). I found it very easy to get into and both the model and the visuals are nice, but the visual processing looks to make even quite simple things laborious. Orange is more interactive than Knime. You can easily set up a pipeline and watch the results. You can select points on a chart and feed them into further analysis.

In both, the thing you create and work on is the pipeline, not the data. It seems both set out to be visual programming languages for a data pipeline, and I think that's a hard row to hoe. They fall short of what I'm looking for.

KNIME is a heavy-duty industrial data analysis tool -- roughly competing with Alteryx and sharing space with R and SAS and used widely in industry. It's relatively new; its predecessor was a proprietary data mining tool used in the pharmaceutical industry from about 2004. First open release was 2006. It's considered industrial-strength.

Orange has been around since 1996. I believe it's more for students and research data processing (which tends not to be high volume or velocity), though I don't think it says so explicitly. It's not really considered industrial-strength, at least not outside of research and teaching.

Both are analytics tools, which generally assume the data already exists in databases or other sources. They're about analysing existing data. Manipulating it or entering new data, not really; it's not what they're made for.

Questions:

  • what were your students trying to do, for which Knime was better than Orange?
  • are there any other contenders?
  • why would someone building one of these use Eclipse rather than roll-your-own? For example, why not do Rel in Eclipse?

I don't know what students were trying to do. I didn't teach the class where they used KNIME, Orange, Tableau, etc. I spoke to the students frequently -- I had certain leadership responsibilities for the overall programme -- but the specific module assignments I didn't pay attention to. They were free to choose their tools in final year and tended to use Python, R, or SAS; occasionally KNIME. It's my partner, Nikki, who used KNIME professionally.

I don't know what other contenders exist in that space. The ones I can think of are Alteryx (commercial), Qlikview (commercial), Tableau (commercial & free, but more about presentation/visualisation), KNIME (open source), Orange (open source), R (open source), Python (open source) and SAS (commercial). If there are others, I can't think of them. Nikki's out at the moment or I'd ask her. I'll ask her later.

Eclipse is actually a cross-platform, plugin-based, generic application framework. It's best known for being deployed to desktop targets with a set of Java development plugins, and that's what most people think of when they hear "Eclipse."

Take away the Java development plugins, put in something-else plugins, and you've got a cross-platform application for something-else.

Up to version 3.013, I did do Rel in Eclipse. Rel was an Eclipse plugin, deployed as the Eclipse framework with the Rel plugin and without the Java development plugins.

As of 3.013, I stopped doing that. It is now no longer a plugin for the Eclipse framework.

I wasn't relying on enough of the Eclipse framework to make it worth it -- the bits I used were easily replicated -- and significant portions of the framework were unused, but the dependency chain made including them mandatory, despite almost doubling the size of the distribution. That would have been ok, but each new Eclipse release required a tedious, error-prone, and time-consuming process of identifying dependencies to support building cross-platform deployments. Sometimes new releases would break the deployment mechanism.

Tired of that, I wrote my own deployment mechanisms, got rid of the Eclipse framework, and gained a smaller, faster Rel deployment.

But that's just Rel. A number of other application developers find the Eclipse framework does enough for them to be worth it. If you're building a large desktop application that can fit in its model and you're starting from scratch and you're happy to live with its cross-platform deployment quirks -- or you only deploy on the same platform you develop on -- it can save you a lot of effort.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on February 1, 2020, 10:59 am

KNIME is a heavy-duty industrial data analysis tool -- roughly competing with Alteryx and sharing space with R and SAS and used widely in industry. It's relatively new; its predecessor was a proprietary data mining tool used in the pharmaceutical industry from about 2004. First open release was 2006. It's considered industrial-strength.

Having explored it further, I would agree. It seems to be able to handle every task I can think of, including all the ones where I would now use Excel. It has lots of data types, lots of processing options and lots of features. It can do JOIN (but I didn't find ANTIJOIN or PROJECT so far) and text processing, not just maths. For most applications I would say yes, it's better than Excel. Perhaps a lot better.
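
For readers less familiar with the relational operators named above, here is a minimal sketch of JOIN, ANTIJOIN and PROJECT in plain Java. It is not KNIME's API or node set: `RelOps`, the `Map`-per-row representation and the attribute names in the usage note are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper class; a row is a map from attribute name to value.
final class RelOps {

    // Natural JOIN: pair up rows that agree on every attribute name they share.
    static List<Map<String, Object>> join(List<Map<String, Object>> left,
                                          List<Map<String, Object>> right) {
        return left.stream()
                .flatMap(l -> right.stream()
                        .filter(r -> agree(l, r))
                        .map(r -> merge(l, r)))
                .distinct()
                .collect(Collectors.toList());
    }

    // ANTIJOIN: rows of left that have no matching row in right.
    static List<Map<String, Object>> antijoin(List<Map<String, Object>> left,
                                              List<Map<String, Object>> right) {
        return left.stream()
                .filter(l -> right.stream().noneMatch(r -> agree(l, r)))
                .collect(Collectors.toList());
    }

    // PROJECT: keep only the named attributes, then drop duplicate rows.
    static List<Map<String, Object>> project(List<Map<String, Object>> rows,
                                             String... names) {
        return rows.stream()
                .map(row -> pick(row, names))
                .distinct()
                .collect(Collectors.toList());
    }

    // True when the two rows hold equal values for every shared attribute.
    private static boolean agree(Map<String, Object> l, Map<String, Object> r) {
        return l.keySet().stream()
                .filter(r::containsKey)
                .allMatch(k -> Objects.equals(l.get(k), r.get(k)));
    }

    // Combine the attributes of two rows into one new row.
    private static Map<String, Object> merge(Map<String, Object> l, Map<String, Object> r) {
        Map<String, Object> row = new LinkedHashMap<>(l);
        row.putAll(r);
        return row;
    }

    // Copy just the requested attributes into a new row.
    private static Map<String, Object> pick(Map<String, Object> row, String[] names) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : names) {
            out.put(name, row.get(name));
        }
        return out;
    }
}
```

With a suppliers table of (sid, name) rows and a shipments table of (sid, part) rows, `join` pairs each supplier with its shipments, `antijoin` returns the suppliers that have no shipment, and `project(suppliers, "name")` keeps just the names.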

It's still clunky. There are individual visual components to do simple things like renaming a column or performing a simple calculation. What would take a few keystrokes in Excel involves a sequence of choosing, configuring, laying out and testing. There are probably shortcuts using scripts, but the product as presented is laborious and tedious to use.

And it's so slow! Samples that illustrate basic plots and graphs on small data sets take 10-30 secs or longer to run, mostly creating the final picture. This is crazy! I can't help feeling this is a great set of tools needing a whole new UI.

Orange has been around since 1996. I believe it's more for students and research data processing (which tends not to be high volume or velocity), though I don't think it says so explicitly. It's not really considered industrial-strength, at least not outside of research and teaching.

Both are analytics tools, which generally assume the data already exists in databases or other sources. They're about analysing existing data. Manipulating it or entering new data, not really; it's not what they're made for.

Questions:

  • what were your students trying to do, for which Knime was better than Orange?
  • are there any other contenders?
  • why would someone building one of these use Eclipse rather than roll-your-own? For example, why not do Rel in Eclipse?

I don't know what students were trying to do. I didn't teach the class where they used KNIME, Orange, Tableau, etc. I spoke to the students frequently -- I had certain leadership responsibilities for the overall programme -- but the specific module assignments I didn't pay attention to. They were free to choose their tools in final year and tended to use Python, R, or SAS; occasionally KNIME. It's my partner, Nikki, who used KNIME professionally.

I don't know what other contenders exist in that space. The ones I can think of are Alteryx (commercial), Qlikview (commercial), Tableau (commercial & free, but more about presentation/visualisation), KNIME (open source), Orange (open source), R (open source), Python (open source) and SAS (commercial). If there are others, I can't think of them. Nikki's out at the moment or I'd ask her. I'll ask her later.

Thanks. I found RapidMiner (once free, now seriously expensive), and several others that really only do numbers. KNIME is the stand-out, followed by Orange.

Eclipse is actually a cross-platform, plugin-based, generic application framework. It's best known for being deployed to desktop targets with a set of Java development plugins, and that's what most people think of when they hear "Eclipse."

Take away the Java development plugins, put in something-else plugins, and you've got a cross-platform application for something-else.

Up to version 3.013, I did do Rel in Eclipse. Rel was an Eclipse plugin, deployed as the Eclipse framework with the Rel plugin and without the Java development plugins.

As of 3.013, I stopped doing that. It is now no longer a plugin for the Eclipse framework.

I wasn't relying on enough of the Eclipse framework to make it worth it -- the bits I used were easily replicated -- and significant portions of the framework were unused, but the dependency chain made including them mandatory, despite almost doubling the size of the distribution. That would have been ok, but each new Eclipse release required a tedious, error-prone, and time-consuming process of identifying dependencies to support building cross-platform deployments. Sometimes new releases would break the deployment mechanism.

Tired of that, I wrote my own deployment mechanisms, got rid of the Eclipse framework, and gained a smaller, faster Rel deployment.

Maybe that's what KNIME needs.

But that's just Rel. A number of other application developers find the Eclipse framework does enough for them to be worth it. If you're building a large desktop application that can fit in its model and you're starting from scratch and you're happy to live with its cross-platform deployment quirks -- or you only deploy on the same platform you develop on -- it can save you a lot of effort.

Can you comment on the GUI framework as you've migrated from Eclipse to Rel to Spoing?

Quote from dandl on January 30, 2020, 4:36 am
Quote from Dave Voorhis on January 29, 2020, 1:00 pm
Quote from dandl on January 29, 2020, 7:55 am
Quote from Dave Voorhis on January 28, 2020, 12:42 pm
Quote from dandl on January 28, 2020, 2:41 am

... In Access you can do these things by writing code. You work on the abstract representation of your data model, not on the data itself.

In both Excel and Access you do these things by writing code. It's the usual spreadsheet expressions and dialog box settings in Excel, VBA and dialog box settings in reports in Access. The effort is notionally the same, because you're doing the same things.

What tends to be a conceptual hurdle for many users -- at least until they've seen it done (though sometimes after, if they reject the notion) -- is that in Access you put the data in tables and the formatting in reports. Excel conflates the two inseparably, which is fine (sort of...) until you need two reports on one set of data.

Or one report on that set of data over there. Then things fall apart.

It's a useful distinction, but that's not all. I don't want to debate what coding is, except to observe that Access is a lot further along the spectrum than Excel. The distinction I'm making is earlier than yours: it's about separating the data itself from the structure imposed on it. Excel allows you to work with raw data without imposing names, types and other attributes on it: data first; then patterns and structure; then manipulation; then presentation.

Only, Excel doesn't make that distinction -- or makes it awkward to implement (multiple "Sheets" being a typical approach, but not the only one; none are good) -- and it only lets you work with "raw" data in the same idiom as some of the worst weakly-typed languages: It infers type (which is fine, in and of itself) but comingles types in a manner that makes JavaScript seem rigorous and consistent. It almost invariably needs column names for all the useful things like pivot tables, filters, etc., but doesn't control where they're put -- they're decorative, unless in the right place; and (this is the worst part) it blurs presentation, manipulation, patterns & structure, and types into an indistinct mess in a completely ad hoc fashion that ultimately means more work -- usually much more work -- than it should be for anything more complex than the most trivial presentation.

Again, this would be fine purely as the endpoint of data processing -- the last step before, say, printing a report or emailing it as a PDF -- but that's "old school" interaction. Modern corporate analytics is about pipelines of processing, with the ability to tee into the pipe at any point for presentation purposes and no "final" endpoint because output is always input to something else.

That's what needs to be facilitated -- powerful processing pipelines with presentations (aka reports) able to be tee'd off at any point -- not yet more Excel-style conflation of processing, presentation, patterns, manipulation, you-name-it in one big ball of mud.

No argument about the need for something better, but I don't see anything remotely as accessible for those early stages of creating and rearranging tables. Access makes it way too hard to get started, and too hard to just move things around. Given how old Excel and Access are, surely it's time for something better? AFAICT 'modern corporate analytics' is quite well served by heavyweight products extracting profit from big data. That's not my interest.

Sorry I've fallen behind this thread, so apologies if somebody's already made this point ...

Microsoft at least at some stage seemed to decide that for grunty (i.e. corporate) "processing pipelines"/presentations/reports, the tool was to be Sharepoint. Your 'raw' data would be in Excel and/or Access and/or some SQL databases; your fancy analysis might be a pivot table, but Sharepoint would do the 'orchestration'/version control/integrity of data feeds/etc. This allows the accountants/data analysts to use their familiar tool (Excel); but the IT department to keep a sanity check on the pipelining.

This perhaps would have avoided the disaster in the Bank that Matt Parker talks about earlier in the video. (But my experience is that Corporate Accountants are far too adept at evading Mordak.)

BTW Matt's favourite tool seems to be Perl; so much for eating your own dogfood.

Quote from AntC on February 2, 2020, 12:49 pm

Microsoft at least at some stage seemed to decide that for grunty (i.e. corporate) "processing pipelines"/presentations/reports, the tool was to be Sharepoint. Your 'raw' data would be in Excel and/or Access and/or some SQL databases; your fancy analysis might be a pivot table, but Sharepoint would do the 'orchestration'/version control/integrity of data feeds/etc. This allows the accountants/data analysts to use their familiar tool (Excel); but the IT department to keep a sanity check on the pipelining.

Yes. I remember when Sharepoint was announced with fanfare at my last workplace, and it certainly got some use -- mainly as a document/documentation repository -- and as a host for the occasional appallingly-written, difficult-to-use, buggy application. The IT director suggested at one point that our students would make great Sharepoint developers and he'd take them on, but that idea petered out quickly. The next time getting students into the IT department came up, there was mention of all manner of things but only glancing mention of Sharepoint, if at all.

I don't see it in either the place that cuts my paycheques or the place I write code for. Sharepoint is apparently available; I just don't see it. Maybe it's not used much, or maybe it's used heavily by departments I have nothing to do with. I don't know.

The closest thing I see in heavy use -- at both places -- is Atlassian Confluence. There are plugins available to visualise the output of data processing pipelines. I don't know what produces the pipelines. It might be Tableau since I sometimes hear it mentioned.

Quote from dandl on February 2, 2020, 4:23 am
Quote from Dave Voorhis on February 1, 2020, 10:59 am

KNIME is a heavy-duty industrial data analysis tool -- roughly competing with Alteryx and sharing space with R and SAS and used widely in industry. It's relatively new; its predecessor was a proprietary data mining tool used in the pharmaceutical industry from about 2004. First open release was 2006. It's considered industrial-strength.

Having explored it further, I would agree. It seems to be able to handle every task I can think of, including all the ones where I would now use Excel. It has lots of data types, lots of processing options and lots of features. It can do JOIN (but I didn't find ANTIJOIN or PROJECT so far) and text processing, not just maths. For most applications I would say yes, it's better than Excel. Perhaps a lot better.

It's still clunky. There are individual visual components to do simple things like renaming a column or performing a simple calculation. What would take a few keystrokes in Excel involves a sequence of choosing, configuring, laying out and testing. There are probably shortcuts using scripts, but the product as presented is laborious and tedious to use.

And it's so slow! Samples that illustrate basic plots and graphs on small data sets take 10-30 secs or longer to run, mostly creating the final picture. This is crazy! I can't help feeling this is a great set of tools needing a whole new UI.

It's "enterprise" software, so unless there's a really compelling use case for having a good UI and good performance, it'll have neither.

I work on stuff where operations per second can be measured in Hertz and have acceptable time bounds specified in milliseconds, but outside of that niche, if something is slow it's a perfect excuse to go get a fresh cup of coffee.

Thanks. I found RapidMiner (once free, now seriously expensive), and several others that really only do numbers. KNIME is the stand-out, followed by Orange.

I'd forgotten about RapidMiner. I've never seen it in use.

Tired of that, I wrote my own deployment mechanisms, got rid of the Eclipse framework, and gained a smaller, faster Rel deployment.

Maybe that's what KNIME needs.

Getting rid of the Eclipse framework would speed up startup, reduce the size of the distribution, and maybe make standard UI interactions like popping up menus and windows faster, but probably wouldn't make a difference anywhere else. It wouldn't have any effect on the slow graphs, which is probably just careless programming like use of linear searches instead of hash lookups, etc.
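
To make the linear-search-versus-hash-lookup point concrete, here is a generic Java sketch. It is purely illustrative: nothing in it is taken from KNIME's source, the cause of the slow charts remains a guess, and `LookupDemo` and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class contrasting two ways of finding a key's position.
final class LookupDemo {

    // Linear search: scans the list on every call, so each lookup is O(n).
    static int linearFind(List<String> keys, String wanted) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equals(wanted)) {
                return i;
            }
        }
        return -1;
    }

    // Build a hash index once; later lookups are O(1) on average.
    static Map<String, Integer> buildIndex(List<String> keys) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            // Keep the first position of a repeated key, matching linearFind.
            index.putIfAbsent(keys.get(i), i);
        }
        return index;
    }

    // Hash lookup against the prebuilt index; -1 when the key is absent.
    static int hashFind(Map<String, Integer> index, String wanted) {
        return index.getOrDefault(wanted, -1);
    }
}
```

Looking up each of n keys in a list of n keys costs on the order of n² comparisons with `linearFind`, but roughly n steps once the index has been built.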

But that's just Rel. A number of other application developers find the Eclipse framework does enough for them to be worth it. If you're building a large desktop application that can fit in its model and you're starting from scratch and you're happy to live with its cross-platform deployment quirks -- or you only deploy on the same platform you develop on -- it can save you a lot of effort.

Can you comment on the GUI framework as you've migrated from Eclipse to Rel to Spoing?

The Eclipse framework works under SWT -- which is Eclipse.org's Java UI toolkit for desktop applications -- and under RAP/RWT, Eclipse.org's Java UI toolkit for Web applications.

However, application code written for SWT (desktop) almost invariably needs some changes to work under RAP/RWT (web), or vice versa. It's not a lot of changes -- SWT and RAP/RWT are intentionally almost the same -- but converting an application can take time and is potentially error-prone. For most development, that's not a problem -- you're either targeting desktop or Web and you know which it will be beforehand, and you don't target both or switch from one to the other.

But I like to make applications that will run on -- within reason -- anything, and I have a philosophical inclination toward abstracting away platforms, so I made Spoing.

Spoing encapsulates and abstracts away the differences between SWT and RAP/RWT so that you can write an application with one source base and target the six deployment targets (Windows Desktop, Windows Web, Linux Desktop, Linux Web, MacOS Desktop, MacOS Web) at once.

I haven't converted Rel to use Spoing yet. It still uses proto-Spoing code that became the desktop targets of Spoing, but I will convert it soon.

In the meantime, I've been working on a data abstraction layer that auto-generates record classes from data sources (like SQL queries for arbitrary DBMSs via JDBC, CSV files, Excel spreadsheet files, etc.) to make it easy to use Java Streams on arbitrary data sources and query results. It also provides data update mechanisms.

It's for use in my evolving datasheet application, which also uses Spoing. Like Spoing, the database abstraction layer will be released as a standalone open source project. I've called it WrapD (pronounced "wrapped" and "rapid".) It's not yet publicly exposed on GitHub.
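
As a rough idea of what "Java Streams over records from a data source" can look like, here is a minimal hand-written sketch. It is not WrapD's API, which isn't public: the `Part` class, the three-column CSV layout and `CsvStreams` are invented for illustration, and the record class is written by hand rather than generated. The parser is deliberately naive and does not handle quoted fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example; none of these names come from WrapD.
final class CsvStreams {

    // Hand-written stand-in for a record class generated from a data source.
    static final class Part {
        final String name;
        final String colour;
        final int weight;

        Part(String name, String colour, int weight) {
            this.name = name;
            this.colour = colour;
            this.weight = weight;
        }
    }

    // Parse "name,colour,weight" lines into Part objects, skipping the header row.
    static List<Part> parse(String csv) {
        return Arrays.stream(csv.split("\n"))
                .skip(1)
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> line.split(","))
                .map(f -> new Part(f[0].trim(), f[1].trim(), Integer.parseInt(f[2].trim())))
                .collect(Collectors.toList());
    }

    // An ordinary Streams query over the typed records.
    static int totalWeight(List<Part> parts, String colour) {
        return parts.stream()
                .filter(p -> p.colour.equals(colour))
                .mapToInt(p -> p.weight)
                .sum();
    }
}
```

Once the rows are typed objects, filtering and aggregation are plain Streams code, with field names checked by the compiler rather than looked up as strings at run time.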

Quote from Dave Voorhis on February 2, 2020, 5:13 pm

It's "enterprise" software, so unless there's a really compelling use case for having a good UI and good performance, it'll have neither.

Comparing Orange to KNIME the differences are striking. But getting cuddly with Python is not on my todo list. It may be time to go brush up on my Eclipse and Java.

I work on stuff where operations per second can be measured in Hertz and have acceptable time bounds specified in milliseconds, but outside of that niche, if something is slow it's a perfect excuse to go get a fresh cup of coffee.

Been there, done that, but not my niche. Too much crap to deal with.

Getting rid of the Eclipse framework would speed up startup, reduce the size of the distribution, and maybe make standard UI interactions like popping up menus and windows faster, but probably wouldn't make a difference anywhere else. It wouldn't have any effect on the slow graphs, which is probably just careless programming like use of linear searches instead of hash lookups, etc.

I think there's an issue with the execution model. There are real delays right along the pipeline, which just don't make sense on such small samples.

The Eclipse framework works under SWT -- which is Eclipse.org's Java UI toolkit for desktop applications -- and under RAP/RWT, Eclipse.org's Java UI toolkit for Web applications.

I found Wikipedia and this article informative: http://alblue.bandlem.com/2005/02/eclipse-difference-between-swt-and-awt.html. OK, got it. I guess that explains some of the really awkward Java UIs I've used, which just don't feel right on Windows. That would be Swing talking. Eclipse is far better: the menu shortcuts work like they should, whereas in many products they don't (in Orange they didn't either).

[BTW I like XAML better, but you get what you get.]

So throwing away Eclipse and keeping SWT is no big deal?

However, application code written for SWT (desktop) almost invariably needs some changes to work under RAP/RWT (web), or vice versa. It's not a lot of changes -- SWT and RAP/RWT are intentionally almost the same -- but converting an application can take time and is potentially error-prone. For most development, that's not a problem -- you're either targeting desktop or Web and you know which it will be beforehand, and you don't target both or switch from one to the other.

But I like to make applications that will run on -- within reason -- anything, and I have a philosophical inclination toward abstracting away platforms, so I made Spoing.

Spoing encapsulates and abstracts away the differences between SWT and RAP/RWT so that you can write an application with one source base and target the six deployment targets (Windows Desktop, Windows Web, Linux Desktop, Linux Web, MacOS Desktop, MacOS Web) at once.

I haven't converted Rel to use Spoing yet. It still uses proto-Spoing code that became the desktop targets of Spoing, but I will convert it soon.

In the meantime, I've been working on a data abstraction layer that auto-generates record classes from data sources (like SQL queries for arbitrary DBMSs via JDBC, CSV files, Excel spreadsheet files, etc.) to make it easy to use Java Streams on arbitrary data sources and query results. It also provides data update mechanisms.

It's for use in my evolving datasheet application, which also uses Spoing. Like Spoing, the database abstraction layer will be released as a standalone open source project. I've called it WrapD (pronounced "wrapped" and "rapid".) It's not yet publicly exposed on GitHub.

Thanks, that all makes sense. Not a fan of code generation, but you do what you have to and the goal is worthy.

Andl - A New Database Language - andl.org
Quote from dandl on February 3, 2020, 2:54 am
Quote from Dave Voorhis on February 2, 2020, 5:13 pm

It's "enterprise" software, so unless there's a really compelling use case for having a good UI and good performance, it'll have neither.

Comparing Orange to KNIME the differences are striking. But getting cuddly with Python is not on my todo list. It may be time to go brush up on my Eclipse and Java.

I wonder if there's something wrong or something set -- like a debug mode or something -- or a "warm-up" effect. Warm-up effects are common in large Java applications, as Java normally dynamically loads classes from files. The first time the classes are loaded, this can cause dramatic slowdowns compared to normal operation.

KNIME is used for terabyte-sized data processing, so performance problems with dozens of records would be a bit surprising unless it's the first process of the day. It should be fast thereafter.
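
One rough way to check for a warm-up effect is to time the same operation several times inside one JVM and compare the first call with the later ones. The sketch below is only an illustration: the class name WarmUpDemo and the regex workload are assumptions chosen for the example and have nothing to do with KNIME's internals.

```java
final class WarmUpDemo {
    // Runs the task 'calls' times and records how long each run took, in nanoseconds.
    static long[] timeCalls(Runnable task, int calls) {
        long[] nanos = new long[calls];
        for (int i = 0; i < calls; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // The first call may have to load and initialise the java.util.regex classes;
        // later calls reuse classes that are already loaded.
        Runnable task = () -> java.util.regex.Pattern.compile("[a-z]+\\d*").matcher("abc123").matches();
        long[] nanos = timeCalls(task, 5);
        for (int i = 0; i < nanos.length; i++) {
            System.out.printf("call %d: %d ns%n", i + 1, nanos[i]);
        }
    }
}
```

If the first timing is far larger than the rest, warm-up is a plausible explanation; if every call is slow, the cause is probably elsewhere.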

I work on stuff where operations per second can be measured in Hertz and have acceptable time bounds specified in milliseconds, but outside of that niche, if something is slow it's a perfect excuse to go get a fresh cup of coffee.

Been there, done that, but not my niche. Too much crap to deal with.

Getting rid of the Eclipse framework would speed up startup, reduce the size of the distribution, and maybe make standard UI interactions like popping up menus and windows faster, but probably wouldn't make a difference anywhere else. It wouldn't have any effect on the slow graphs, which is probably just careless programming like use of linear searches instead of hash lookups, etc.

I think there's an issue with the execution model. There are real delays right along the pipeline, which just don't make sense on such small samples.

No it doesn't (unless it's warm-up.)

But that's just Rel. A number of other application developers find the Eclipse framework does enough for them to be worth it. If you're building a large desktop application that can fit in its model and you're starting from scratch and you're happy to live with its cross-platform deployment quirks -- or you only deploy on the same platform you develop on -- it can save you a lot of effort.

Can you comment on the GUI framework as you've migrated from Eclipse to Rel to Spoing?

The Eclipse framework works under SWT -- which is Eclipse.org's Java UI toolkit for desktop applications -- and under RAP/RWT, Eclipse.org's Java UI toolkit for Web applications.

I found Wikipedia and this article informative: http://alblue.bandlem.com/2005/02/eclipse-difference-between-swt-and-awt.html. OK, got it. I guess that explains some of the really awkward Java UIs I've used, which just don't feel right on Windows. That would be Swing talking. Eclipse is far better: the menu shortcuts work like they should, whereas in many products they don't (they didn't in Orange either).

To achieve utmost platform portability, most Java UI toolkits work on an OS canvas and do all the rendering on the Java side. SWT (and maybe a few others?) uses native widgets to provide a native look-and-feel, at the expense of some potential portability issues.

Technical users seem to bifurcate into mouse-ists who click their way through GUIs (no keyboard shortcuts used) and commandline-ists who do everything through a commandline (no keyboard shortcuts needed.) GUI shortcut-ists are rare enough that most developers don't consider them, and many GUI toolkits require that the developer explicitly wire them up. Many don't even think of it. I don't, unless reminded. I alternate between being a mouse-ist and being a commandline-ist. I work with a shortcut-ist who sometimes watches what I'm doing and helpfully points out that, "You can get that with Ctrl-Shift R... You can do that with Alt-Ctrl-Shift-P... Next time, hold down Ctrl-RightShift-P and Tab then Ctrl-X Esc followed by LeftShift-Alt-Ctrl-Q then L and that'll pop right up", etc. I can't remember those things.

[BTW I like XAML better, but you get what you get.]

I suppose XML or XML-derived UI layout languages like XAML, FXML, etc. are fine for relatively non-technical designers, but I find them irritating -- because £$%@!! XML -- and because you can only achieve further abstraction by generating it from something else. If I want to generate ten similar controls, I want to create a method to generate a control instance and invoke it in a loop. How do you do that with XAML?

I know there are things like ItemsControl that notionally provide some equivalent, but as a programmer, if I'm creating UIs for language x I prefer to have the full power of language x to simplify and automate the process.
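
The "method plus loop" idea can be sketched in a few lines. This uses Swing rather than SWT purely because Swing ships with the JDK and so keeps the example self-contained; the same shape applies to SWT widgets. The names ControlLoopDemo, makeButton and makePanel are made up for the illustration.

```java
import javax.swing.JButton;
import javax.swing.JPanel;

final class ControlLoopDemo {
    // Factory method: builds one configured control instance.
    static JButton makeButton(int index) {
        JButton button = new JButton("Option " + index);
        button.setName("option-" + index);
        button.setActionCommand("choose-" + index);
        return button;
    }

    // Invokes the factory in a loop to populate a container with similar controls.
    static JPanel makePanel(int count) {
        JPanel panel = new JPanel();
        for (int i = 1; i <= count; i++) {
            panel.add(makeButton(i));
        }
        return panel;
    }

    public static void main(String[] args) {
        JPanel panel = makePanel(10);
        System.out.println(panel.getComponentCount() + " buttons created");
    }
}
```

Changing how every control looks or behaves then means editing one method, which is the kind of abstraction that is awkward to express directly in a markup-based layout.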

So throwing away Eclipse and keeping SWT is no big deal?

It depends on the use case. If you're using SWT widgets but not the Eclipse framework and you're happy to write scripts to pick the appropriate platform-specific SWT libraries at launch-time, then it's no big deal. If you're heavily invested in the Eclipse framework and want all the platform-specific launch code done for you, then it's probably a big deal.
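
As a rough sketch of the launch-time selection, a launcher could map the JVM's os.name and os.arch properties to the name of the platform-specific SWT fragment to put on the classpath. The fragment names below follow the naming of the SWT bundles published by Eclipse as far as I know them; treat them, and the class name SwtPlatform, as assumptions to verify against the SWT version actually in use.

```java
import java.util.Locale;

final class SwtPlatform {
    // Maps os.name / os.arch values to an SWT fragment name (naming assumed, see above).
    static String fragmentFor(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);

        String windowSystem;
        if (os.contains("mac")) {
            windowSystem = "cocoa.macosx";
        } else if (os.contains("win")) {
            windowSystem = "win32.win32";
        } else if (os.contains("linux")) {
            windowSystem = "gtk.linux";
        } else {
            throw new IllegalArgumentException("Unsupported OS: " + osName);
        }

        String cpu;
        if (arch.equals("amd64") || arch.equals("x86_64")) {
            cpu = "x86_64";
        } else if (arch.equals("aarch64") || arch.equals("arm64")) {
            cpu = "aarch64";
        } else {
            throw new IllegalArgumentException("Unsupported architecture: " + osArch);
        }
        return "org.eclipse.swt." + windowSystem + "." + cpu;
    }

    public static void main(String[] args) {
        // Prints the fragment a launcher would need for the machine it is running on.
        System.out.println(fragmentFor(System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

A launch script would then add the matching jar to the classpath before starting the application, which is the work the Eclipse framework otherwise does for you.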

However, application code written for SWT (desktop) almost invariably needs some changes to work under RAP/RWT (web), or vice versa. It's not a lot of changes -- SWT and RAP/RWT are intentionally almost the same -- but converting an application can take time and is potentially error-prone. For most development, that's not a problem -- you're either targeting desktop or Web and you know which it will be beforehand, and you don't target both or switch from one to the other.

But I like to make applications that will run on -- within reason -- anything, and I have a philosophical inclination toward abstracting away platforms, so I made Spoing.

Spoing encapsulates and abstracts away the differences between SWT and RAP/RWT so that you can write an application with one source base and target the six deployment targets (Windows Desktop, Windows Web, Linux Desktop, Linux Web, MacOS Desktop, MacOS Web) at once.

I haven't converted Rel to use Spoing yet. It still uses proto-Spoing code that became the desktop targets of Spoing, but I will convert it soon.

In the meantime, I've been working on a data abstraction layer that auto-generates record classes from data sources (like SQL queries for arbitrary DBMSs via JDBC, CSV files, Excel spreadsheet files, etc.) to make it easy to use Java Streams on arbitrary data sources and query results. It also provides data update mechanisms.

It's for use in my evolving datasheet application, which also uses Spoing. Like Spoing, the database abstraction layer will be released as a standalone open source project. I've called it WrapD (pronounced "wrapped" and "rapid".) It's not yet publicly exposed on GitHub.

Thanks, that all makes sense. Not a fan of code generation, but you do what you have to and the goal is worthy.

I generally avoid code generation but after considering all the trade-offs, it presented the most tolerable set of compromises. It's nice to be able to write nicely integrated database-driven application code like this, with bare SQL queries and no ORM pseudo-language, complexity, or suboptimal SQL generation:

database.query("SELECT * FROM $$tester WHERE x > ? AND x < ?", TestSelect.class, 3, 7)
  .forEach(tuple -> System.out.println("[TEST] " + tuple.x + ", " + tuple.y));

Or this:

database.queryForUpdate("SELECT * FROM $$tester WHERE x >= ?", TestSelect.class, 10)
  .forEach(tuple -> {
    if (tuple.x >= 12 && tuple.x <= 13) {
      tuple.x *= 100;
      tuple.y *= 100;
      try {
        tuple.update(database, "$$tester");
      } catch (SQLException e) {
        System.out.println("Update failed.");
      }
    }
  });

The final form will hopefully be simpler.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on February 3, 2020, 10:19 am
Quote from dandl on February 3, 2020, 2:54 am
Quote from Dave Voorhis on February 2, 2020, 5:13 pm

It's "enterprise" software, so unless there's a really compelling use case for having a good UI and good performance, it'll have neither.

Comparing Orange to KNIME the differences are striking. But getting cuddly with Python is not on my todo list. It may be time to go brush up on my Eclipse and Java.

I wonder if there's something wrong or something set -- like a debug mode or something -- or a "warm-up" effect. Warm-up effects are common in large Java applications, as Java normally dynamically loads classes from files. The first time the classes are loaded, this can cause dramatic slowdowns compared to normal operation.

KNIME is used for terabyte-sized data processing, so performance problems with dozens of records would be a bit surprising unless it's the first process of the day. It should be fast thereafter.

Could well be. There is a strong data-processing model rather than interactive model, so maybe 1 row or a million might incur the same per-node delay rather than per-row delay. I'd like to take a peek, but getting the code into Eclipse is giving me trouble.

To achieve utmost platform portability, most Java UI toolkits work on an OS canvas and do all the rendering on the Java side. SWT (and maybe a few others?) uses native widgets to provide a native look-and-feel, at the expense of some potential portability issues.

Apparently AWT started with the native model but with a common subset approach, which is very restrictive. Swing went with Java rendering (+AWT) so many newer controls don't work like native. Early Swing was slow, so SWT went native superset largely for performance reasons, but the result is (should be) better native behaviour. Swing is popular, but SWT should now be the better choice.

Technical users seem to bifurcate into mouse-ists who click their way through GUIs (no keyboard shortcuts used) and commandline-ists who do everything through a commandline (no keyboard shortcuts needed.) GUI shortcut-ists are rare enough that most developers don't consider them, and many GUI toolkits require that the developer explicitly wire them up. Many don't even think of it. I don't, unless reminded. I alternate between being a mouse-ist and being a commandline-ist. I work with a shortcut-ist who sometimes watches what I'm doing and helpfully points out that, "You can get that with Ctrl-Shift R... You can do that with Alt-Ctrl-Shift-P... Next time, hold down Ctrl-RightShift-P and Tab then Ctrl-X Esc followed by LeftShift-Alt-Ctrl-Q then L and that'll pop right up", etc. I can't remember those things.

Not me. I avoid the command line at all costs, and rely on GUI+mouse for discovery and then switch to GUI+keyboard shortcuts for speed. I always prefer keyboard for software I use often. Example: I use git routinely, via 4 or 5 different GUIs, never the command line (which is an absolute stinker). Case in point: this editor has no keyboard shortcuts for full screen, format as code, or Post. If it did, I would use them, always.

[BTW I like XAML better, but you get what you get.]

I suppose XML or XML-derived UI layout languages like XAML, FXML, etc. are fine for relatively non-technical designers, but I find them irritating -- because £$%@!! XML -- and because you can only achieve further abstraction by generating it from something else. If I want to generate ten similar controls, I want to create a method to generate a control instance and invoke it in a loop. How do you do that with XAML?

I know there are things like ItemsControl that notionally provide some equivalent, but as a programmer, if I'm creating UIs for language x I prefer to have the full power of language x to simplify and automate the process.

Mostly XAML works really well, and shows you your document structure nicely, just like HTML. It's equally easy to do the same things in code, or to mix them. I would go for JSX, which I find works really well.

I generally avoid code generation but after considering all the trade-offs, it presented the most tolerable set of compromises. It's nice to be able to write nicely integrated database-driven application code like this, with bare SQL queries and no ORM pseudo-language, complexity, or suboptimal SQL generation:

database.query("SELECT * FROM $$tester WHERE x > ? AND x < ?", TestSelect.class, 3, 7)
  .forEach(tuple -> System.out.println("[TEST] " + tuple.x + ", " + tuple.y));

Or this:

database.queryForUpdate("SELECT * FROM $$tester WHERE x >= ?", TestSelect.class, 10)
  .forEach(tuple -> {
    if (tuple.x >= 12 && tuple.x <= 13) {
      tuple.x *= 100;
      tuple.y *= 100;
      try {
        tuple.update(database, "$$tester");
      } catch (SQLException e) {
        System.out.println("Update failed.");
      }
    }
  });

The final form will hopefully be simpler.

That looks a lot like what I write in LINQ. Can you do it on things that are not already SQL? How do the type conversions work? NULLs?

 

Andl - A New Database Language - andl.org
Quote from dandl on February 3, 2020, 11:24 pm

I generally avoid code generation but after considering all the trade-offs, it presented the most tolerable set of compromises. It's nice to be able to write nicely integrated database-driven application code like this, with bare SQL queries and no ORM pseudo-language, complexity, or suboptimal SQL generation:

database.query("SELECT * FROM $$tester WHERE x > ? AND x < ?", TestSelect.class, 3, 7)
  .forEach(tuple -> System.out.println("[TEST] " + tuple.x + ", " + tuple.y));

Or this:

database.queryForUpdate("SELECT * FROM $$tester WHERE x >= ?", TestSelect.class, 10)
  .forEach(tuple -> {
    if (tuple.x >= 12 && tuple.x <= 13) {
      tuple.x *= 100;
      tuple.y *= 100;
      try {
        tuple.update(database, "$$tester");
      } catch (SQLException e) {
        System.out.println("Update failed.");
      }
    }
  });

The final form will hopefully be simpler.

That looks a lot like what I write in LINQ. Can you do it on things that are not already SQL? How do the type conversions work? NULLs?

It's a thin -- very thin -- wrapper around JDBC. 'Record' class attribute types, type conversions, and NULLs are determined by JDBC. I intend to fully test and support the usual MOPSS [1] set of DBMSs; not sure if I'll bother with others.

There will be a similar API for non-SQL things. The core code already exists in an earlier iteration of the idea, but I'm not happy with it; it needs a re-write.
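
To make the idea of generating record classes from JDBC metadata concrete, here is a minimal sketch. It is not WrapD's code, which is not public; the class RecordClassSketch and its methods are hypothetical. It maps a few java.sql.Types constants to boxed Java types (so that SQL NULL can be represented as null) and emits the source of a class with one public field per column.

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

final class RecordClassSketch {
    // Maps a JDBC type code to a Java field type; unknown codes fall back to Object.
    static String javaTypeFor(int jdbcType) {
        switch (jdbcType) {
            case Types.INTEGER: return "Integer";
            case Types.BIGINT:  return "Long";
            case Types.DOUBLE:  return "Double";
            case Types.BOOLEAN: return "Boolean";
            case Types.VARCHAR: return "String";
            default:            return "Object";
        }
    }

    // Emits Java source for a class with one public field per column, in column order.
    static String generate(String className, Map<String, Integer> columns) {
        StringBuilder src = new StringBuilder("public class " + className + " {\n");
        columns.forEach((name, type) ->
            src.append("    public ").append(javaTypeFor(type)).append(' ').append(name).append(";\n"));
        return src.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Column names and types are hard-coded here; a real generator would read
        // them from ResultSetMetaData (getColumnName and getColumnType).
        Map<String, Integer> columns = new LinkedHashMap<>();
        columns.put("x", Types.INTEGER);
        columns.put("y", Types.INTEGER);
        System.out.print(generate("TestSelect", columns));
    }
}
```

The generated source would still need compiling and loading before a query could return instances of it, which is where most of the real trade-offs of code generation show up.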

--
[1] MySQL, Oracle Database, PostgreSQL, Microsoft SQL Server.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org