| By Boris J. Lipsman | Article Rating: |
|
| April 1, 2000 12:00 AM EST | Reads: |
10,542 |
In developing various report systems, I found it necessary to recall the ad hoc query tool (based on a concept by Sean Rhody discussed in SYS-CON Publications' Secrets of the PowerBuilder Masters 5.0). For those of you who haven't read about it I'll take a minute to describe it briefly.
Cartesian product n (1958) : a set that is constructed from two given sets and comprises all pairs of elements such that the first element of the pair is from the first set and the second is from the second set (Merriam Webster's Collegiate Dictionary, Tenth Edition)
Overview of the Existing System
The ad hoc query tool can be described as a tab control that has five tab folders. These folders allow a user to create a report sequentially (including its definition and serialization for certain RDBMSs).
- Folder 1: Allows users to select columns they would like to include in the report definition
- Folder 2: Provides a way to construct an expression for the report
- Folder 3: The WHERE clause's tab folder
- Folder 4: Enables a user to sort columns that have already been selected for the report
- Folder 5: Presents the current query in a rich text edit control
Everything worked until I asked myself what would happen if my user wanted to get data from more than one table. Submitting this report to the database wouldn't do any good. Simply put, we'd get a Cartesian product.
Cartesian Product Dilemma
What is the Cartesian product all about? To make things easy, I'll give you a pretty straightforward example based on the MS SQL Server and Sybase's SQL Anywhere RDBMSs. Let's say you have two tables - employee and department tables from the Powersoft demo database, which are related through the Primary-Foreign keys - and you want to retrieve the data from those tables. Understanding how joins are processed would help you understand them and figure out why, when you incorrectly state a join, you get unexpected results.
Internally, MS SQL Server and other RDBMSs have a much more sophisticated procedure regarding joins than you might think. The initial stage of the join is the Cartesian product (it's done automatically by the RDBMS). The number of rows in a Cartesian product of two tables is equal to the number of rows in the first table times the number of rows in the second table. The Cartesian product of the employee and department tables is 375 (75 * 5).
Realizing all that, I decided to enhance the existing tool by adding a wizard that would automate the process of joining the table participants and free users from any worries regarding join operations.
Application Overview
I'll concentrate here on the WHERE clause tab folder. I actually considered two different general scenarios. In the first one the user wants to create an ad hoc report that would include, for simplicity's sake, four tables. These tables - Employee, Department, Customer and Sales_Order - don't have a Primary-Foreign key relationship. Based on the conception of the Cartesian product, MS SQL Server will create the relationship once we include the tables in our query. If you leave the WHERE clause off the join, the Cartesian product would contain 30,618,000 rows. Well, this result really makes no sense because it's just half the job. The correct result, however (achieved by including a WHERE clause stipulating which columns were to be matched and the basis on which to match them), would be just 47 rows. When a user has dragged a couple of columns from tables presented in the tree view in the "Columns" tab folder and then wants to provide some WHERE conditions in selecting the WHERE tab folder, creation of a Cartesian product is automatically prevented. If the user wants to run the report with no limitations, checking will be done as well.
The following section introduces a solution based on employing MS SQL Server and Sybase SQL Anywhere sp_fkeys, that is, stored procedures.
Implementation
I guess it's time to describe exactly what I did regarding the ad hoc query tool to make it more robust and effective. After selecting the third tab page, I had to implement processing that would be posted from the selectionchanged event of that page. To help you understand this better, I'll construct my explanation using the ue_no_cartesian_product() method as the shell. This method, which is called in the ue_onselectionchanging event, is given in Listing 1 (all listings can be found on the PBDJ Web site). The first thing we do here is to instantiate the DataStore, n_cst_dstore; we then move one step forward by calling the next method, of_get_fkeys().
Now that we have n_cst_dstore, a DataStore associated with a DataWindow object (instance of this class) that has a Primary Key holder table as the argument, we instantiate one more DataStore class, referenced by the lds_X variable.
These two DataStores have been used to cover various scenarios. The first one is the process to eliminate a Cartesian product, assuming that we're dealing with the Primary Key table holder; the second is if the table we're dealing with is the Foreign Key table holder. We now have two DataStores that cover these two possibilities.
Both DataStores' DataWindow objects are based on sp_fkeys (MS SQL Server's stored procedure), but as you can see in Figure 1, the first DataStore's DataWindow object uses the Primary Key table's name as the argument while the second one uses the Foreign Key table's name.
Assuming that we're dealing with a Primary Key table, we'll retrieve data using the first DataStore. If we have tables associated with the Primary table (the currently processed table) through the Foreign Key relationship, the retrieve should produce a value greater than zero. From here the approach I've taken is pretty straightforward: I simply populate the structure step by step with the following values:
- Primary Key table name
- Primary Key column(s) name(s)
- Foreign Key table name
- Foreign key column(s) name(s)
Once filtering is done, we have the number of joins that have been generated based on the assumption that the current table is used as the Primary Key holder.
Let's take one more step to populate the structure that we'll use as a source to build join statements for table participants. It's important to make sure that we don't insert a join statement that's already been inserted in the structure by the previous processing. The implementation code for the of_get_fkeys() method would look like Listing 2.
So far, we've gone through all the steps necessary to build a string, which will be used eventually as a source to insert join statements for the Primary Key table (see Figure 2).
Since there's no way to know whether we've dealt with the Primary Key owner table or the Foreign Key owner table, we have to cover the other scenario, which is the processing based on the Foreign Key table. What we're doing is retrieving the data from lds_X DataStore (remember, this DataStore is associated with the DataWindow object that has as an argument the Foreign Key table's name). By calling the of_get_fkeys() method, we're verifying that the current table is the Foreign Key holder.
If this table is neither the Primary nor the Foreign Key holder, we'll send a message to the user saying that it has no relationship with any of the tables included in the report definition. After that, the user is supposed to remove this table's column(s) from the report. The remainder of the processing is otherwise basically the same. The implementation code for the of_get_pkeys() method would look like Listing 3.
After all this is done, and all tables have been processed, the only method we need to mention here is the ue_no_cartesian_product, shown in Listing 4. In this event we trigger the ue_insert_join_stmt() event that actually presents the code to insert the necessary join statements in the WHERE clause. This is given in Listing 5. Figure 3 displays the result of clicking the tab folder, "WHERE clause."
Conclusion
The surest sign that you comprehend the solution I've decribed here will be that you try and take my approach further. My solution describes the event when the columns being joined are compared for equality, that is, an equijoin. An educated user, however, armed with an awareness of the various bases on which columns can be matched, can be challenged by generating more sophisticated reports.
Published April 1, 2000 Reads 10,542
Copyright © 2000 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Boris J. Lipsman
Boris J. Lipsman is a senior software engineer for Arris Systems in Berwyn, Pennsylvania. A certified PowerBuilder developer associate, Boris has worked with PowerBuilder since version 3.0. He has an MA in building construction from Kalinin Construction University in Russia. His new interests include the latest Web technologies using Perl, CGI scripting and Java for Linux.
- Why SOA Needs Cloud Computing - Part 1
- The Cloud Transition: What Does It Mean For You?
- Seeding The Cloud: The Future of Data Management
- Cloud Expo and the End of Tech Recession
- Economy Drives Adoption of Virtual Lab Technology
- Sybase Named “Silver Sponsor” of iPhone Developer Summit
- The Cloud Has Cross-Border Ambitions
- Ulitzer Named "New Media" Partner of Greatly Anticipated iStrategy Event in Berlin
- Risks and Enterprise Mobility?
- Steps for Success in Enterprise Mobility?
- Are Mobile Luddites Resisting Mobility?
- Spam
- The Difference Between Web Hosting and Cloud Computing
- Sybase CTO to Speak at 4th International Cloud Computing Expo
- Why SOA Needs Cloud Computing - Part 1
- Five Reasons to Choose a Private Cloud
- The Cloud Transition: What Does It Mean For You?
- The Threat Behind the Firewall
- Seeding The Cloud: The Future of Data Management
- Cloud Expo and the End of Tech Recession
- Tips for Efficient PaaS Application Design
- Economy Drives Adoption of Virtual Lab Technology
- Using the Microsoft Chart Controls in PowerBuilder
- Sybase Named “Silver Sponsor” of iPhone Developer Summit
- Where Are RIA Technologies Headed in 2008?
- PowerBuilder History - How Did It Evolve?
- The Top 250 Players in the Cloud Computing Ecosystem
- Custom Common Dialogs Using SetWindowsHookEx
- DDDW Tips and Tricks
- OLE - Extending the Capabilities of PowerBuilder
- DataWindow.NET How To: Data Entry Form
- Book Excerpt: Sybase Adaptive Server Anywhere
- Sybase ASE 12.5 Performance and Tuning
- Working with SOA & Web Services in PowerBuilder
- Office 2003 Toolbar: A New Look For Your Old PowerBuilder App
- Dynamically Creating DataWindow Objects






























