|By Mauro Oreta||
|July 21, 2003 11:00 AM EDT||
Friday, 5:00 p.m. - my mind's already on the weekend's fishing when the finance manager barges in. "I'm seeing the auditors Monday. I need last year's budget report and this year's forecast by 8:30 a.m. sharp!" Sounds easy. But last time I ran a budget report for one quarter it took over two hours. I could start them now and hope they're ready to print when I come in Monday. Hang on, the database is down at midnight for backup and maintenance. The fishing will have to wait. @#$%!
Something has to be done with our reporting system, so I started a wish list for a better system:
What's stopping us? This article explains why - and shows you how - to build a better reporting system using PowerBuilder.
Limits Imposed by the Architecture
Improvements on a system can be hindered by the limitations of the underlying architecture.
Figure 1 shows a typical two-tier reporting architecture. A PowerBuilder client application wants to run a report. Normally, the application has a menu so the user can select the report and an input screen in which to enter the report parameters. For example, for a sales report from 1 Jan 02 to 31 Jan 02 the user enters the start and end dates and then clicks the OK button to submit the report to the database.
The client application will normally have a preview window that contains a DataWindow that can load the data object for the selected report. A dw_1.Retrieve('1 Jan 02', '31 Jan 02') is executed. The DataWindow waits for the results from the database, and eventually the user should see the report on screen.
No problem with this approach; it's simple and easy to implement, which is why it's the most common technique used. The basic problem with this is that the client can't do anything while waiting for the result from the database. As soon as the user clicks the OK button to submit the report, the pointer changes to an hourglass and the user has to wait until the data retrieval is finished. Not a problem for small reports, but it's unacceptable if you have to wait for two hours or overnight for last quarter's budget report. And remember, you can't switch off your PC while the report is running. What if you need to submit 20 more reports after the budget report? It's gonna be a long night. Get some food, unpack the sleeping bag, and wait, wait, wait.
A Better Reporting Architecture
First of all, what do we want to achieve? The foremost thing we want is for the application to allow us to submit any report (long-running or short-running ones) without forcing us to wait for it to finish. We want to be able to do something else after we've clicked the Submit or OK button, for instance, shut down the PC and go fishing.
Imagine we have a better reporting system. Let's go back to the above real-life scenario and see how it works using a better reporting architecture. The user clicks the OK button to submit the report. The pointer doesn't need to change to an hourglass, because the application doesn't have to wait. This allows the user to do other stuff within the application, even running other reports.
At any time the user can open a window that displays a list of all the reports he or she has submitted. The list will also show the status of each report, which can be "Submitted," "Running," or "Completed" (see Figure 2).
Initially, when the report is submitted, the status is set to "Submitted." When the report is actually running in the database server, the status will show as "Running." Finally, when the report is finished and ready to be viewed/printed, the status becomes "Completed." All this happens somewhere else so the user is not tied to any screen and is free to do other things. The user can submit multiple reports one after the other without waiting between submissions. After submitting all the reports, the user can come back to the above "Manage Reports" window and eventually see all the submitted reports completed and ready to be viewed or printed.
A better reporting system can be achieved by using a three-tier architecture.
Advantages of a Three-Tier Reporting Architecture
Before going any further, let me summarize the advantages of an architecturally improved reporting system.
1. After the report is submitted, the user is not tied to any window - even if the report has to run for hours, but is free to do other tasks.
2. The user can view all submitted reports, even the old ones printed last month.
3. Completed reports can be viewed/printed as many times as the user wants and it doesn't impact the database server because retrieval from the database happens only once.
4. Reports may be scheduled to run at any future date and time. (Requires addition of a "scheduled date and time" field on the report criteria screen.)
5. Multiple reports can be submitted one after the other without the user waiting for the completion of a previously submitted report.
6. Multiple reports can be viewed/printed one after the other or all at the same time. (Select multiple reports from the list and click the Print button. This will send all reports to the printer.)
The Three-Tier Reporting Architecture
These advantages are made possible by using a three-tier reporting architecture. The difference between a two-tier and a three-tier architecture is the middle tier, which in this case consists of an application I call ReportServer. The ReportServer acts as a mediator between the client and the database server. Figure 3 illustrates how this reporting environment works.
Let me explain how it works.
1. The request for a report starts at the client. The user selects a report from a menu, fills up a form to enter the report parameters, then clicks the OK button.
2. When the user clicks OK, the details about the report are sent to a queue. The queue can simply be a table in a database. A request for a report is transformed into a single row in the table. See Table 4 for the row's columns.
After the user clicks the OK button and a record is created in the queue, the client can momentarily forget about the report. The user can do other things.
3. In the meantime, the ReportServer, which can be another PB application, checks the queue for any record whose scheduled_datetime is greater than or equal to the current date time. When it finds one, the ReportServer gets all the details required to submit the request to the database server from the record in the queue. It knows the SQL command or stored procedure needed by the report, the parameters of the report, the database_name, and server_name. Therefore, it should be able to send the SQL and the parameters to the production database server.
4. The SQL command or stored procedure runs on the production database.
5. The result set is returned back to the ReportServer.
6. The ReportServer will process the result set, which is just a text file. It parses this text file by removing the column headings and other SQL messages and transforms it to a tab-delimited text file suitable for loading onto a DataWindow. The processed text file or files (in case there is more than one result set) are then sent to a share folder in a network fileserver identified by the value of the output_location column in the queue table from step 2.
7. Finally, when the client goes to the "Manage Reports" window and sees that the report's status is "Completed" and then clicks on the View button, a preview window in the client application opens. The preview window will have a DataWindow. Now instead of the normal dw_1.Retrieve() command to retrieve the report, dw_1.ImportFile() will be executed because all that's needed is to import the file(s) from the file server where the raw report files are stored.
A summary of the strategy for submitting and retrieving reports using this architecture is shown in Table 1.
Before I discuss the actual implementation, let me point out that the solution I've outlined here is not the only possible one, nor is it the best one. It is, however, a workable one. It's simple and does not require advanced PB programming skills.
To make the discussions simple and easy to understand, I've made the following assumptions/constraints.
Later on I'm sure you'll realize that the above assumptions don't have to be true to get something similar working. For example, the reporting architecture can also work regardless of the client application. The client application can be written in Java, VB, or a Web application; the datasource does not have to be a stored procedure; the result set from the stored procedure can have multiple result sets; and composite reports can be handled as well. However, for now let's go with these assumptions so as not to muddle our main goal. Let's make things as simple as possible. Once you've seen how it's done for a simple report, it's just a matter of extending some of the functionality to cater to more complex ones.
Furthermore, I'll avoid using too much PowerScript code or using the standard PB naming convention of using prefixes in front of function names, just to make things a little more readable, especially for those not familiar with PowerBuilder standards.
Here is an outline of what we will be building:
1. Create a REPORT_DEFINITION table to store the properties of any given report.
2. Create a REPORT_QUEUE table to store details of reports that have been requested by users.
3. Create a report service user object class to contain any function connected with the reports.
4. Create a criteria window for our sample report, the sales report.
5. Create the Manage Reports window.
6. Create the Print Preview window.
7. Create the ReportServer application.
This first step in creating a REPORT_DEFINITION table is not really mandatory. However, I found it quite convenient to have a place in the database where a report's attributes can be easily maintained and retrieved.
Table 2 shows the structure of our table.
For the sample Sales Report, let's assume the record provided in Table 3.
Each request for a report will be saved in this REPORT_QUEUE table. It will contain everything that needs to be known about the report request.
Table 4 shows the structure of the REPORT_QUEUE table
Report Service Class Object
Next we create a user object class for our reporting service. This object will become part of the existing client application. In this object we create functions that will interface with REPORT_DEFINITION and REPORT_QUEUE tables. A few of the functions are:
There are more functions, but we can leave them out of the discussions.
I'm not going into the actual PowerScript codes for each of the above functions, as they should be quite straightforward. Think of this service class object as a black box. If we want something from it, we simply supply the necessary parameters and it spits out the item we requested or performs the desired action(s).
Assume for the moment that you have a menu where you can select the report that you want to generate. From the menu you picked "Sales Report." The clicked event of the menu option can have code that calls the GetCriteriaWindow function to find out the name of the criteria window. Once we know the name, it's a simple call of the Open or OpenSheet commands to open the criteria window. Figure 4 shows what the criteria window for the Sales Report might look like.
The Sales Report accepts a date range. In the figure, the Sales Report is for the period ....01 January 2002 ... 31 January 2002. The third field on the criteria window is a datetime field to tell the report server the date and time the report is due to be run. By the way, this third field is optional. For existing applications where you already have criteria windows built and don't want to change any of them, you can simply assume the scheduled rundate to be the current date time. In other words, you always want the report to run right away. Without the above third field, however, you would have to find other ways of scheduling reports to run at a future date and time.
Now that we've gathered the parameters for the report and know the date and time we want it to run, we just need to store all these in the REPORT_QUEUE table. Therefore, after we click the OK button and perform validations, we can pass all necessary details of the report to the SubmitReport function of the Report service class.
"EXEC sp_rep_sales_report", "1 Jan02, 31 Jan 02",
"05 Feb 02 10:00 pm", "\\salesdept\salesreports\")
In the SubmitReport function in our Report class object, we would have code to generate the next queue_id and then insert a new row into REPORT_QUEUE (see Table 5).
Manage Reports Window
The Manage Reports window allows the user to see a list of all reports that he or she has submitted. The list of reports will have a status column, plus other information (see Figure 2).
The list of submitted reports is retrieved from REPORT_QUEUE. Because REPORT_QUEUE has a user_id column, it should be easy to show only the reports for the current user. One of the reports on the manage reports window will be the "Sales Report" (ID 12345). Initially after submission, the status of the report will be "Submitted," then it will go to "Running." Eventually, assuming everything goes well, the status should change to "Completed." A Refresh button can be used to refresh the DataWindow manually.
Print Preview Window: Viewing the Report
Once the report's status is "Completed," the user can select the report and click the View button. This will open a Print Preview window.
In the center of the window is a DataWindow. The usage of the buttons below the DataWindow is self-explanatory.
The queue_id is passed from the Manage Reports window to the Print Preview window. Code in the open event of the Print Preview window will get report_id from the REPORT_QUEUE table and, with the report_id, it should be easy to get the name of the DataWindow object for the report from the REPORT_DEFINITION table. The data object for dw_1 can then be set. From the REPORT_QUEUE table we can also derive the value of output_location to point us to the location where we can get the text file to load onto dw_1. Finally, we import the text file onto dw_1 using the ImportFile function to display the report on the screen.
What we've built so far are objects on the client application side, the first tier, and the data tier. How the report's status is transformed from "Submitted" to "Running" and finally to "Completed" takes place in another tier - the middle tier. The ReportServer is our middle tier.
The ReportServer will be another application independent of the client application, so we'll create an interface for it.
The interface will have a DataWindow object whose data will come from the REPORT_QUEUE table. A Refresh button is available to manually refresh the DataWindow, a Delete button to manually delete any row in the queue, and a Close button to close the window.
The above DataWindow can have all the columns available on the REPORT_QUEUE table. The ReportServer interface is meant only for systems administrators to help them monitor the reports that are in the queue.
The most important parts of the ReportServer application don't actually need an interface. In fact, you can create the ReportServer application without a visual interface; you might even want to create it as a service.
In the window object (or application object or as global functions or perhaps a user object, I really don't care) create the following functions:
Running a Report Every N Seconds
We need the ReportServer to interrogate the REPORT_QUEUE table at regular intervals to check if there are any reports that are due to be run. We can do this by creating a Timing object, which we'll call n_MyTiming.
In the open event of the ReportServer window, create an instance of n_MyTiming and then activate it as follows:
This command will cause the timer event on the timing object to fire every five seconds. It's up to you to set the actual time interval from five to 10 seconds. Five seconds should be okay. Any longer than 10 seconds may cause users to complain about why reports that took only one second before now take at least 10 times longer.
On the Timer event of the timing object we add code to do the following:
The RunReport function will perform the following actions:
1. Call function SetStatus(12345,"Running") in order to update the row with queue_id 12345. This will change the status of the row from "Submitted" to "Running" and should also update the actual_runstart column with the current datetime.
2. Get the following from REPORT_QUEUE:
Sql_command = "EXEC sp_rep_sales_report"
Report_parameters = "1 Jan02, 31 Jan 02"
No_of_parameters = 2
Database_name = "sales_db"
Server_name = "SYB_PROD"
3. Call a function to parse report_parameters and transform it to the desired number of parameters and format. What we want here is for the function to transform the report_parameters value, which is "1 Jan02, 31 Jan 02", into the following string "'1 Jan 02', '31 Jan 02'". I won't show how this is done. This should be an easy exercise for your string manipulation skills.
4. Construct the complete SQL string that will be sent to the database. In our example, the complete command string is as follows:
"EXEC sp_rep_sales_report '1 Jan02', '31 Jan 02'"
5. Create a SQL file (just an ordinary text file) that contains the SQL command string created in the previous step, plus a few other necessary SQL commands. It will be convenient to use the queue_id as part of the filename for this SQL file as you'll see later. So, in this case, the filename might be "REP12345.sql" with the following lines in it:
EXEC sp_rep_sales_report '1 Jan02', '31 Jan 02'
The first line, "USE sales_db", instructs the database server to use the sales_db database. This is necessary because different users and different reports may be requested on different databases.
6. Construct a command string that will run Interactive SQL (ISQL.EXE). ISQL.EXE is a program that could be run on a client PC to execute SQL.
Let's assume that user_id "reportserver" is a valid database user with the password "whattheheck". In Sybase Open Client, the command string will be as follows (for other DBMSs it may be slightly different):
Command_string = "isql.exe -Ureportserver -Pwhattheheck
-SSYB_PROD -iREP12345.sql -oREP12345.out"
In this command -U means the user is "reportserver", -P means the password is "whattheheck", -S means the server is "SYB_PROD", -i means the input file is "REP12345.sql", and -o means the output file will be "REP12345.out".
7. Finally execute the command string using the RUN command:
When the RUN command is executed, the application won't wait until it's completed because ISQL.EXE is run as another application. What this means is that ReportServer is free to run another report from the REPORT_QUEUE without waiting for the first one to finish. Since we've set the timer event to fire every five seconds, what will happen is that after five seconds the next report in the queue will be retrieved and the processes of running the report as described above starts again. It's therefore possible for one or more ISQL sessions to be open at the same time, each running a different report.
Processing the OUT File
Now we know how ReportServer submits reports to the database server every five seconds. This action will eventually generate OUT files in the local drive of the ReportServer. In the case of queue_id 12345, the resulting OUT file will be REP12345.OUT. What happens with the OUT file?
If you open the OUT file using Notepad, you'll find the raw data for the report. It could look similar to the following:
agent_id agent_name total_salesNotice the column headings and the dash lines below the column labels. The actual data is properly aligned but is delimited by spaces. The last line "(3 rows affected)" tells you the number of rows retrieved. Actual results may vary depending on what DBMS you're using. However, the main thing to point out is that the OUT file simply contains the output from the ISQL command and that it's in a format that's not suitable for loading onto a DataWindow object, hence the need to transform it.
----------- ------------- ---------------
AB001 Alvin Pat 12345.66
CD003 Joe Last 54888.20
AS004 Jerry Codin 2333.43
(3 rows affected)
Processing the OUT file means transforming the above text file so it won't contain any of the column labels, the dashes, or the last line that tells you the number of rows affected, and delimiting the data with tabs. In other words, transform the text file into a tab-delimited text file with no headers, thereby making it suitable for importing into a DataWindow object. The resulting tab-delimited text file will look as follows:
AB001 Alvin Pat 12345.66A tab character delimits each of the column items. Each line ends with a carriage return.
CD003 Joe Last 54888.20
AS004 Jerry Codin 2333.43
I won't go into the coding of how the OUT file is transformed. Again that's left as a programming exercise. Let's just assume that the function called ProcessOutFile will perform the work for us. This function accepts the queue_id of the report (in this case 12345), and then creates a tab-delimited file called REP12345.TXT. Furthermore, after this function creates the tab-delimited TXT file, the function deletes the OUT file.
How and when do we call the ProcessOutFile? We do it from the TimerEvent in the timing object n_MyTiming. In the TimerEvent we add code that will call the function GetNextOutFile, which is a function that will return the smallest queue_id from REPORT_QUEUE whose status is "Running," then checks if the OUT file for this report is free of any file locks. It's assumed that while the ISQL command is running and generating the OUT file, it has a lock on it and therefore is not accessible. This may not be true; therefore, you may have to use some other methods of finding out if the OUT file is finished. For simplicity, let's assume the GetNextOutFile always gives back either 0 if there are no OUT files ready or a valid queue_id of the report whose OUT file is ready to be processed.
The Timer event will now contain additional steps:
1. Call GetNextinQueue to find the queue_id of the report to be generated next.
2. If a queue_id is found, call RunReport, or else proceed to step 3.
3. Call GetNextOutFile to find the queue_id of the report whose OUT file is ready for processing.
4. If no valid queue_id is found, exit from the event. Otherwise, stop the timer by executing n_MyTiming.Stop(). This gives the application time to finish the next steps by not causing subsequent firing of the Timer Event.
5. Call ProcessOutFile, passing to it the queue_id you got from step 3
6. After the OUT file is transformed to a TXT file as described, move the file to the share folder in the network fileserver as specified in REPORT_QUEUE.output_location. In our example, we want REP12345.TXT to go to \\SalesDept\salesreports\.
7. Next, do a SetStatus(12345, "Completed") to update the status of the relevant row to "Completed" in REPORT_QUEUE and do some cleanup work by deleting the temporary work file "REP12345.SQL".
8. Last, restart the timing object by reexecuting n_myTiming.Start(5) and exit from this event.
Figure 5 provides the flowchart of what the ReportServer application will be doing through the timing object.
After the tab-delimited file for a particular report is moved to a shared folder in the network, the job of the ReportServer ends for that particular report. It's now up to the client application to locate the tab-delimited file and import it for viewing in the print preview window.
If you look more closely at the solution I've outlined, you'll realize that nothing in it is rocket science. The components of the solution are very simple. Programming-wise it's junior-level stuff.
We created two standalone applications - the original client and the ReportServer. If you think about the client application and forget about the ReportServer application for a moment, what do you have? You have an application that simply creates a new row in a table (REPORT_QUEUE). Inserting rows in a table is a normal activity. Nothing extraordinary about this. The other thing the client application now does is access a text file from a network resource and load it onto a DataWindow. Again, nothing fantastic about it.
Now think about the ReportServer application and forget about the client application. What do you have? You have an application that gets a row from a table (REPORT_QUEUE). A column in the table has a SQL command in it. The ReportServer sends the SQL command to the database server, which generates an output file. The application performs some string and file manipulation to create another text file and moves the file onto a network fileserver. Again basic, ordinary stuff.
Two applications, when combined, form a highly useable reporting environment. And yet this is just the beginning. There are enhancements that can easily be added onto the basic solution I've outlined. A few ideas are:
1. Enhancements to handle composite/nested reports.
2. Enhancements to handle stored procedure/SQL commands that generate multiple result sets.
3. An add-on component to transform the SQL output to formats other than ASCII text, such as XML, HTML, and RTF.
4. Viewing error reports - if the report did not generate successfully but instead generated errors, you may want a new status of "Error." When the user tries to view a report with status "Error," the user sees the error messages.
5. Automatic notification of error events - when an error occurs while trying to generate the report, the ReportServer automatically sends an e-mail to TXT message to a system administrator or a help desk.
6. Automatic expiration of reports - set reports to expire after a specified time. Once expired, let ReportServer delete the report and its associated text files.
7. Convert the ReportServer into a service.
8. For purposes of redundancy and scalability, you may want to run multiple ReportServers at the same time. This ensures that if one is down or busy, another ReportServer can immediately take the load.
9. Enhancements to handle recurring reports, which are reports you want to run at regular time intervals, e.g., daily, weekly, or monthly reports.
Now you know the basic architecture, it's up to you to enhance it.
- Where Are RIA Technologies Headed in 2008?
- PowerBuilder History - How Did It Evolve?
- Creation and Consumption of Web Services with PowerBuilder
- Cloud People: A Who's Who of Cloud Computing
- DDDW Tips and Tricks
- Cloud Expo 2011 East To Attract 10,000 Delegates and 200 Exhibitors
- Working with SOA & Web Services in PowerBuilder
- Dynamically Creating DataWindow Objects
- Cloud Expo, Inc. Announces Cloud Expo 2011 New York Venue
- OLE - Extending the Capabilities of PowerBuilder