Real Programmers Use BLOBs

They are as basic as it gets

They are as basic as it gets. They are the one way we can look at data without restriction, using our own rules to define their existence, their transformation, and their value; they are the molding clay of data. We call them BLOBs. Once we know the structure of a file or any data, we can use a BLOB to mimic or manipulate it and, in some cases, create new files from scratch. What are BLOBs? Why should you care? And what can you use them for? Quite simply, BLOBs are how we handle data in a raw form, sometimes without any in-depth knowledge of their structure, just the knowledge that they exist.

BLOBs As a Window into Alternative Types of Data
It sounds a little like an episode of the "Twilight Zone." I say that BLOBs can be used to examine alternative types of data because BLOB manipulation doesn't limit us to common data such as bitmaps or sounds. We can use the inherent raw-data manipulation characteristics of a BLOB to get direct access to data without resorting to external components such as OCXs and DLLs. It's possible to build libraries of user objects whose sole purpose is to extract and manipulate data from formats not natively supported by PowerBuilder. In creating these libraries, we gain insights into the structure of these "data sources" as well as eliminate our reliance on third-party objects to access them.

When investigating how data is stored in a file, you can check a myriad of sources. Many companies share information about their file formats, and you can search the Web or a company's site directly to find the file-formatting information. The BLOB data type is the bridge between reading a file in as raw data and translating it into usable data. If you do a little digging, you can usually come up with a pretty reasonable definition for a file structure you're interested in. One such example is the interrogation and exploitation of data from a Palm OS-based PDA. With a little digging on the Palm Web site, I discovered documentation that showed the structure of the Palm database files in great detail. Using this information, I constructed methods that allowed me to extract Palm data to a DataWindow and back again.

Identify the Type of File You Wish to Interrogate
This one speaks for itself: creating a data interrogator with BLOBs is much easier if you have a known quantity to deal with. For instance, in the Palm example, I created a database with identifiable values such as obscure column names and data for each. This way, when I'm interrogating the data byte by byte, the values stand out and give me more clues as to the true structure of the file. Even though you might have a published map of the file structure, you can never rely 100% on the documents; sometimes the documentation is a little behind the released product. I know you always keep your systems documentation current with all last-minute data changes, right?

Find a Way to Logically Split the Data
Take some time to develop a strategy for breaking the data down into logical parts. Again referencing the Palm example, break the data into two logical chunks to start: the database header and the database records. From there, take the simplest component, in this case the header data, and split it off of the BLOB using the BlobMid function, as in:

l_blob_header = BlobMid(tot_b, 1, 79)

This copies the first 79 bytes, the header space of a PDAToolBox database file, into the l_blob_header variable. I usually like to see the data, so I create an external DataWindow to hold the byte number, the byte, and the character representation of the byte to get a better understanding of how the data is stored (see Figure 1).
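A minimal sketch of that inspection loop, assuming a hypothetical external DataWindow control dw_bytes with columns byte_no, byte_value, and char_rep (these names are mine, not taken from the Power PDA code), might look like this:

// Walk the header blob and load one row per byte into the external DataWindow
Long ll_pos, ll_row
Blob lb_byte

For ll_pos = 1 To Len(l_blob_header)
	lb_byte = BlobMid(l_blob_header, ll_pos, 1)
	ll_row = dw_bytes.InsertRow(0)
	dw_bytes.SetItem(ll_row, 'byte_no', ll_pos)
	dw_bytes.SetItem(ll_row, 'byte_value', Asc(String(lb_byte)))   // numeric value of the byte
	dw_bytes.SetItem(ll_row, 'char_rep', String(lb_byte))          // printable representation
Next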

Notice how a quick conversion of the data makes it readable. This is an advisable process when you're writing code to separate a BLOB into a more usable form. It provides an opportunity to see if there are any undocumented features or markers stored within the file that you may have to deal with. Also note that when dealing with a BLOB directly, you may encounter a number of unprintable characters, as well as known items (like numbers) stored in a form that is not readily identifiable. In the current example of dealing with data from a Palm device, keep in mind that numeric data is stored as "Big Endian" (most significant byte first); this is different from how the PC handles data, which is "Little Endian" (least significant byte first). No, I didn't make this up; it comes from Gulliver's Travels and the dispute over which end of a soft-boiled egg to eat from (the big end or the little end). There have been many debates over the merits of each method, but for our purposes it's more important to realize that data is not always stored the way we expect.
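As a hedged illustration of what that means in practice, here is one way a two-byte big-endian value could be turned into a PowerScript long; the starting offset ll_pos is a placeholder for wherever the number actually sits in the BLOB:

// Combine two bytes, most significant first, into a single long value.
// (Asc of a one-character string yields the byte value; a null byte would need extra handling.)
Long ll_value
ll_value = Asc(String(BlobMid(tot_b, ll_pos, 1))) * 256 + &
           Asc(String(BlobMid(tot_b, ll_pos + 1, 1)))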

Splitting the data is done with a series of BlobMid calls that allow us to surgically extract portions of the BLOB into manageable pieces of information without changing its structure. Next, for this example, we can concentrate on the record structure of a Palm PDAToolBox database, or PDB. For more information on this process, along with a detailed example of transferring BLOB data between a Palm and a DataWindow, download Power PDA (PB Link) from my Web site, www.planetdx.com; it's free and contains a user object that breaks down the Palm .PDB file section by section into two DataWindows: one representing the header information and the other representing the record-by-record data from the Palm.

Time to Pay the Piper
BLOB manipulation, while impressive at a basic level, comes at a price. For example, try reading a BLOB from one end to the other, one byte at a time, using the BlobMid function. Let's take the following code snippet, which comes almost directly out of the PowerBuilder Help file (the FileRead PowerScript function) with just a couple of modifications:

// <Perform FILEREAD here - see the FileRead example in the PowerBuilder Help>
Long ll_byte
Blob tot_b, blob_byte

For ll_byte = 1 To Len(tot_b)
	blob_byte = BlobMid(tot_b, ll_byte, 1)   // extract one byte at a time
Next

We are reading in a file named blob1.bmp (but you could use any file that's not currently in use) that comes in at an unimpressive 1.1MB, give or take (see Figure 2). It reads into the local BLOB variable in less than one second. On my P4 2.4GHz, performing the BlobMid to read each byte individually takes a whopping two-plus hours, at about 141.5 bytes/second! Note: The graph on the right in Figure 2 shows bytes/second processed over time.

I added a checkbox to suspend the progress report to the screen to lessen the impact of the GUI on the process. Your test times may differ based on processor loading and removal of the progress display, but you get the idea: BLOB manipulation in PowerBuilder is not fast. Therein lies our problem. Although the BLOB has many uses, processing the BLOB takes up enormous amounts of CPU time. But slow or not, BLOB manipulation gives you a way of interrogating data like no other method. We can control the amount of CPU time by carefully examining the data structures we are attempting to extract and using that to our advantage. Many times we won't need to process the BLOB byte by byte; we can extract chunks of data in logical units (records and headers) to speed processing. Also keep in mind that if you know a certain chunk of data contains, for example, a string, your BlobMid can take a larger portion of the BLOB at a time, and your overall BLOB processing speed will increase.
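For example, if the file map tells you that a 32-byte name field starts at a known offset, a sketch of chunked extraction (the field length and the offset variable are invented for illustration) needs only a single call rather than 32 of them:

// Pull the whole 32-byte string field in one BlobMid call instead of a byte-by-byte loop
String ls_record_name
ls_record_name = Trim(String(BlobMid(tot_b, ll_record_offset, 32)))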

BLOBs as a DataWindow Repository
BLOBs not only hold raw data that we might wish to dissect byte by byte, but can also hold large, variable amounts of character-based data that exceed the boundaries of some databases' varchar storage. Consider the BLOB as a repository for data that can be rendered at a local workstation, such as DataWindow objects. While you wouldn't put all your DataWindows in a BLOB table, you might consider using a BLOB table for storing temporary or volatile reports within your system. A good strategy for leveraging BLOBs is to build a generic reporting system that uses DataWindow syntax retrieved from BLOBs to create a report on the local user's PC. The advantage of this method is an increase in overall productivity through a faster turnaround of report-based information to your end users.

Many of us have had last-minute requests from users to provide data in a usable form, and coaching a user through creating the report in a reporting tool (e.g., InfoMaker or another third-party reporting tool) often takes longer than just creating a DataWindow yourself. Also take into account that the report may need to be run by more than just the person you are talking to. How will you distribute the report? Does one report justify a new software release?

Using BLOBs, you can make the report available to all (or some) users without the expense of redistributing the core application. Consider the following table's CREATE statement:

CREATE TABLE "dba"."reports" (
	"report_id"   bigint      NOT NULL DEFAULT autoincrement,
	"report_name" char(50)    DEFAULT NULL,
	"report_blob" long binary DEFAULT NULL,
	PRIMARY KEY ("report_id"));

We are creating a table with three columns: a unique report ID, a column for the report name, and a BLOB (long binary) column for storing the DataWindow. We then create a simple interface for uploading the reports to the database.

This is a relatively simple DataWindow to BLOB interface (see Figure 3). We start by selecting a .PBL that contains the DataWindow(s) we wish to extract and store as BLOBs. From this .PBL, we extract a list of DataWindows using the Library Directory command:

....
// docname holds the path of the selected .PBL
ls_reports = LibraryDirectory(docname, DirDataWindow!)
dw_datawindows.ImportString(ls_reports)
....

This extracts the DataWindow information from the .PBL and imports the list into the external DataWindow for further processing. In the double-click event of the first DataWindow, we generate both the DataWindow syntax (into a multiline edit) and a visual preview of the DataWindow using:

....
// Export the selected DataWindow's syntax, then build a preview from it
mle_syntax.text = LibraryExport(st_pbl.text, &
	this.GetItemString(row, 'name'), ExportDataWindow!)
dw_visual.Create(mle_syntax.text, error_create)
....

Furthermore, we use this same window to save the text from the multiline edit to the database as a BLOB using:

...
// Convert the DataWindow syntax to a blob and store it in the reports table
lb_report = Blob(mle_syntax.text)

UPDATEBLOB reports SET "report_blob" = :lb_report
	WHERE "report_id" = :ll_report_id
	USING SQLCA;
COMMIT;
...
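One detail worth noting: UPDATEBLOB modifies an existing row, so ll_report_id must already point at a report row. A hedged sketch of that preliminary step, assuming the SQL Anywhere autoincrement column defined in the CREATE statement above, might be:

String ls_report_name
Long ll_report_id

// Insert the report row first; the UPDATEBLOB above then attaches the syntax to it
INSERT INTO "reports" ("report_name") VALUES (:ls_report_name) USING SQLCA;

// Fetch the autoincrement key just generated (SQL Anywhere syntax)
SELECT @@identity INTO :ll_report_id FROM dummy USING SQLCA;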

Now you have your DataWindow syntax saved to the database intact as a BLOB. The next step is rendering the DataWindow from the database BLOB. Again, no great feat of magic here; you're just leveraging the resources at your disposal to give the users what they want: a completed report without making them wait for a software release. The following window consists of two DataWindows; the first lists the report IDs and names from the reports table (see Figure 4).

Upon double-clicking the report name:

ll_report_id = This.GetItemNumber(row, 'report_id')

SELECTBLOB "report_blob"
	INTO :lb_report
	FROM "reports"
	WHERE "report_id" = :ll_report_id
	USING SQLCA;

li_ret = dw_output.Create(String(lb_report), ls_error)
dw_output.SetTransObject(SQLCA)
dw_output.Retrieve()

the code retrieves the DataWindow syntax that was saved as a BLOB, and then uses the DataWindow Create function to render it. In the last step, the DataWindow is retrieved and displayed to the user.

If you had a window similar to this in your application, you could provide enhanced reports with shorter turnaround times, and even refresh the list of available reports without the user needing to log out of and back in to the application. Also, considering the power of the DataWindow, you could perform more complex data gathering by using a stored procedure as the data source, and it's possible to store a DataWindow with linked retrieval arguments in your database (a short sketch of that follows). The key to productivity is the ability to store your DataWindow code as a BLOB and regenerate it on the fly. This method is similar to techniques employed with EAServer, more specifically the GetFullState function; however, we are not transmitting data, only the DataWindow that will be performing the retrieve.
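A minimal sketch of that retrieval-argument case, with the argument names and values invented for illustration:

// Render the stored DataWindow, then pass whatever arguments its syntax declares
Date ld_start_date, ld_end_date
ld_start_date = Date('2004-01-01')   // illustrative values
ld_end_date = Today()

li_ret = dw_output.Create(String(lb_report), ls_error)
dw_output.SetTransObject(SQLCA)
dw_output.Retrieve(ld_start_date, ld_end_date)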

GetFullState and SetFullState are BLOB manipulation from a different perspective. They are used to create and decode BLOBs that contain not only the structure of the DataWindow, but the data as well. Primarily used in distributed applications, these functions provide yet another useful BLOB manipulation method for solving real-world problems.

Normally, you would find these functions in an EAServer type of environment, where data retrieval is performed on EAServer (presumably closer to the database server than the end user is), packaged into a BLOB by GetFullState, and transmitted back to the client workstation, where SetFullState rebuilds it intact. This BLOB contains the DataWindow format, the data, and the update statuses. This can be especially useful when creating a distributed application, where the user may only have access to data via e-mail or some other disjointed method.

Picture sending a user a list of customers to call and update. The user can have a BLOB created via GetFullState e-mailed to them, and open it in their local application using SetFullState. From this point, they can perform limited editing against the DataWindow and e-mail the corrected DataWindow back to the sender for processing. The sender will then open a DataWindow that shows the remote user's changes, along with the row statuses associated with the data, and PowerBuilder can update this DataWindow against the database. This is a limited-use scenario, because many variables come into play when dealing with a "disconnected" user community; however, BLOBs once again provide an effective alternative in your toolbox of techniques.
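A minimal sketch of both ends of that exchange, with the control names dw_customers and dw_local invented for illustration:

// Sender: package structure, data, and row-status flags into a blob
Blob lb_dw_state
Long ll_size
ll_size = dw_customers.GetFullState(lb_dw_state)
// ... save lb_dw_state to a file and e-mail it ...

// Receiver: rebuild the DataWindow, pending statuses intact
Long ll_rc
ll_rc = dw_local.SetFullState(lb_dw_state)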

Conclusion
The BLOB is an often underutilized but effective tool for both data manipulation and data representation. Its unique attribute of not being bound by conventional structures gives developers the freedom to use it either as a container for large amounts of data or as a petri dish within which low-level data manipulation can occur.
