|By Chris Pollach||
|August 10, 2009 02:45 PM EDT||
Yes size really does matter when you are talking about a client/server application, web server, or application server communicating with a database management system. Size is often overlooked when designing very large application systems that can potentially generate an excessive amount of concurrent users, heavy database transaction load, complex queries, large data stream content or some type of mixture of the above.
Recently, I had the opportunity to put this aspect under the magnifying glass when tuning a client's ASE (Application Server Enterprise) database that was operating at over 1 million input / output requests per hour. The ASE server was receiving requests from PowerBuilder native windows applications, JSP and ASP.NET web pages, and EAServer (Sybase's application server) clusters. The transaction mixture was diverse, from small single row queries to complex medium result set queries and large Binary (BLOB) requests. These query transactions often progressed into an impressive amount of small to medium updates as well back to the DBMS. Add to this web activity of over 10,000 users or 500,000 sessions per hour, EAServer logging messages into its Message Queue table in ASE, and PDF files being saved and retrieved into Blob columns and you can imagine the workload on the DBMS.
In this article I will outline the approach I took, the test findings and the final results of the modifications to the various application designs and the performance model changes that were observed. I would also like to mention that this aspect of performance tuning is not restricted to Sybase's ASE DBMS. The various approaches, anticipated and finally observed changes to the performance aspects can be achieved with any application written in other languages like Java, C#, VB, etc., and DBMSs from vendors like Oracle, IBM, Informix, and Microsoft. The key to this article then is to alert the developer, network specialist or DBA to the possibility that this singular factor can have a dramatic impact on what the application user sees in the overall application response times.
The basic approach I decided upon was to isolate the three common DBMS connectivity's areas for the average database server. By this I mean your standard native MS-Windows client/sever application, then a web server, and finally an application server. These three realms encompass and source the average mixture of database requests from either Win32 code; active web languages like JSP, PHP, or ASP; and application components that today also encompass "web service" objects that typically implement encapsulated business functionality.
In the generic n-tier system diagram shown in Figure 1, the "destination" machine would represent the DBMS, Host3 could represent an Application or Web Server, Host 1 and 2 could be a firewall (black box and white box), and the Source machine could be an Internet- or intranet-based application. In all of the cases, whether the application is talking directly to the DBMS server or through a Web service or application business component - typically a DBMS transaction is spawned from one of these environments and the network layer is traversed. The network activity is in the form of an information package or packet that holds either an inbound request or an outbound answer. Packets are governed by the Network Operating System (NOS) and must follow the configuration restrictions imposed on them as they travel through the various topographies. Like a chain, the packets response time is only as good as its weakest link in the network path to and from the server. So before you get too focused on a possible specific packet performance bottle neck I would suggest that you converse with your Network Administrator. Ask them about the lay-of-the-land so to speak on the general topography before you get too submersed in a specific area - you might find that an old router or slow bridge could be the root of your problems. Thus, a simple network hardware change could alleviate your performance bottleneck without any application or server changes.
If on the other hand, the network topography is sound, then it may be plausible that the network packet interactions may be a factor in your performance woes. To aid in tuning this area, it is important to remember that the NOS will send another packet for every request or result packet issued. It does this like an email request where the sender can request a receipt confirmation by the receiver. However, the NOS does not make this confirmation an optional item; it makes it mandatory. In Ethernet topologies, it is also possible that the packet does not even make it to the receiver as a "collision" is encountered. The packet may then have to be retransmitted because of the collision or time-out situation as the original packet may never have arrived.
Based on the above scenarios, the frequency of packet time-outs and the response time for the acknowledgment packets is something that you and your network administrator need to analyze by looking at the big picture of what might be happening to your DBMS response times. Time-out situations should be addressed first as to why the NOS is spending al lot of time in this area instead of delivering the original packet effectively in the first transmission go-round.
Packet Size Logistics
It is very important to understand the usage of the packets between the client and the DBMS. The effective use of this communication vehicle is paramount to reducing the overhead of the packetization feature within the Network Operating System. For the ASE DBMS engine - like its competitors - the packet size can be controlled from three areas: application (client), network, and DBMS server. Setting the packet size when you establish a connection to the DBMS server plays out in the efficiency that will exist for the life of that connection. That is to say, selecting a very large packet can efficiently move large blocks of data. However, the acknowledgment packet that is issued for every data packet will be virtually empty, resulting in wasted time and network overhead to route the empty packet acknowledgments back to the sender. The NOS has to transmit the entire packet no matter how full it has been loaded, and this is a key consideration to remember.
In the packet diagram in Figure 2, we can see that in a modern packet structure there is an imposed overhead of a key routing information known as the Packet Header, the data being transmitted, and wasted space that is padded out to properly form the overall packet size. What we as a developer, DBA, or networking person can do is to try and balance the data usage so that we minimize wasted space and excessive acknowledgment packets. The above packet scenario also points out that shipping varying amounts of data may invoke steady two-packet transmissions at a 512 byte size but average out to one packet using a 1,204 byte format. Increasing the packet to 2K or higher can lead to an extra packet of basically wasted space being transmitted. Thus the ideal situation is to know your average data lengths and then set the packet size to optimize this average - especially for 80% of the common SQL transactions that your applications utilize in their daily tasks.
For the best performance, choose a server packet size that works efficiently with the underlying packet size on your network. The goals are:
- Reducing the number of server reads and writes to the network
- Reducing unused space in network packets (increasing network throughput)
For example, if your network packet size carries 1500 bytes of data, setting Adaptive Server's packet size to 1024 (512*2) will probably achieve better performance than setting it to 1536 (512*3).
Below is a response from one of the Sybase Engineers, Jim Egan, about changing the packet size in ASE:
"I started using Sybase SQL Server (back when it was actually called that) a long time ago. I'm a big fan of adjusting the packet size. The packet size you end up with should be balanced by taking into account the way the application uses the database. Applications that deal with the database by sending one or just a few rows at a time tend to do better with smaller packet sizes. PB applications often pull back many rows for each query so a larger packet size will often result in better performance. There is a point where going to a larger packet size results in decreased performance. Keep in mind that that actual packet size on the wire cannot exceed a certain amount. I don't remember the exact amount but it is less than 2K. So using a 2K ASE packet size may result in the creation of two packets on the network, not just one. Using a larger packet size within ASE will reduce the CPU that ASE needs simply because it is accumulating data into a larger buffer and flushing out fewer of those to the network. (That is a bit of a simplification.)
As I recall the packet size setting must be adjusted on both the server and the client. A client can have a larger packet size setting than the client but not the other way around. Additional memory is necessary on the server when using larger packet sizes. For busy networks, a larger packet size often works better. The more packets you have to send the more chance you have of collisions on the network."
The 50,000 Foot View
To get a high-level view of what your DBMS and network load may be costing your applications currently from the packet interaction point-of-view, I would suggest that you look at tools like Quest Software's "Spotlight for ASE". I have used this particular tool on various DBMS systems (they support all the major vendors) and it can pinpoint an area for further analysis faster than a speeding bullet - to borrow an old cliché. I should also mention that there are similar tools from other software monitoring companies as well that can address these aspects. I chose the Spotlight tool in this article as I already have it installed and use it extensively with ASE.
Spotlight is basically a dashboard application that connects directly to your DBMS, operating system, etc., using the appropriate connectivity. In the case of ASE, this would be the MDA tables where the performance information is captured. This then allows Spotlight to introspect the target environment and report its findings using a "dashboard" interface. The dashboard graphically maps o t the architecture and then animates and colorizes these areas and interactions between them for you to see visually.
In the Spotlight interface in Figure 3 I have outlined in red squares the two key pieces of information that I am interested in for possible packet interaction problems. These are the Problem Users (connections) and the amount of packet activity that the server is having. The problem programs are already at a "concern" level - so Spotlight has colorized the entry in yellow. When the packets received or transmitted section is having problems, the animated pipes will turn to yellow, then orange and then into red as problems develop in this area.
Once you notice packetization is an escalating problem area, click the mouse on the packet pipeline and Spotlight will display a pop-up information dialogue and then allow you to "drill down" into this area for specific details (see Figure 4). To do this, just click on the "show me" drilldown and Spotlight will then allow you to look at the packet area being monitored in much greater detail.
Once into the specific suspect area, you can select the problem user / connection, then highlight an application, select the Network tab page, and request the monitored information on packets. This then brings up a detail dialog outlining the current application's condition and below this the packet activity by sent and received granularity. This high-level drilldown feature can easily allow the organization to discern what applications might benefit from a detailed investigation and tuning exercise on packet usage.
This drilldown feature now gets you a view from the 5,000-foot level for each application that creates a lot of packet load on the DBMS server and thus through the network (see Figure 5). Of course, this quick 50,000-foot to 5,000-foot view for problem areas can also be extended to Input/Output, caching, data buffers, etc., parts of the DBMS operations that allow the operations people to quickly pinpoint an area to monitor further or may require a detailed performance review. Spotlight will even make suggestions as to what to change in the NOS or DBMS as far as configuration parameters are concerned to help expedite a fix for the observed problem area. That is a really nice feature that I have used to tune an ASE configuration many times.
For more information on the "Spotlight" product from Quest Software, please visit their website.
Ground Level - The Application
The various applications that could benefit from packet-size optimization should now be on the radar if you have used tools like Quest Software's Spotlight to hone-in on the ones that incur the most interaction with the DBMS. I always use the "law of diminishing returns" to seek out the top 10 to 15 percent of the systems components that might benefit from a new approach, modification, redesign, etc. This is to ensure that you get the most "bang for your buck" as the old cliché says to get the best return on your modification efforts.
For PowerBuilder-based application systems, the ability to drill down to see exactly what is happening from the Packet to data buffer transfers is relatively easy to do using three built-in features: The SQL Trace feature, the Profiling Tool, and the DataWindow's SQL related events. These features enable the PowerBuilder developer to easily get "under the covers" to access more detailed information on what is happening between the application, its hand shake, and the actual DB driver (client) interface. Each of these investigation features will give you some more information on what might be happening, such as where excessive IO or CPU is being consumed that might be packet related when performing intense DML requests.
As a former DBA, I also found that the SQL "projection" construction technique used could either hamper or aide in the total amount of CPU consumed in the packaging and un-packaging of the data to and from the network packet layer. For example, let's look at a fictitious table called TABLE-A that has 10 columns - I will call them COL-1 through COL-10. They were declared by the DBA in the DDL in that specific order. Most DBMS systems will then physically store the data in the same order as the column definitions in the schema. Now at execution time a DML projection is encountered that states: Select COL-10, COL-1, COL-9, COL-5 .... From TABLE-A. In this case the data is stored in COL-1 through COL-10 physical sequence, so the data has to be moved from the DBMS work buffer to the Packet buffer as one move statement for each column. This overhead may also be incurred at the client side as the data is extracted from the network Packet to the local client's data buffer.
In the above SQL example, let's see what would happen if the application developer or DBA create an SQL projection statement that follows the order of the column names as declared in the Schema. So the projection statement would then become: Select COL-1, COL-2, COL-3, COL-4 .... From TABLE-A. This format then allows the DBMS server to move all the data from the work buffer to the packet buffer using one move statement. Actually in the generated pseudo code, this becomes an MVC (WorkArea, BufferArea, TotalLength) command - or basically, from this address to that address move all the data in one consecutive block. This significantly reduces the CPU time to package and un-package the data. This benefit is also witnessed on the client side as the local "C" code in the client can also optimize data movement in this way as well.
One great feature of PowerBuilder that has been available for quite some time is the ability to ask the hand-shake database driver to log the SQL activity that occurs between the business application and the DBMS client. This is often done via middleware connectivity layers like ODBC - but this often misses the full impact of what the application sees. In recent releases, Sybase has even enhanced this feature to add better descriptions and timings in the logging of this information - so now in PB 11.0 or 11.5 (the only supported versions of PB currently) - the trace log is very comprehensive.
To activate the SQL Trace, all the application developer needs to do is assign the keyword "TRACE " to the current value in the SQLCA.DBMS property of the Transaction Object within your application (see Figure 6). For example, if is currently assigned the value SQLCA.DBMS = "ASE" - then changing this to SQLCA.DBMS = "TRACE " + SQLCA.DBMS will set the property to "TRACE ASE". When the DB hand-shake driver in PB encounters this setting in the CONNECT SQL statement, it will start the SQL Trace file. You may also start an SQL Trace by adding a parameter to the DBPARM field of the SQLCA structure. The format of the command is "PBTRACE=1" and will have the same effect as using the TRACE keyword. (You have to use the TRACE keyword even when you use the DBPARM value. The DBPARM value just allows you to control when tracing is done.) Either technique will activate the internal trace mechanism invoking the PBTRAnnn.DLL to assist you in this endeavor. Note that in production, you need to deploy this extra tracing DLL for this feature to work.
Once the application has executed and closed, the SQL Trace file will have a detailed account of the DML statements that were used and more importantly data lengths and timings that handled each row as the DB drivers packaged or unpackaged information to and from the network packets plus the overhead of round-trip requests to the DBMS for data services (see Figure 7). The important information from my perspective is to get the average data length and pinpoint excessive time against problematic DML statements. These measurements can indicate excessive data movement or long request times for specific result sets. The TRACE.log file is a small ASCII file - so it is also easy to write a PowerBuilder application that can parse the trace information and summarize the transactional information for you.
An application performance profile can be done by using the Profiler feature of the PowerBuilder development environment. You may also control this in a production application by issuing specific tracing commands such as TraceOpen, TraceBegin, and TraceActivity to capture a performance model of the running application. Since we are focusing on packetization and the data movement activity through a network mechanism, you probably want to restrict the performance tracing to SQL activity. This is easily done by specifying the "Embedded SQL" only in the PowerBuilder development environment System Options dialog for profiling (see Figure 8). You can also use a tracing command to set the TraceActivity focus to SQL as well. Either way, the runtime PB system will create an SQL-based "Profile Trace" file for you to analyze.
In the case of profiling from the PB development environment, once you have set the Profiling Options in the Systems Options dialogue - just run the application as you would normally to gather the performance intelligence information for the various SQL activities.
Once you close the application and return to the PowerBuilder development environment, you can launch the Profile Analysis tool that comes with PowerBuilder. This can be done by using the "File=>New" menu and selecting the "Tool" tab page in the resulting dialog. In that tab page should be options for launching three profile analysis tools called "Profiling Class View", "Profiling Routine View" and "Profiling Trace View". These utilities will also allow you to graph the results of the Profile Trace - in this article we are focused on the SQL activity of a given application. Running this profile analysis for the default packet size connection will give you a baseline measurement. Then, rerun the same test changing the packet size on the next and subsequent tests using the identical functionality. The comparison of these profiles should show you the IO and CPU performance improvement or degradation for each packet size compared to the base line.
In Figure 9 and Figure 10, we can see the graphed difference of a 1,024 byte packet size (Figure 9) versus a 2,048 byte packet (Figure 10). The dark blue line indicates the overhead of the DB driver, which is apparent in the 1,204 byte test but drops to zero (negligible) in the 2,048 byte test. The green line indicates the application's overhead to process the data and climbs in the larger packet size test but the involuntary wait time (depicted in red) is reduced. Interestingly enough, the CPU was higher on the client side to handle the larger packet size overhead but the overall SQL performance throughput dropped from an average of 979ms to 911ms - a gain of 7% better performance over all.
Examples of profile graphs of 1,024 vs 2,048 byte packet size tests
The DataWindow control is an excellent place to monitor response timings for any DML activity interaction with the DBMS server such as Stored Procedures or Stored Function calls (see Figure 11). This not only includes retrieval operations but also Update, or even Refreshing data activity. For Select requests we have the RetrieveStart, and RetrieveEnd events and for Insert, Update, and Delete SQL activity we have the UpdateStart and UpdateEnd events. The key to the start and end paired events is the ability to time the duration of each DBMS conversation. We can also add to this equation the SQLPreview event that allows the developer to differentiate the actual DML statement being processed. Now you can time the overhead of each specific statement down to the "where" and "ordered by" sub-clause variances.
Once we have the problem application set up to analyze a set of DML statements that represent the various normal SQL processings, we can run this controlled test multiple times. For each test iteration, we can alter the packet size and capture the performance of each of the DataWindow's events for each packet size. Comparing this against the base line we can then compute the various gains or losses.
To save yourself some time in enabling this performance gathering feature you can just utilize the (Software Tool & Die) STD Foundation Classes for PocketBuilder, InfoMaker, EAServer or PowerBuilder as the DataWindow event performance capture facility has already been implemented within the framework. You can download the appropriate product version located on the Sybase website's CodeXchange area or visit the new Foundation Classes home. For the PowerBuilder 11.5 version of the FCs, the new Transaction Object events have also been implemented to capture SQL and timing information.
Changing the Packet Size
I should mention that its is very easy for have your PowerBuilder applications negotiate a different packet size with the DBMS server via its client by just adding the "PacketSize=nnnn" parameter to the DBParm field of the Transaction Object (aka SQLCA). This parameter takes effect on the CONNECT statement and will negotiate the new packet size at that time. For other DBMS vendors, you will have to check your Programmers Manual to see if their respective client software will recognize a network altering packet size parameter. Microsoft's SNC driver for SQLServer and SQLAnywhere's database drivers, for example, do recognize the "PacketSize" DBParm value. Or, to be more specific, the DBParm parameter is only supported for the ASE, DIR, SNC, SYC and ODBC drivers. You would only need to check with your database documentation if you are using the ODBC driver for that vendor. The other PB database drivers (Oracle, Informix, etc.) do not support it, so there is no need to check with the vendor. Those databases may allow setting the packet size at the client layer, but most likely not on an application (or connection) specific basis.
Application / Web Server
You may deploy PowerBuilder objects to Sybase's Enterprise Application Server (aka EAServer) to be exposed as either a CORBA or web service component (see Figure 12). If you have a newer version of PowerBuilder, you may also deploy your component as a .NET web service object housed inside of Microsoft's IIs (Internet Information Server) server. Either way...if these middle tier-based components connect to a DBMS, then right-sizing your network packets here may also benefit the overall performance in these environments as well. In fact, the packet size chosen here may even be more critical as the volume of database transactions can be much greater, considering the amount of requests (sessions) communicating with these components.
Again, even though the PowerBuilder components are housed within a server, the SQL trace, Profiler commands and DataWindow Events pertaining to SQL activity are still relevant here to help you capture performance information. The key is to keep the testing model static and controlled while varying only the packet size to ensure a proper comparison as you progress through the regression test suite.
With all the testing that I did, I was able to easily load the captured timing, CPU, and performance results into a homemade database that I designed and programmed using Sybase's InfoMaker product. InfoMaker easily allowed me to load the database and then design comparison reports. Even create graphs in minutes that clearly show where the performance gains and losses would occur (see Figure 13). Interestingly enough, even small changes to packet size often significantly varied the performance model from two aspects: processor cycles used and packet response time.
The processor time I found was directly related to the packaging and un-packaging of the data into and out of the network packets. By varying the packet size, you can directly influence the overhead on the workstation and the server surrounding this activity. Setting a reasonable packet size can drop the processor overhead from a modest 20% to savings topping over 40%. This drop in overhead was especially noticed on the DBMS server side as the concurrent session load increased. Freeing processing cycles here benefited all applications by facilitating better concurrent activity and faster overall response time at higher load levels.
From the network packet transport layer, I was able to observe modest to significant improvements in performance as the packet sizes changed due to transmission efficiency, data packaging effectiveness, and acknowledgement packet overhead (see Figure 14). These factors can certainly influence the overall throughput in communicating to your DBMS server and back again to the application.
In my specific application and DBMS profiling, the overall optimum size of the network packets was best optimized at the 7168 byte level. Performance gains at this size were very noticeable in the various client / server, service object and web applications compared to the default 512 byte level that these systems were originally using. The next best gains occurred at the 7,168, 3,564 and 2,048 byte levels. However, it is important to note that my database transaction mix was a combination of many small resultset queries combined with sporadic large queries or Blob (Binary Large Object) requests.
I would like to offer the following recommendations, observations and suggestions when setting your network packet size values:
- Do not arbitrarily set the packet sizes too high as it can consume excessive resources on the network and within the DBMS server (for example, continuous high memory demands).
- Find a default network packet size that best suites most application systems and DBMS settings and set that default in your DBMS server.
- Acquire network and DBMS tools that easily allow you to introspect various areas of performance concern.
- Each application should set its own packet size to be negotiated at "Connect" time that best optimizes its database transaction processing characteristics.
- Do not be afraid to experiment with various packet sizes.
- Make sure that your project plans include some performance tuning tasks.
- Make sure you involve your Network Administration and Database Administration groups in your tuning exercises.
- Use the "Law of Diminishing Returns" to only work on changes that return you the best performance gains with the least amount of effort.
- Do not be afraid to contact your DBMS vendor and ask them for suggestions.
- If you have development tools like PowerBuilder, tap into their hidden resource features such as profiling and tracing to get a good picture of what is really happening "under the hood" of your applications.
- Contact your local Sybase user group and ask other members about their performance-tuning experiences and seek their suggestions for incorporation
- Do not be afraid to create two or three SQLCA communication objects and conversations with the DBMS within an application where each connection sets an optimized packet size for the specific transaction mix it anticipates.
In the end, I ended up changing the default Transaction Object (SQLCA) to run with a 2,048 byte packet size for all the small to medium queries - then creating a second Transaction Object set to the 7,168 byte packet size for Blob-related activity (see Figure 15). These changes improved my EAServer environment performance by reducing the CPU load and response times by 21%. I was able to change the client/server based applications as well to standardize on a 4,096 packet size that improved the average transaction response time with the DBMS by 11%.
Taking a little time to include these types of tuning exercises in your SDLC (System Development Life Cycle) or making a focused effort once a year in this area can exponentially provide you with great performance gains and ensure that you meet SLA (Service Level Agreements) with your clientele. Most often it's not one specific change to an application system that affects the best overall performance but a sequence of changes - like the network packet size - that assist in achieving full performance expectations.
I hope that this article has inspired you to look into this area further. I also wanted to help remove some of the mystery about network operations that we often take for granted. Now that you know how you can effect some more control over the network realm, all you need to do is to educate your management as to the possible benefits of this type of endeavor.
I am also glad to report that no packets or animals were harmed in the making of this article.
- Where Are RIA Technologies Headed in 2008?
- PowerBuilder History - How Did It Evolve?
- Creation and Consumption of Web Services with PowerBuilder
- Cloud People: A Who's Who of Cloud Computing
- DDDW Tips and Tricks
- Cloud Expo 2011 East To Attract 10,000 Delegates and 200 Exhibitors
- Working with SOA & Web Services in PowerBuilder
- Dynamically Creating DataWindow Objects
- OLE - Extending the Capabilities of PowerBuilder
- Cloud Expo, Inc. Announces Cloud Expo 2011 New York Venue