Feature
Enterprise Development
Some thoughts
Jan. 9, 2004 12:00 PM
A new year has begun and new projects will be started, so now is the time to think about what we are doing, what our jobs are, and how we understand enterprise computing.
Many people write computer software and we all call it "software development." Each type of software and development project has complex issues and challenges that need to be solved by project managers and developers. Just look around and talk with some of your friends or people in another department and ask them what they are currently doing and what kind of software they are dealing with. You'll frequently hear the term "enterprise application." You may also have seen the term within the PowerBuilder documentation, or you already own a so-called "PowerBuilder Enterprise" license.
What Is It?
What do I mean by the term "enterprise application"? I can't give you a precise definition, but I can provide some indication of what I mean. Enterprise applications include payroll, patient records, shipping, tracking, cost analysis, credit scoring, insurance, supply chain, accounting, customer service, and foreign exchange trading; they do not include automobile fuel injection, word processors, elevator controllers, chemical plant controllers, telephone switches, operating systems, compilers, and (though I like them a lot) games.
Enterprise applications often have lots of complex data that people need to work on, along with business rules whose reason for (still) being around no one really remembers (you know, the kind of stuff that grows over the years, don't you?). Most people use the term enterprise application to describe a large system. It's important to understand that not all enterprise applications are necessarily large or have a lot of data, even though they might provide a lot of value to the enterprise. Many people assume that small systems aren't worth bothering with (because they aren't large), which can be a major problem within the company: if a small system fails, it usually makes less noise than a big system, but sooner or later it might break one of the bigger (more recognized) systems. This thinking can also infect your whole IT department, which is a pity, because if you can do things that improve small projects, the cumulative effect on the enterprise can be very significant.
Prior to Development
Persistence
Enterprise applications usually deal with persistent data. The data needs to be around between multiple runs of the program, or it needs to be available for several years (for example, on my current customer site we have to be able to generate reports from several years back). During this long time period (for the IT industry, not for a dinosaur) there will be many changes in the programs that use this data. You'll see that the hardware currently in use worked fine in the past, but after a few years is no longer good enough. In addition, your operating systems and compilers become outdated (there may be new techniques available in the interim).
During that time there will be many changes to the structure of the data in order to integrate new business logic and to store new pieces of information. Most of the time this needs to be done without disturbing the old pieces (for example, our reports have to look the same as they did some years ago for legal reasons). If there's a fundamental change and the company installs a completely new application to handle a job, the data has to be migrated to the new application.
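One way to picture a structural change that leaves the old pieces undisturbed is an additive change: a new column with a default value, while the old report keeps naming the columns it has always used. The following is only a minimal sketch using Python's built-in sqlite3 module; the invoice table, its columns, and the run_old_report function are hypothetical names invented for this illustration, not something from a real system.

```python
import sqlite3

def run_old_report(conn):
    # The "old" report lists its columns explicitly, so columns added
    # later do not change what it returns for existing rows.
    return conn.execute(
        "SELECT id, customer, amount FROM invoice ORDER BY id"
    ).fetchall()

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO invoice (customer, amount) VALUES (?, ?)",
    [("ACME", 120.0), ("Globex", 75.5)],
)
report_before = run_old_report(conn)

# Additive change: store a new piece of information. Existing rows
# pick up the default instead of being rewritten.
conn.execute("ALTER TABLE invoice ADD COLUMN currency TEXT DEFAULT 'EUR'")
conn.execute(
    "INSERT INTO invoice (customer, amount, currency) VALUES (?, ?, ?)",
    ("Initech", 10.0, "USD"),
)
report_after = run_old_report(conn)

# Rows that existed before the change look the same in the old report.
old_rows_unchanged = report_after[:len(report_before)] == report_before
old_row_currency = conn.execute(
    "SELECT currency FROM invoice WHERE id = 1"
).fetchone()[0]
```

A change that renames or removes columns is a different story and usually needs a real migration step.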
A Lot of Data
There's usually a lot of data; even a moderate system will have gigabytes of it, maybe organized in different databases and in millions of records, so much that managing it is a major part of the system. Older systems (host) use indexed files; newer systems usually use databases, mostly relational databases (such as Sybase ASE) or a data warehouse (such as Sybase IQ), to store the huge amount of data. Believe me when I say that the design and feeding of these databases has turned into a subprofession of its own.
Concurrency
Usually, numerous people are accessing the data concurrently. For most of the systems out there, this may be fewer than a hundred people, but for distributed systems (intranet or Internet applications) the number increases, and during development you might not know how many users you have to deal with. With so many people, it's important that they all have proper access to the system at the same time. But even without that many people, there are still problems in making sure that two people don't access the same data at the same time in a way that causes errors (you remember the properties for the "Where Clause for Update/Delete" within the DataWindow painter, right?). Transaction manager tools (EAServer has one built in) handle some of this, but often it's impossible to hide this completely from application developers (as an EAServer developer you've seen this already).
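The idea behind checking the original values in the WHERE clause of an update can be sketched outside of PowerBuilder as well. The following is a minimal illustration of that kind of optimistic check using Python's built-in sqlite3 module, not a description of what the DataWindow generates; the customer table, the update_phone function, and the two users are hypothetical.

```python
import sqlite3

def update_phone(conn, cust_id, old_phone, new_phone):
    """Update the row only if it still holds the value the user read."""
    cur = conn.execute(
        "UPDATE customer SET phone = ? WHERE id = ? AND phone = ?",
        (new_phone, cust_id, old_phone),
    )
    conn.commit()
    # rowcount is 0 when someone else changed the row in the meantime.
    return cur.rowcount == 1

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO customer (id, phone) VALUES (1, '555-0100')")
conn.commit()

# Two users retrieve the same row before either of them saves.
seen_by_alice = conn.execute(
    "SELECT phone FROM customer WHERE id = 1"
).fetchone()[0]
seen_by_bob = seen_by_alice

alice_saved = update_phone(conn, 1, seen_by_alice, "555-0111")
# Bob's update still carries the old value, so it matches no row.
bob_saved = update_phone(conn, 1, seen_by_bob, "555-0122")

final_phone = conn.execute(
    "SELECT phone FROM customer WHERE id = 1"
).fetchone()[0]
```

The second save is rejected instead of silently overwriting the first one; the application then has to decide whether to re-read the row, merge the changes, or ask the user.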
User Interface
With so much data, there are usually a lot of user interface screens needed to support your users. It's not unusual within the enterprise to have hundreds of different windows. Take care, as users of enterprise applications have different levels of experience with computer programs, and most of the time they will have little technical expertise. For this reason the data has to be presented in lots of different ways, all of them easily accessible, for different purposes. These systems very often involve batch processing as well, which is easily forgotten when focusing on use cases that stress user interaction.
Integration
Enterprise applications rarely live in isolation. There is a need to integrate with other enterprise applications spread around the enterprise. These different systems were built by different companies or departments at various times (maybe a long time ago) with an assortment of technologies, and even the collaboration mechanisms will not be the same: different file formats, different mailing systems, different platforms, different reporting systems, and so on. The enterprise may try to integrate these existing systems using a common communication technology, but that's not so easy and the job is hardly ever finished. As a result, there are several different "unified" integration schemes in place at once (for a variety of reasons: a single vendor may not be able to cover everything, and frequently there are politics involved). This gets even worse as businesses seek to integrate with their business partners as well (for example, using ebXML).
Business Logic
If you are a developer, business rules are just given to you, and sometimes there is nothing you can do to change them. You have to deal with some strange conditions that often interact with each other in surprising ways. Of course, each of them often exists for a specific reason, and a few thousand of these one-off special cases are what lead to complex business logic. This is what makes our business, the software business, so difficult, but also so interesting. As a developer in this situation, you have to organize the business logic as effectively and as simply as you can, because the only certain thing is that the logic you are implementing right now will change sooner or later.
Performance
Many architectural decisions are about performance. For most performance issues I prefer to get a system up and running, instrument it, and then use a disciplined optimization process based on measurement. However, some architectural decisions affect performance in a way that's difficult to fix with later optimization. Even when it is easy to fix, people involved in the project worry about these decisions early. One problem regarding performance is that many terms are used in an inconsistent way. The most noted victim of this is "scalability," which is regularly used to mean half a dozen different things. Here are the terms I use:
- Response time: The amount of time it takes for the system to process a request from the outside. This may be a user interface action, such as pressing a button, or an API call.
- Responsiveness: How quickly the system acknowledges a request as opposed to processing it. This is important in many systems because users may become frustrated if a system's responsiveness is low, even if its response time is good. If your system waits during the whole request, your responsiveness and response time are the same. However, if you indicate that you've received the request before you finish, then your responsiveness is better. Providing a progress bar during a file copy improves the responsiveness of your user interface, even though it doesn't improve response time.
- Latency: The minimum time required to get any form of response, even if the work to be done is nonexistent. It's usually the big issue in remote systems. If I ask a program to do nothing and to tell me when it's done doing nothing, I should get an almost instantaneous response if the program runs on my laptop. However, if the program runs on a remote computer, it may take a few seconds just because of the time needed for the request and response to make their way across the wire. As an application developer, I can usually do nothing to improve latency. Latency is also the reason why you should minimize remote calls.
- Throughput: How much stuff you can do in a given amount of time. If you're timing the copying of a file, throughput might be measured in bytes per second. For enterprise applications, a typical measure is transactions per second (tps), but the problem is that this depends on the complexity of your transaction. For your particular system, you should pick a common set of transactions.
- Performance: In this terminology, either throughput or response time, whichever matters more to you. It can sometimes be difficult to talk about performance when a technique improves throughput but worsens response time, so it's best to use the more precise term. From a user's perspective, responsiveness may be more important than response time, so improving responsiveness at the cost of response time or throughput will increase performance as the user perceives it.
- Load: Statement of how much stress a system is under, which can be measured in how many users are currently connected to it. The load is usually a context for some other measurement, such as a response time. Thus, you may say that the response time for some request is 0.5 seconds with 10 users, and 2 seconds with 20 users.
- Load sensitivity: An expression of how the response time varies with the load. Let's say that system A has a response time of 0.5 seconds for 10-20 users and system B has a response time of 0.2 seconds for 10 users that rises to 2 seconds for 20 users. In this case system A has a lower load sensitivity than system B. We might also use the term degradation to say that system B degrades more than system A.
- Efficiency: Performance divided by resources. A system that gets 30 tps on two CPUs is more efficient than a system that gets 40 tps on four identical CPUs.
- Capacity of a system: An indication of its maximum effective throughput or load. This might be an absolute maximum or a point at which the performance dips below an acceptable threshold.
- Scalability: A measure of how adding resources (usually hardware) affects performance. A scalable system is one that allows you to add hardware and get a commensurate performance improvement, such as doubling the number of servers you have to double your throughput. Vertical scalability, or scaling up, means adding more power to a single server, such as more memory. Horizontal scalability, or scaling out, means adding more servers.
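Some of these terms can be put into numbers. The following is a small sketch, not a standard set of formulas: the function names are invented for this illustration, load sensitivity is expressed here (as an assumption) as the extra response time per additional user, and the throughput, server, and latency figures in the last two examples are made up. The efficiency and load sensitivity figures are the ones used in the definitions above.

```python
def efficiency(tps, cpus):
    """Performance divided by resources: transactions per second per CPU."""
    return tps / cpus

def load_sensitivity(rt_low, rt_high, users_low, users_high):
    """Extra response time (seconds) per additional user.
    Assumption: a simple slope between two measurements."""
    return (rt_high - rt_low) / (users_high - users_low)

def scaling_factor(tps_before, tps_after, servers_before, servers_after):
    """Throughput gain relative to hardware gain.
    1.0 means the improvement is commensurate with the added hardware."""
    return (tps_after / tps_before) / (servers_after / servers_before)

def remote_time(calls, latency_s, total_work_s):
    """Total time when the same work is split over a number of remote
    calls, each paying the latency once."""
    return calls * latency_s + total_work_s

# Efficiency: 30 tps on two CPUs versus 40 tps on four identical CPUs.
eff_small = efficiency(30, 2)   # 15 tps per CPU
eff_large = efficiency(40, 4)   # 10 tps per CPU

# Load sensitivity: system A stays at 0.5 s from 10 to 20 users,
# system B rises from 0.2 s to 2 s over the same range.
sens_a = load_sensitivity(0.5, 0.5, 10, 20)
sens_b = load_sensitivity(0.2, 2.0, 10, 20)

# Scalability (hypothetical figures): doubling the servers from 2 to 4
# doubles throughput from 100 to 200 tps.
scale = scaling_factor(100, 200, 2, 4)

# Latency (hypothetical figures): 1 second of real work and 50 ms of
# latency per remote call, done as 100 small calls or as one batched call.
chatty = remote_time(100, 0.05, 1.0)
batched = remote_time(1, 0.05, 1.0)
```

With these numbers the two-CPU system is the more efficient one, system B is the more load sensitive one, and the chatty variant spends most of its time on latency rather than on work, which is the reason for minimizing remote calls.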
I would like to give you one tip when building enterprise systems. It often makes sense to build for hardware scalability rather than capacity or even efficiency. Scalability gives you the option of better performance if you need it. Scalability can also be easier to do. Often designers do complicated things to improve the capacity of a particular hardware platform when it might actually be cheaper to buy more hardware. It's fashionable to complain about having to rely on better hardware to make our software run properly, and I join this choir whenever I have to upgrade my laptop just to handle the latest version of Word. But newer hardware is often cheaper than making software run on less powerful systems.
About Berndt Hamboeck
Berndt Hamboeck is a senior consultant for BHITCON (www.bhitcon.net). He's a CSI, SCAPC8, EASAC, SCJP2, and started his Sybase development using PB5. You can reach him at admin@bhitcon.net.