What are the most common performance and scalability problems for a J2EE (Java EE) Web application? Here are the most common tips and problems found in real production systems after working years with clients in trouble shooting their application systems.
1. Bad Caching Strategy
It is rare that users require absolutely real time information. Simply refreshing HTML content with a 60 second cache can already dramatically reduce the load to the application server and most important the DB for a high traffic web site. Cache HTML segment for the home page and most visited pages. Implement other caching strategy in the business service layer or the DB layer. For example, use Spring AOP to cache data returned from a business service or configure hibernate to cache DB query result.
2. Missing DB indexes
After a new code push, indexes may be missing for the new SQL codes. The data query may be slow if the table is huge and the missing index forces a full table scan. Most development DB has a very small data set and therefore the problem is un-detected. Check the DB log or profile in production for long executed SQLs and add index if needed.
3. Bad SQLs
The second most common DB performance problem is bad SQLs. Check the DB log or profile for long executed query. Most problems can be resolved by re-written the SQLs. Paid attentions to sub-query or SQLs with complicated joins. Occasionally, DB table tuning may be required.
4. Too many fine grain calls to the service, data or the DB layer
Developers may use an iteration loop in retrieving a list of data. Each iteration may make a middle tier call which results in multiple SQL calls. If the list is long, the total DB requests can be huge. Developers should write a new service call and retrieve the list in a single DB call.
5. All application server threads are waiting for the DB or external system connection
Web server has a limited number of threads. When a HTTP request is processed, a thread will exclusively dedicate to a request until it is completed. Hence, if an external system like DB is very slow, all web server threads may be waiting. When this happens, the web server will pause all new incoming requests. From a end user perspective, the system seems not responding. Add timeout logic when communicate with external system. Increasing the thread counts will only delay the problem and in some cases counter productive.
6. SQLs retrieve too many rows of data
Do not retrive hundreds row of data to just display a few of them. Check the DB log or profile constantly for un-expected high usage of SQLs that retrieve a lot of rows .
7. Do not use prepared statement for the DB
Always use prepared statement to avoid DB side SQL hard parsing. SQL hard parsing causes a lot of DB scalability problem when DB requests increases.
8. Lack or improper pagination of data
Implement pagination to display a long list of data. Do not retrieve all the data from the database and use the Java code to filter out the data. Always use the database for data filtering and pagination.
9. Non-optimize connection pool configuration
The maximum/minimum pool size and the retaining policy of idling pool thread can significant impact an application performance. The web server will be idle waiting for a DB connection if the pool size is too low. The retaining policy is important since most DB pool creation code has very low concurrency and cannot handle a sudden surge of concurrent requests.
10. Frequent garbage collection caused by memory leak
When memory is leaking, the Java JVM will perform frequent garbage collection (GC) even they cannot reclaim too many memory. Eventually, the web server spend most of the time executing the GC rather than processing HTTP requests. Rebooting the server can temporarily release the problem but only stopping the leak can solve the problem.
11. Do not process large amount of data at once
For request involving large amount of data, in particular batch process, sub-divide the large data set into chunk and process it separately. Otherwise, the request may deplete the Java heap or stack memory and crashes the JVM.
12. Concurrency problems in the synchronization block
Code synchronization block carefully. Use established library to manage system and application resources like DB connection pool. For system with concurrency problem, the CPU utilization remains low even significantly increase the traffic.
13. Bad DB tuning
If DB response is slow regardless of SQLs, DB instance tuning is needed. Monitor the memory paging activity closely in identifying any memory mis-configuration. Also monitor the file I/O wait time and DB memory usage closely.
14. Process data in batch
To reduce DB requests, combine DB requests together and process those in a single batch. Use SQL batch if necessary instead of large volume of small SQL requests.
15. JMS or application deadlock
Avoid a cyclic loop in making JMS requests. A request may send to Queue A which then send a message to Queue B and then again to Queue A. This circle loop will trigger deadlock in high volume requests.
16. Bad Java heap configuration
Configure the maximum heap size, the minimum heap size, the young generation heap and the garbage collection algorithm correctly. The bigger is not the better and it is often depends on the application.
17. Bad application server thread configuration
Too high of a thread count triggers high context switching overhead while low thread count causes low concurrency. Tuning it according to the application needs and behavior. Configure the connection pool thread count according to the amount of thread count.
18. Internal bugs in the third party libraries or the application server
If new third party libraries are added to the application, monitor any concurrency and memory leak issue closely.
19. Out of file descriptors
If the application does not close file or network resources correctly in particular within exception handling, the application may ran out of file descriptors and stop processing new requests.
20. Infinite loop in the application code
An iteration loop may run into an infinite loop and trigger high CPU utilization. It can be data sensitive and happen to a small set of traffic. If the CPU utilization remains high during low traffic time, monitor the thread closely.
21. Wrong firewall configuration
Some firewall configuration limits the amount of concurrent access from a single IP. This can be problematic if a web server is connected to another DB server through a firewall.
Verify the firewall configuration if the application achieves much higher concurrency if tested within in a local network.
22. Bad TCP tuning
In-proper TCP tuning causes un-resonable high amount of socket waiting to be closed (TIME_WAIT). New version of OS is usually tuned correctly for Web server. Make changes to the default TCP tuning parameters only if needed. Direct TCP programming may sometimes need special programming parameters for short but frequent TCP messages.