Digital (dis)content: Spring conference in Paris, part 2

[Updated on 2009/01/09] Videos are available here.

Part 1 here.

Here's the transcript of Mark Thomas' presentation on Tomcat optimization & performance tuning.

The process

Understand the system architecture
Stabilize the system (no one should be messing with it while investigation is in progress)
Set performance targets
- All requests under 3 seconds?
- This type of request under 3 seconds, this one under 10?
- 90% of requests under 3 seconds, 100% under 10?
Measure current performance
- ab (ApacheBench)
- JMeter
Identify the current bottleneck
Fix the root cause of the bottleneck, not the symptom, e.g. if your system runs out of memory, don't just add extra memory: find where the memory consumption occurs
Repeat the whole process until you meet the performance target
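
As an illustration of the "set targets, then measure" steps, here is a minimal Python sketch that checks a list of measured response times against percentile targets such as "90% of requests under 3 seconds, 100% under 10". The sample timings and the function names are made up for the example; in practice the numbers would come from an ab or JMeter run.

```python
import math

def percentile(samples, pct):
    """Return the nearest-rank percentile (pct in 0-100) of a list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest rank: smallest value with at least pct% of the samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_targets(samples, targets):
    """targets maps a percentile to the response time (in seconds) it must stay under."""
    return all(percentile(samples, pct) < limit for pct, limit in targets.items())

# Hypothetical response times in seconds (not real measurements).
response_times = [0.8, 1.2, 0.9, 2.7, 1.1, 0.7, 2.9, 1.5, 1.0, 6.4]

# "90% of requests under 3 seconds, 100% under 10"
targets = {90: 3.0, 100: 10.0}
print(meets_targets(response_times, targets))
```

Writing the targets down as data makes it easy to rerun the same check after each fix, which is what the "repeat" step calls for.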

Common errors

Optimizing code that doesn't need it
Insufficient testing
- realistic data volumes
- realistic user load
Lack of clear performance targets
Guessing where the bottleneck is
Fixing the symptom rather than the cause

Tuning options

Applications typically account for over 80% of a request's processing time, so look at your application first.
Tomcat's logging is also a good place to start, as its default configuration is too generic
- The catch-all logger writes both to a file and to stdout... but on Linux stdout is itself redirected to a file (catalina.out), so everything is written twice.
- The catch-all file has no overflow protection. It will just grow and grow.
- Logging is synchronous, which shouldn't be a problem for local disks but it does add overhead if logs are written over the network (NAS).
- Solutions:
  - Remove logging to stdout, i.e. remove java.util.logging.ConsoleHandler from the handlers list in the logging.properties file
  - Add log rotation to the catch-all logger
  - Asynchronous logging has not been implemented yet...
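
To make the first solution concrete, here is a small Python sketch that drops java.util.logging.ConsoleHandler from every handlers list in a logging.properties text. The sample content is only modelled on a typical Tomcat file and is not copied from a real installation; the function name is hypothetical, and the sketch assumes each handlers list sits on a single line.

```python
CONSOLE_HANDLER = "java.util.logging.ConsoleHandler"

def remove_console_handler(properties_text):
    """Drop the console handler from every 'handlers' list in the given text."""
    result = []
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        is_comment = line.lstrip().startswith("#")
        # Only touch 'handlers = ...' and '.handlers = ...' style lines.
        if sep and not is_comment and key.strip().endswith("handlers"):
            names = [name.strip() for name in value.split(",")]
            kept = [name for name in names if name and name != CONSOLE_HANDLER]
            line = key.rstrip() + " = " + ", ".join(kept)
        result.append(line)
    return "\n".join(result)

# Illustrative sample, assumed to resemble a default configuration.
sample = "\n".join([
    "handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler",
    ".handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler",
    "java.util.logging.ConsoleHandler.level = FINE",
])
print(remove_console_handler(sample))
```

Editing the file by hand is just as good; the point is that the handler has to disappear from both the global list and the root logger's list.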

Connections

First, you need to understand your application usage patterns
- One request every now and then?
- Short bursts of requests?
- One request every 3 seconds?
TCP/HTTP/SSL connections
- TCP connection setup is expensive, especially over WANs with high latency
  - HTTP keep-alives allow a TCP connection to be kept open and reused for other HTTP requests
- SSL connection setup is very expensive
  - HTTP keep-alives are a must if SSL is heavily used
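
A rough back-of-the-envelope model shows why keep-alives matter. The Python sketch below only counts network round trips spent on connection setup, under simplifying assumptions that are mine and not from the talk: one round trip for the TCP handshake, two more for a full SSL handshake, sequential requests, and no CPU cost for the cryptography.

```python
def setup_time_ms(requests, rtt_ms, ssl=False, keep_alive=True):
    """Estimate the time spent on connection setup alone, in milliseconds."""
    tcp_round_trips = 1                 # assumption: TCP handshake costs one round trip
    ssl_round_trips = 2 if ssl else 0   # assumption: full SSL handshake costs two more
    per_connection = (tcp_round_trips + ssl_round_trips) * rtt_ms
    # With keep-alives a single connection is reused for every request.
    connections = 1 if keep_alive else requests
    return connections * per_connection

# Hypothetical scenario: 20 requests over a WAN link with a 100 ms round trip.
print(setup_time_ms(20, 100, ssl=True, keep_alive=False))  # 6000
print(setup_time_ms(20, 100, ssl=True, keep_alive=True))   # 300
```

Real SSL setup is even more expensive than this suggests because of the key exchange, so the gap in favor of keep-alives is, if anything, understated.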

Connectors

Three connectors are available
- Java Blocking I/O (BIO)
  - oldest one
  - most stable
  - JSSE-based SSL implementation: very slow
- Java Non-Blocking I/O (NIO)
  - JSSE-based SSL implementation: very slow
- Native (APR)
Picking the right connector for a given requirement (from best to worst)
- Stability: BIO, APR, NIO
- SSL: APR, NIO, BIO
- Low concurrency: BIO, APR, NIO
- High concurrency, no keep-alives: BIO, APR, NIO
- High concurrency, keep-alives: APR, NIO, BIO
Why use NIO at all? It never comes first!
- APR is unstable on Solaris
- NIO is a pure Java solution
- Switching between BIO and NIO is straightforward
  - Same configuration files
  - Same certificates

Tuning

maxThreads
- Maximum number of threads servicing HTTP requests
  - For BIO, this value really is the maximum number of concurrent client requests
- Typical value: 200-800
- 400 is a good starting point, which may be adjusted depending on CPU load
maxKeepAliveRequests
- Maximum number of HTTP requests served over a single TCP connection before it is closed
  - 1 : no keep-alives
  - Typical value is 100
connectionTimeout
- Typical value: 3000 ms
- Also used for keep-alive timeout
- Increase for:
  - slow clients, such as mobile phones
  - layer 7 load balancer with keep-alives
- Decrease for faster timeouts (!)
Content cache
- Set on the Context element
  - cacheMaxSize: typical value is 10240 KB
  - cacheTTL: typical value is 5 s
- NIO/APR can use sendfile for large static files
  - OS “bypass”
  - File sent by a different thread, which doesn't tie up an HTTP request processing thread
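
To show where maxThreads, maxKeepAliveRequests and connectionTimeout end up, here is a Python sketch that builds a Connector element carrying the starting values mentioned above. The port (8080) and protocol ("HTTP/1.1") are assumptions for the example, as is the helper name; the result is meant as a starting point to paste into server.xml and adjust, not as a recommended configuration.

```python
import xml.etree.ElementTree as ET

def connector_element(port, max_threads=400, max_keep_alive_requests=100,
                      connection_timeout_ms=3000):
    """Build a Connector element as an XML string (attribute values are strings)."""
    connector = ET.Element("Connector", {
        "port": str(port),            # assumption: example port
        "protocol": "HTTP/1.1",       # assumption: example protocol value
        "maxThreads": str(max_threads),
        "maxKeepAliveRequests": str(max_keep_alive_requests),
        "connectionTimeout": str(connection_timeout_ms),
    })
    return ET.tostring(connector, encoding="unicode")

print(connector_element(8080))
# A variant for slow clients such as mobile phones: longer timeout.
print(connector_element(8080, connection_timeout_ms=20000))
```

Keeping the values in one place like this also makes it easier to change a single setting between test runs and measure its effect.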

JVM tuning

Memory
- -Xms / -Xmx flags
  - Used to define the size of the Java heap
  - Aim to set as low as possible
  - Setting the heap too high wastes memory and can cause long GC pauses
- -XX:NewSize / -XX:NewRatio
  - Set the new generation to 25-33% of the total Java heap
  - Setting the ratio too high/low leads to inefficient GC
Garbage collection
- GC pauses the application
- From milliseconds to seconds
- -XX:MaxGCPauseMillis / -XX:MaxGCMinorPauseMillis
  - these set goals, which are not guaranteed
  - they lead to more frequent, shorter pauses
More on blogs.sun.com/watt/resource/jvm-options-list.html
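
The 25-33% guideline for the new generation translates directly into NewRatio values, since NewRatio is the ratio of the old generation to the new one. A short Python sketch of the arithmetic, with a made-up 1024 MB heap and a hypothetical helper name:

```python
def new_generation_mb(heap_mb, new_ratio):
    """With -XX:NewRatio=N the old generation is N times the new one,
    so the new generation gets 1/(N+1) of the heap."""
    return heap_mb / (new_ratio + 1)

heap_mb = 1024  # hypothetical heap size, not a recommendation

for new_ratio in (3, 2):
    size = new_generation_mb(heap_mb, new_ratio)
    share = 100 * size / heap_mb
    print(f"-Xmx{heap_mb}m -XX:NewRatio={new_ratio} -> new generation {size:.0f} MB ({share:.0f}% of heap)")
```

So NewRatio=3 gives the 25% end of the range and NewRatio=2 the 33% end.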

Load balancing

Basic configuration: 1 httpd, 2 Tomcat instances, mod_proxy_http or mod_jk
Stateless requests
- they are routed purely according to load balancing algorithm
- this doesn't allow HTTP sessions
HTTP sessions
- sticky sessions must be set up, i.e. all HTTP requests from a single session must go to the same Tomcat instance
- this is performed by appending a session cookie to the HTTP requests, which will be tracked by the load balancer
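
Here is a minimal Python sketch of the sticky routing idea. It assumes the session cookie ends with the name of the Tomcat instance after a dot (e.g. "A1B2C3.tomcat2"), which is similar in spirit to what mod_jk does with Tomcat's jvmRoute; the class, the instance names and the round-robin fallback are illustrative only.

```python
import itertools

class StickyBalancer:
    """Route requests with a known session to their instance, others round-robin."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._round_robin = itertools.cycle(self.instances)

    def route(self, session_cookie=None):
        """Return the name of the instance that should handle the request."""
        if session_cookie:
            # Assumption: the instance name is the part after the last dot.
            _, _, instance = session_cookie.rpartition(".")
            if instance in self.instances:
                return instance
        # Stateless request, or a session pointing at an unknown instance.
        return next(self._round_robin)

balancer = StickyBalancer(["tomcat1", "tomcat2"])  # hypothetical instance names
print(balancer.route())                    # no session yet: plain load balancing
print(balancer.route("A1B2C3.tomcat2"))    # existing session: always tomcat2
```

Note that if tomcat2 goes down its sessions are lost unless they are replicated elsewhere, which is where failover comes in.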

Failover

Session replication between Tomcat instances can be added through clustering
Replication is asynchronous by default (done once the answer has been sent back to the client)
Single-line configuration by default
Additional configuration needed for real-life production
- Automatic discovery of new Tomcat instances (through IP multicast)
- Synchronous replication (watch out for the performance impact)
- Session replication to a specific node (instead of all nodes)

Hints & tips

Use a minimum of 3 Tomcat instances
Test your application with load balancing and clustering before going to production (it may not behave the same)
Redeployment can cause memory leaks
- Include it in your testing
- Safer option: do a stop/start upgrade on each individual Tomcat instance in your cluster


Nov 13, 2008