Question

How to reduce Java concurrent mode failures and excessive GC

In Java, a concurrent mode failure means that the concurrent collector failed to free enough memory from the tenured and permanent generations, so it gives up and lets a full stop-the-world GC kick in. The end result can be very expensive.

I understand the concept, but I have never had a good, comprehensive understanding of:
A) what can cause a concurrent mode failure, and
B) what the solution is.

This lack of clarity means I write and debug code without much guidance, and I often end up shopping around among performance flags, from Foo to Bar, for no particular reason other than trial and error.

I'd like to learn from the experience of developers here. If you have run into this kind of performance issue, what was the cause and how did you address it?

If you have coding recommendations, please don't be too general. Thanks!

1 Jan 1970

Solution

 24

The first thing I learned about CMS is that it needs more memory than the other collectors; about 25 to 50% more is a good starting point. The extra headroom helps you avoid fragmentation, since CMS does not compact the heap the way the stop-the-world collectors do. Second, do things that help the garbage collector: use Integer.valueOf instead of new Integer, get rid of anonymous classes, make sure inner classes are not accessing private members of the outer class, and so on. The less garbage the better. Running FindBugs and not ignoring warnings will help a lot with this.
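To make the Integer.valueOf point concrete, here is a minimal sketch. The class and method names (BoxingDemo, sameCachedInstance, sumBoxed, sumPrimitive) are made up for illustration. It relies on the documented guarantee that Integer.valueOf caches values from -128 to 127, and it contrasts a boxed accumulator, which may allocate on every iteration, with a primitive one.

```java
final class BoxingDemo {
    // Integer.valueOf is specified to cache values in [-128, 127],
    // so two calls with the same small value return the same object.
    static boolean sameCachedInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison on purpose
    }

    // Boxed accumulator: each += unboxes, adds, and boxes again,
    // which can create a new Long per iteration (garbage for the GC).
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: same result, no boxing at all.
    static long sumPrimitive(int n) {
        long total = 0;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("cached instance shared: " + sameCachedInstance(100));
        System.out.println("boxed sum: " + sumBoxed(1000));
        System.out.println("primitive sum: " + sumPrimitive(1000));
    }
}
```

Both sums give the same answer; the difference only shows up in allocation rate, which is what the young generation has to absorb.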

As far as tuning goes, I have found that you need to try several things:

-XX:+UseConcMarkSweepGC

Tells the JVM to use CMS in the tenured generation.

Fix the size of your heap: -Xmx2048m -Xms2048m. This prevents the GC from having to grow and shrink the heap.

-XX:+UseParNewGC

Use parallel instead of serial collection in the young generation. This will speed up your minor collections, especially if you have a very large young gen configured. A large young generation is generally good, but don't make it more than half the size of the old gen.

-XX:ParallelCMSThreads=X

Sets the number of threads that CMS will use for the work it can do in parallel.

-XX:+CMSParallelRemarkEnabled Remark is serial by default; enabling this can speed it up.

-XX:+CMSIncrementalMode Allows the application to run more by pausing GC between phases

-XX:+CMSIncrementalPacing Allows the JVM to adjust how often it collects over time

-XX:CMSIncrementalDutyCycleMin=X Minimum amount of time spent doing GC

-XX:CMSIncrementalDutyCycle=X Start by doing GC this % of the time

-XX:CMSIncrementalSafetyFactor=X

I have found that you can get generally low pause times if you set it up so that it is basically always collecting. Since most of the work is done in parallel, you end up with basically regular predictable pauses.

-XX:CMSFullGCsBeforeCompaction=1

This one is very important. It tells the CMS collector to always complete the collection before it starts a new one. Without this, you can run into the situation where it throws a bunch of work away and starts again.

-XX:+CMSClassUnloadingEnabled

By default, CMS will let your PermGen grow till it kills your app a few weeks from now. This stops that. Your PermGen would only be growing though if you make use of Reflection, or are misusing String.intern, or doing something bad with a class loader, or a few other things.
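If you suspect class loading is what is filling PermGen, one way to watch it from inside the application is the standard ClassLoadingMXBean. This is only a sketch; the class name ClassLoadStats and the output format are my own choices. A loaded count that keeps climbing while the unloaded count stays at zero would be consistent with the growth described above.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class ClassLoadStats {
    // Returns a one-line summary of class loading activity so far.
    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return String.format("loaded=%d totalLoaded=%d unloaded=%d",
                bean.getLoadedClassCount(),       // classes currently loaded
                bean.getTotalLoadedClassCount(),  // classes loaded since JVM start
                bean.getUnloadedClassCount());    // classes unloaded since JVM start
    }

    public static void main(String[] args) {
        // Log this periodically (for example once a minute) and compare over time.
        System.out.println(snapshot());
    }
}
```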

The survivor ratio and tenuring threshold can also be played with, depending on whether you have long- or short-lived objects and how much object copying between survivor spaces you can live with. If you know all your objects are going to stick around, you can configure zero-sized survivor spaces, and anything that survives one young gen collection will be immediately tenured.
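When trying flags like the ones above, it helps to confirm from inside the process what the JVM actually picked up. The sketch below uses the standard java.lang.management API; the class name GcConfigCheck is hypothetical. On HotSpot builds that ship CMS, I would expect the collector names to come back as something like ParNew and ConcurrentMarkSweep, but the exact strings depend on the JVM, so treat them as an assumption.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcConfigCheck {
    // Names of the collectors the running JVM is using.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Arguments passed to the JVM itself (the -X and -XX flags), not to main().
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("collectors: " + collectorNames());
        System.out.println("jvm flags:  " + jvmFlags());
        // With -Xms equal to -Xmx these two numbers should be close to each other.
        System.out.println("heap committed (bytes): " + rt.totalMemory());
        System.out.println("heap max (bytes):       " + rt.maxMemory());
    }
}
```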

2011-10-19

Solution

 12

Quoted from "Understanding Concurrent Mark Sweep Garbage Collector Logs"

The concurrent mode failure can either be avoided by increasing the tenured generation size or initiating the CMS collection at a lesser heap occupancy by setting CMSInitiatingOccupancyFraction to a lower value

However, if there is really a memory leak in your application, you're just buying time.
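To pick a sensible occupancy threshold, or to tell a leak from a heap that is simply too small, it helps to look at how full the old generation actually gets. Below is a rough sketch using the standard MemoryPoolMXBean API. The class name OldGenOccupancy is hypothetical, and matching pools by the strings "Old Gen" or "Tenured" is an assumption about how the JVM names them. If the after-collection figure keeps rising over time, that points to a leak rather than a tuning problem.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class OldGenOccupancy {
    // Percent of the pool in use, or -1 when the maximum is undefined.
    static double percentUsed(long used, long max) {
        if (max <= 0) {
            return -1.0;
        }
        return 100.0 * used / max;
    }

    static void printOldGen() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: old generation pools are named like "CMS Old Gen" or "Tenured Gen".
            boolean looksLikeOldGen = name.contains("Old Gen") || name.contains("Tenured");
            if (pool.getType() != MemoryType.HEAP || !looksLikeOldGen) {
                continue;
            }
            MemoryUsage now = pool.getUsage();
            if (now != null) {
                System.out.printf("%s now: %.1f%% used%n",
                        name, percentUsed(now.getUsed(), now.getMax()));
            }
            // Usage measured right after the most recent collection; may be null if unsupported.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                System.out.printf("%s after last GC: %.1f%% used%n",
                        name, percentUsed(afterGc.getUsed(), afterGc.getMax()));
            }
        }
    }

    public static void main(String[] args) {
        printOldGen();
    }
}
```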

If you need fast restart and recovery and prefer a 'die fast' approach I would suggest not using CMS at all. I would stick with '-XX:+UseParallelGC'.

From "Garbage Collector Ergonomics"

The parallel garbage collector (UseParallelGC) throws an out-of-memory exception if an excessive amount of time is being spent collecting a small amount of the heap. To avoid this exception, you can increase the size of the heap. You can also set the parameters -XX:GCTimeLimit=time-limit and -XX:GCHeapFreeLimit=space-limit
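To get a feel for how much time is going to GC before choosing a time limit, you can estimate it from the collector beans. This is a sketch only: the class name GcOverhead is made up, and the figure is cumulative since JVM start, which is not the same measurement the JVM makes when it decides to throw the out-of-memory error.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    // GC time as a percentage of uptime; 0 when uptime is not positive.
    static double percentOfUptime(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    // Approximate accumulated collection time across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, percentOfUptime(gc, uptime));
    }
}
```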

2010-05-27