Oh, and one more thing: there is a severe performance penalty for using finalizers.
On my machine, the time to create and destroy a simple object is about 5.6 ns. Adding a finalizer increases the time to 2,400 ns.
In other words, it is about 430 times slower to create and destroy objects with finalizers.
——Effective Java Second Edition
很好奇大佬是如何测量时间的。
创建一个简单对象还比较容易精确测量,销毁一个对象的时间就没那么容易精确测量了。因为JVM屏蔽了底层细节,程序员并不知道它什么时候真正的销毁对象并回收资源。你可以提建议,但什么时候听就看GC心情了
想了一下,大概有这两条路:
实验环境Win8.1:
首先不能依赖finalize()
方法来打卡时间点,没有这个方法的简单对象表示臣妾~做不到啊
后来刚好看到java.lang.ref
包,PhantomReference
代表最接近生命周期终点的phantom reachable,当GC认为referent,也就是我们的简单对象,变成phantom reachable的时候,它就会将该reference添加到queue中。
由于在Java语言里和GC的交互程度非常有限,那么我们就以这个时间点作为销毁对象的打卡时间点。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public class WithoutFinalizationV1 {
public static void main(String[] args) throws Exception {
ReferenceQueue<WithoutFinalizationV1> phantomQueue = new ReferenceQueue<>();
long start = System.nanoTime();
PhantomReference<WithoutFinalizationV1> rf = new PhantomReference<>(
new WithoutFinalizationV1(), phantomQueue);
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
long time = System.nanoTime() - start;
System.out.println("WithoutV1 "+time);
}
}
在首次尝试中,创建和销毁的时间合在一起测量,结果也比较出乎意料,完全就不在一个数量级啊。后面想了想,因为包含了GC可能的磨蹭时间,这个时间也在情理之中。
运行了几次,最好记录是:5248174 ns
下面给类加上finalize()
方法:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
public class WithFinalizationV1 {
private static Object lock = new Object();
private static boolean finalized = false;
@Override
protected void finalize() throws Throwable{
synchronized(lock){
finalized = true;
lock.notify();
}
}
public static void main(String[] args) throws Exception {
ReferenceQueue<WithFinalizationV1> phantomQueue = new ReferenceQueue<>();
long start = System.nanoTime();
PhantomReference<WithFinalizationV1> rf = new PhantomReference<>(
new WithFinalizationV1(), phantomQueue);
System.gc();//advise JVM to do GC
synchronized(lock){
while(!finalized){
lock.wait();
}
}
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
long time = System.nanoTime() - start;
System.out.println("WithV1 "+time);
}
}
需要注意的是,finalize()
的方法体不能为空白,否则就不算finalizable类。
首先简要说明下finalization的流程:当构造finalizable对象的时候,JVM把这个对象包装成java.lang.ref.Finalizer
,这个对象是一个双向链表。GC后,JVM把Finalizer
对象放入它的queue里,如果是第一次调用,FinalizerThread
先把该对象从双向链表里剥离,然后调用它的finalize()
方法。
这个时候,我们的finalizable对象就是漂浮的垃圾了,在下次GC的时候进入到我们的phantomQueue
里。
下次调用gc()
的时机通过wait-notify机制实现,这样就不可避免有额外的开销。最好记录是11789575 ns
,耗时大概是非finalizable版本的两倍。
为了更贴近一点,我们分开测量创建和销毁时间。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public class WithoutFinalizationV2 {
public static void main(String[] args) throws Exception {
long createTime = System.nanoTime();
WithoutFinalizationV2 o = new WithoutFinalizationV2();
createTime = System.nanoTime() - createTime;
ReferenceQueue<WithoutFinalizationV2> phantomQueue = new ReferenceQueue<>();
PhantomReference<WithoutFinalizationV2> rf = new PhantomReference<>(
o, phantomQueue);
o = null;//explicitly clear reference
long destroyTime = System.nanoTime();
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime = System.nanoTime() - destroyTime;
System.out.println("WithoutV2 "
+" "+createTime +" "+ destroyTime);
}
}
这次的结果稍微精确点儿,但也差不太多:5987+5291795=5297782 ns
有finalize()
方法的:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
public class WithFinalizationV2 {
private static Object lock = new Object();
private static boolean finalized = false;
@Override
protected void finalize() throws Throwable{
synchronized(lock){
finalized = true;
lock.notify();
}
}
public static void main(String[] args) throws Exception {
ReferenceQueue<WithFinalizationV2> phantomQueue = new ReferenceQueue<>();
long createTime;
long destroyTime;
long createStart = System.nanoTime();
WithFinalizationV2 o = new WithFinalizationV2();
createTime = System.nanoTime() - createStart;
PhantomReference<WithFinalizationV2> rf = new PhantomReference<>(o, phantomQueue);
o = null;//explicitly clear reference
long destroyStart = System.nanoTime();
System.gc();//advise JVM to do GC
synchronized(lock){
while(!finalized){
lock.wait();
}
}
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime = System.nanoTime() - destroyStart;
System.out.println("WithV2 "
+" "+createTime +" "+ destroyTime);
}
}
最快结果是:6415+12250587=12257002 ns
,是非finalizable版本的2.5倍左右。
在程序上已经做到尽量贴近测量了,要想进一步提高成绩,只有利用JVM的HotSpot来优化性能。
在官方文档只找到了方法调用次数的阈值,Server VM是一万次调用。网上搜了下,OSR也差不多是这个次数。
这个版本采用先批量构造对象,然后挨个销毁,最后计算平均时间。由于需要循环很多次,我们能用多线程吗?我觉得不能,因为一次GC就可能回收多个对象,测量就更不准了。
然后又尝试运行在-Xcomp
模式,不幸的是程序会卡在等phantomQueue
入列的地方,具体原因就不清楚了,可能优化太激进了吧。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public class WithoutFinalizationV3 {
public static void main(String[] args) throws Exception {
int times = 30*1000;
long createTime = 0;
long destroyTime= 0;
WithoutFinalizationV3[] oarray = new WithoutFinalizationV3[times];
for(int i=0; i<times; i++){//On-Stack Replacement
long createStart = System.nanoTime();
WithoutFinalizationV3 o = new WithoutFinalizationV3();
createTime += System.nanoTime() - createStart;
oarray[i] = o;
}
for(int i=0; i<times; i++){//On-Stack Replacement
ReferenceQueue<WithoutFinalizationV3> phantomQueue = new ReferenceQueue<>();
PhantomReference<WithoutFinalizationV3> rf = new PhantomReference<>(
oarray[i], phantomQueue);
oarray[i] = null;//explicitly clear reference
long destroyStart = System.nanoTime();
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime += System.nanoTime() - destroyStart;
//indicator that code is running instead of stuck
System.out.println("WithoutV3 "+i);
}
System.out.println("WithoutV3 createTime="+createTime/times);
System.out.println("WithoutV3 destroyTime="+destroyTime/times);
}
}
60+7094570=7094630 ns
,在大量对象中回收一个确实比较费时。
有finalize()
方法的:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
public class WithFinalizationV3 {
private static Object lock = new Object();
private static boolean finalized = false;
@Override
protected void finalize() throws Throwable{
synchronized(lock){
finalized = true;
lock.notify();
}
}
public static void main(String[] args) throws Exception {
int times = 30*1000;
long createTime = 0;
long destroyTime= 0;
WithFinalizationV3[] oarray = new WithFinalizationV3[times];
for(int i=0; i<times; i++){//On-Stack Replacement
long createStart = System.nanoTime();
WithFinalizationV3 o = new WithFinalizationV3();
createTime += System.nanoTime() - createStart;
oarray[i] = o;
}
for(int i=0; i<times; i++){//On-Stack Replacement
ReferenceQueue<WithFinalizationV3> phantomQueue = new ReferenceQueue<>();
PhantomReference<WithFinalizationV3> rf = new PhantomReference<>(
oarray[i], phantomQueue);
oarray[i] = null;//explicitly clear reference
long destroyStart = System.nanoTime();
System.gc();//advise JVM to do GC
synchronized(lock){
while(!finalized){
lock.wait();
}
}
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime += System.nanoTime() - destroyStart;
finalized = false;
//indicator that code is running instead of stuck
System.out.println("WithV3 "+i);
}
System.out.println("WithV3 createTime="+createTime/times);
System.out.println("WithV3 destroyTime="+destroyTime/times);
}
}
运行这个版本的程序遇到个问题,当循环到一万五六的时候,程序卡在等待入列的地方Line 41
,就好像GC不再把对象放入phantomQueue
一样。
刚开始以为是时序的问题,因为我们能同步main
线程和FinalizerThread
,但无法同步FinalizerThread
和GC线程。如果第二次GC发生在FinalizerThread
做完工作之前,这个对象就不会进入phantomQueue了。
但我在第二次GC之前尝试调用了Thread.yield()
或Thread.sleep(100)
,还是同样的结果。
为什么会这样,难道说持续构建大量finalizable类的对象,GC就不回收了还没想到怎么查,有点儿束手无策了。
这个版本改成创建一个对象销毁一个的方式,避免批量创建过多对象。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
public class WithoutFinalizationV4 {
public static void main(String[] args) throws Exception {
int times = 30*1000;
long createTime = 0;
long destroyTime = 0;
for(int i=0; i<times; i++){//On-Stack Replacement
long createStart = System.nanoTime();
WithoutFinalizationV4 o = new WithoutFinalizationV4();
createTime += System.nanoTime() - createStart;
ReferenceQueue<WithoutFinalizationV4> phantomQueue = new ReferenceQueue<>();
PhantomReference<WithoutFinalizationV4> rf = new PhantomReference<>(
o, phantomQueue);
o = null;//explicitly clear reference
long destroyStart = System.nanoTime();
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime += System.nanoTime() - destroyStart;
//indicator that code is running instead of stuck
System.out.println("WithoutV4 "+i);
}
System.out.println("WithoutV4 createTime="+createTime/times);
System.out.println("WithoutV4 destroyTime="+destroyTime/times);
}
}
这次的成绩相对V2版的提高不少:545+2515831=2516376 ns
有finalize()
方法的:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
public class WithFinalizationV4 {
private static Object lock = new Object();
private static boolean finalized = false;
@Override
protected void finalize() throws Throwable{
synchronized(lock){
finalized = true;
lock.notify();
}
}
public static void main(String[] args) throws Exception {
int times = 30*1000;
long createTime = 0;
long destroyTime= 0;
for(int i=0; i<times; i++){//On-Stack Replacement
long createStart = System.nanoTime();
WithFinalizationV4 o = new WithFinalizationV4();
createTime += System.nanoTime() - createStart;
ReferenceQueue<WithFinalizationV4> phantomQueue = new ReferenceQueue<>();
PhantomReference<WithFinalizationV4> rf = new PhantomReference<>(o, phantomQueue);
o = null;//explicitly clear reference
long destroyStart = System.nanoTime();
System.gc();//advise JVM to do GC
synchronized(lock){
while(!finalized){
lock.wait();
}
}
System.gc();//advise JVM to do GC
phantomQueue.remove();//wait it to enqueue
//only need this time point
destroyTime += System.nanoTime() - destroyStart;
finalized = false;
//indicator that code is running instead of stuck
System.out.println("WithV4 "+i);
}
System.out.println("WithV4 createTime="+createTime/times);
System.out.println("WithV4 destroyTime="+destroyTime/times);
}
}
奇怪的是,这个版本的循环就能跑完结果是:3829+4378470=4382299 ns
甩开一切包袱,加上所有Buff,看能跑多快!以这个版本为蓝本,先注释掉销毁对象的部分,然后运行的时候加-Xcomp
参数,测量了一下创建一个简单对象最快需要多少时间,得到的最好结果是21 ns
finalizable类的实例需要额外的一道手续才能被回收,显然比回收普通类的实例更加耗时。
如果你有更好的办法,请留言赐教