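JMH: An Introduction to the Microbenchmarking Tool
Introduction
JMH (Java Microbenchmark Harness) was developed in 2013 by engineers working on the JIT compiler and was later folded into OpenJDK. A microbenchmark can be understood as a benchmark at the method level. When a hot method needs further optimization, JMH lets you quantify the effect of that optimization.
The official documentation describes JMH as follows: JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targetting the JVM.
In other words, JMH can benchmark any language that runs on the JVM, and it can report results down to the nanosecond level.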
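How to use JMH
- Build a jar with Maven that depends on the code under test, then run it from the command line on a dedicated machine (the officially recommended approach);
- Run the benchmarks inside IDEA.
Officially recommended usage
JMH officially recommends running benchmarks from the command line. A performance test should keep interference from other factors to a minimum (a development machine, for example, runs many other programs) so that the measurements match real-world behavior as closely as possible. The key is step 3: the benchmark jar should run in a dedicated, stable environment that is close to production. The first two steps can be done in an IDE.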
1. Generate a JMH project with Maven:
$ mvn archetype:generate \
-DinteractiveMode=false \
-DarchetypeGroupId=org.openjdk.jmh \
-DarchetypeArtifactId=jmh-java-benchmark-archetype \
-DgroupId=org.sample \
-DartifactId=test \
-Dversion=1.0
2. Build and package:
$ cd test/
$ mvn clean install
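Note: the name of the jar file is determined by the finalName setting of the shade plugin in the pom file. With the configuration shown here the jar is target/microbenchmarks.jar; the archetype's default name is benchmarks, which yields target/benchmarks.jar.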
<configuration>
<finalName>microbenchmarks</finalName>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>org.openjdk.jmh.Main</mainClass>
</transformer>
</transformers>
</configuration>
3. Run the jar file that was built:
$ java -jar target/benchmarks.jar
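Using JMH in IDEA
Accurate results usually require long runs and a stable environment, so the officially recommended approach is better suited to dedicated performance-testing work. This article therefore focuses on running microbenchmarks inside an IDE, which is more convenient for developers, using IDEA as the example:
1. Add the JMH dependencies to the project
JDK 12 added a JMH-based microbenchmark suite to the JDK source tree (JEP 230), but JMH is not shipped as part of the JDK runtime, so application projects still need to declare the following dependencies: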
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>1.21</version>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>1.21</version>
<scope>test</scope>
</dependency>
2. Install the JMH plugin in IDEA
3. Enable annotation processing in IDEA
File --> Settings --> Build execution... --> Compiler --> Annotation Processors --> Enable annotation processing
4. On Windows, configure the system variables in IDEA
5. Write the benchmark code
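import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

public class PSTest {
    @Benchmark
    @Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)      // 3 warmup iterations of 1 s each
    @Fork(1)                                                            // one forked JVM
    @Threads(2)                                                         // two benchmark threads
    @BenchmarkMode(Mode.Throughput)
    @Measurement(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS) // 3 measured iterations of 1 s each
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void TestForeach() {
        PS.foreach(); // the method under test; the PS class is not shown in this article
    }
}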
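6. Annotation reference
@Benchmark: marks the method to be benchmarked.
@Warmup: runs warmup iterations before the actual measurement; it can be applied to a class or a method.
Because of the JVM's JIT compiler, a method that has been called many times is compiled to machine code to speed it up. Warming up first therefore brings the benchmark results closer to steady-state behavior.
In this example, @Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS) warms the code up for 3 seconds in total (3 iterations of 1 second each). Data collected during warmup is excluded from the final statistics.
@Fork: with value n, JMH runs the benchmark in n freshly forked JVM processes, one after another; 0 runs it inside the user's own JVM process without forking.
@Threads: the number of threads each forked process uses to execute the benchmark method; a value above 1 runs the method concurrently. Threads.MAX uses as many threads as the machine has processors, i.e. Runtime.getRuntime().availableProcessors().
The default value is 1.
@BenchmarkMode: can be applied to a class or a method.
Its value is an array of Mode options; modes can be combined, or the value can be set to Mode.All.
Mode is the measurement mode JMH uses when benchmarking. JMH currently provides four modes:
Throughput("thrpt", "Throughput, ops/time"): throughput. Combined with @OutputTimeUnit(TimeUnit.MILLISECONDS), it reports throughput per millisecond (operations per millisecond).
AverageTime("avgt", "Average time, time/op"): the average time per operation. Combined with @OutputTimeUnit(TimeUnit.NANOSECONDS), the unit is ns/op, i.e. the average number of nanoseconds per operation.
SampleTime("sample", "Sampling time"): samples the time of individual calls and reports the distribution of the samples.
SingleShotTime("ss", "Single shot invocation time"): measures a single invocation, for example how long the first initialization takes. The warmup count is usually set to 0 as well, so that cold-start performance is measured.
All: runs every mode above.
@Measurement: configures the measurement iterations.
In this example, @Measurement(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS) runs 3 measured iterations of 1 second each, 3 seconds in total. Within a single iteration, JMH keeps calling the benchmark method repeatedly.
@OutputTimeUnit: the time unit used for the benchmark results.
Its value is one of the standard units in java.util.concurrent.TimeUnit.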
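To make the warmup and measurement settings more concrete, here is a hand-rolled sketch in plain Java of what "3 warmup iterations, then 3 measured iterations of fixed duration" means. It is an illustration only, not how JMH is implemented; the class `IterationSketch` and its methods are hypothetical names, and a naive loop like this lacks the safeguards that make JMH results trustworthy.

```java
import java.util.concurrent.TimeUnit;

/** Illustration only: a naive time-bounded iteration loop (hypothetical, not JMH code). */
final class IterationSketch {

    /** Calls the task repeatedly until the time budget is used up; returns the number of calls. */
    static long runIteration(Runnable task, long time, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(time);
        long ops = 0;
        do {
            task.run();
            ops++;
        } while (System.nanoTime() - deadline < 0); // overflow-safe nanoTime comparison
        return ops;
    }

    /** Runs warmup iterations (results discarded), then returns the op count of each measured iteration. */
    static long[] measure(Runnable task, int warmupIterations, int measuredIterations,
                          long time, TimeUnit unit) {
        for (int i = 0; i < warmupIterations; i++) {
            runIteration(task, time, unit); // gives the JIT a chance to compile the hot path
        }
        long[] opsPerIteration = new long[measuredIterations];
        for (int i = 0; i < measuredIterations; i++) {
            opsPerIteration[i] = runIteration(task, time, unit);
        }
        return opsPerIteration;
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the work observable
        Runnable task = () -> sink[0] += Long.numberOfTrailingZeros(System.nanoTime());
        long[] ops = measure(task, 3, 3, 100, TimeUnit.MILLISECONDS);
        for (int i = 0; i < ops.length; i++) {
            System.out.printf("Iteration %d: %d ops in 100 ms%n", i + 1, ops[i]);
        }
    }
}
```

Note that an iteration always completes at least one call, even if that call takes longer than the time budget.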
Interpreting the results
- Benchmark mode: Throughput, ops/time. The throughput in this example is approximately 10⁻¹¹ ops/ns:
# JMH version: 1.21
# VM version: JDK 1.8.0_202, Java HotSpot(TM) 64-Bit Server VM, 25.202-b08
# VM invoker: C:\Program Files\Java\jdk1.8.0_202\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 1 s each
# Measurement: 3 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.xxx.jmh.PSTest.TestForeach
# Run progress: 0.00% complete, ETA 00:00:18
# Fork: 1 of 1
# Warmup Iteration 1: ≈ 10⁻¹¹ ops/ns
# Warmup Iteration 2: ≈ 10⁻¹¹ ops/ns
# Warmup Iteration 3: ≈ 10⁻¹¹ ops/ns
Iteration 1: ≈ 10⁻¹¹ ops/ns
Iteration 2: ≈ 10⁻¹¹ ops/ns
Iteration 3: ≈ 10⁻¹¹ ops/ns
Result "org.xxx.jmh.PSTest.TestForeach":
≈ 10⁻¹¹ ops/ns
- Benchmark mode: Average time, time/op. The average time in this example is 151065620266.667 ± 53143335382.389 ns/op (99.9% confidence interval), i.e. roughly 151 seconds per operation:
# JMH version: 1.21
# VM version: JDK 1.8.0_202, Java HotSpot(TM) 64-Bit Server VM, 25.202-b08
# VM invoker: C:\Program Files\Java\jdk1.8.0_202\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 1 s each
# Measurement: 3 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.xxx.jmh.PSTest.TestForeach
# Run progress: 33.32% complete, ETA 01:04:38
# Fork: 1 of 1
# Warmup Iteration 1: 136428523750.000 ns/op
# Warmup Iteration 2: 140141726900.000 ns/op
# Warmup Iteration 3: 143394217650.000 ns/op
Iteration 1: 147937076350.000 ns/op
Iteration 2: 153699718350.000 ns/op
Iteration 3: 151560066100.000 ns/op
Result "org.xxx.jmh.PSTest.TestForeach":
151065620266.667 ±(99.9%) 53143335382.389 ns/op [Average]
(min, avg, max) = (147937076350.000, 151065620266.667, 153699718350.000), stdev = 2912965536.462
CI (99.9%): [97922284884.277, 204208955649.056] (assumes normal distribution)
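The Result block above can be recomputed from the three measured iterations. The sketch below assumes the error is a two-sided Student's t interval on the mean, which is consistent with the reported numbers; the critical value 31.599 (99.9%, 2 degrees of freedom) is taken from a standard t table, and the class name `ConfidenceIntervalCheck` is hypothetical.

```java
/** Recomputes mean, sample standard deviation and 99.9% error from the avgt iterations. */
final class ConfidenceIntervalCheck {
    // Measured iterations from the avgt run above, in ns/op.
    static final double[] SCORES = {147937076350.0, 153699718350.0, 151560066100.0};
    // Assumption: two-sided 99.9% Student's t critical value for n - 1 = 2 degrees of freedom.
    static final double T_999_DF2 = 31.599;

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double sampleStdev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) squares += (x - m) * (x - m);
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Half-width of the confidence interval: t * stdev / sqrt(n). */
    static double halfWidth(double[] xs, double t) {
        return t * sampleStdev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        double m = mean(SCORES);
        double hw = halfWidth(SCORES, T_999_DF2);
        System.out.printf("mean  = %.3f ns/op%n", m);
        System.out.printf("stdev = %.3f ns/op%n", sampleStdev(SCORES));
        System.out.printf("CI    = [%.3f, %.3f] ns/op%n", m - hw, m + hw);
    }
}
```

With only three iterations the t value is very large, which is why the interval is so wide relative to the standard deviation; more measurement iterations would tighten it.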
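- Benchmark mode: Sampling time. The mean of the 6 samples is 195286794240.000 ns/op, with a 99.9% confidence interval of ±36811371680.952 ns/op. This ± is the error of the mean, not a percentile; according to the percentile table, 99.9% of the calls completed within 210721832960.000 ns: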
# JMH version: 1.21
# VM version: JDK 1.8.0_202, Java HotSpot(TM) 64-Bit Server VM, 25.202-b08
# VM invoker: C:\Program Files\Java\jdk1.8.0_202\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 1 s each
# Measurement: 3 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Sampling time
# Benchmark: org.xxx.jmh.PSTest.TestForeach
# Run progress: 66.64% complete, ETA 00:34:20
# Fork: 1 of 1
# Warmup Iteration 1: 189515431936.000 ns/op
# Warmup Iteration 2: 185757335552.000 ns/op
# Warmup Iteration 3: 199581761536.000 ns/op
Iteration 1: 207903260672.000 ns/op
TestForeach·p0.00: 205084688384.000 ns/op
TestForeach·p0.50: 207903260672.000 ns/op
TestForeach·p0.90: 210721832960.000 ns/op
TestForeach·p0.95: 210721832960.000 ns/op
TestForeach·p0.99: 210721832960.000 ns/op
TestForeach·p0.999: 210721832960.000 ns/op
TestForeach·p0.9999: 210721832960.000 ns/op
TestForeach·p1.00: 210721832960.000 ns/op
Iteration 2: 196226318336.000 ns/op
TestForeach·p0.00: 188441690112.000 ns/op
TestForeach·p0.50: 196226318336.000 ns/op
TestForeach·p0.90: 204010946560.000 ns/op
TestForeach·p0.95: 204010946560.000 ns/op
TestForeach·p0.99: 204010946560.000 ns/op
TestForeach·p0.999: 204010946560.000 ns/op
TestForeach·p0.9999: 204010946560.000 ns/op
TestForeach·p1.00: 204010946560.000 ns/op
Iteration 3: 181730803712.000 ns/op
TestForeach·p0.00: 177435836416.000 ns/op
TestForeach·p0.50: 181730803712.000 ns/op
TestForeach·p0.90: 186025771008.000 ns/op
TestForeach·p0.95: 186025771008.000 ns/op
TestForeach·p0.99: 186025771008.000 ns/op
TestForeach·p0.999: 186025771008.000 ns/op
TestForeach·p0.9999: 186025771008.000 ns/op
TestForeach·p1.00: 186025771008.000 ns/op
Result "org.xxx.jmh.PSTest.TestForeach":
N = 6
mean = 195286794240.000 ±(99.9%) 36811371680.952 ns/op
Histogram, ns/op:
[170000000000.000, 175000000000.000) = 0
[175000000000.000, 180000000000.000) = 1
[180000000000.000, 185000000000.000) = 0
[185000000000.000, 190000000000.000) = 2
[190000000000.000, 195000000000.000) = 0
[195000000000.000, 200000000000.000) = 0
[200000000000.000, 205000000000.000) = 1
[205000000000.000, 210000000000.000) = 1
[210000000000.000, 215000000000.000) = 1
Percentiles, ns/op:
p(0.0000) = 177435836416.000 ns/op
p(50.0000) = 196226318336.000 ns/op
p(90.0000) = 210721832960.000 ns/op
p(95.0000) = 210721832960.000 ns/op
p(99.0000) = 210721832960.000 ns/op
p(99.9000) = 210721832960.000 ns/op
p(99.9900) = 210721832960.000 ns/op
p(99.9990) = 210721832960.000 ns/op
p(99.9999) = 210721832960.000 ns/op
p(100.0000) = 210721832960.000 ns/op
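The summary figures of the sampling run can be reproduced from the six individual samples, which are the p0.00 and p1.00 values of the three iterations (each iteration collected two samples). The sketch below is only a check of the arithmetic; the class name `SampleStatsCheck` is hypothetical, and taking the median as the midpoint of the two middle values is an assumption that happens to match the reported p(50).

```java
import java.util.Arrays;

/** Recomputes mean, median and maximum of the six sampled call times (ns/op). */
final class SampleStatsCheck {
    static final double[] SAMPLES = {
        205084688384.0, 210721832960.0, // iteration 1: p0.00, p1.00
        188441690112.0, 204010946560.0, // iteration 2: p0.00, p1.00
        177435836416.0, 186025771008.0  // iteration 3: p0.00, p1.00
    };

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Median; for an even count, the midpoint of the two middle values. */
    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    }

    static double max(double[] xs) {
        return Arrays.stream(xs).max().getAsDouble();
    }

    public static void main(String[] args) {
        System.out.printf("mean   = %.3f ns/op%n", mean(SAMPLES));
        System.out.printf("median = %.3f ns/op%n", median(SAMPLES));
        System.out.printf("max    = %.3f ns/op%n", max(SAMPLES));
    }
}
```

With only six samples, every percentile from p(90) upward collapses to the maximum, so the high percentiles in this run carry little information.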
- Benchmark mode: Single shot invocation time
# JMH version: 1.21
# VM version: JDK 1.8.0_202, Java HotSpot(TM) 64-Bit Server VM, 25.202-b08
# VM invoker: C:\Program Files\Java\jdk1.8.0_202\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 1 s each
# Measurement: 3 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 2 threads
# Benchmark mode: Single shot invocation time
# Benchmark: org.xxx.jmh.PSTest.TestForeach
# Run progress: 99.97% complete, ETA 00:00:02
# Fork: 1 of 1
# Warmup Iteration 1: 195781762000.000 ns/op
# Warmup Iteration 2: 182082962950.000 ns/op
# Warmup Iteration 3: 189026715650.000 ns/op
Iteration 1: 193941255250.000 ns/op
Iteration 2: 193118517900.000 ns/op
Iteration 3: 194625430800.000 ns/op
Result "org.xxx.jmh.PSTest.TestForeach":
N = 3
mean = 193895067983.333 ±(99.9%) 13765206951.937 ns/op
Histogram, ns/op:
[193000000000.000, 193125000000.000) = 1
[193125000000.000, 193250000000.000) = 0
[193250000000.000, 193375000000.000) = 0
[193375000000.000, 193500000000.000) = 0
[193500000000.000, 193625000000.000) = 0
[193625000000.000, 193750000000.000) = 0
[193750000000.000, 193875000000.000) = 0
[193875000000.000, 194000000000.000) = 1
[194000000000.000, 194125000000.000) = 0
[194125000000.000, 194250000000.000) = 0
[194250000000.000, 194375000000.000) = 0
[194375000000.000, 194500000000.000) = 0
[194500000000.000, 194625000000.000) = 0
[194625000000.000, 194750000000.000) = 1
[194750000000.000, 194875000000.000) = 0
Percentiles, ns/op:
p(0.0000) = 193118517900.000 ns/op
p(50.0000) = 193941255250.000 ns/op
p(90.0000) = 194625430800.000 ns/op
p(95.0000) = 194625430800.000 ns/op
p(99.0000) = 194625430800.000 ns/op
p(99.9000) = 194625430800.000 ns/op
p(99.9900) = 194625430800.000 ns/op
p(99.9990) = 194625430800.000 ns/op
p(99.9999) = 194625430800.000 ns/op
p(100.0000) = 194625430800.000 ns/op
# Run complete. Total time: 02:14:54
(The closing note below reminds everyone to treat benchmark results with caution.)
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark                                 Mode  Cnt             Score             Error  Units
PSTest.TestForeach                       thrpt    3           ≈ 10⁻¹¹                    ops/ns
PSTest.TestForeach                        avgt    3  151065620266.667 ± 53143335382.389  ns/op
PSTest.TestForeach                      sample    6  195286794240.000 ± 36811371680.952  ns/op
PSTest.TestForeach:TestForeach·p0.00    sample       177435836416.000                    ns/op
PSTest.TestForeach:TestForeach·p0.50    sample       196226318336.000                    ns/op
PSTest.TestForeach:TestForeach·p0.90    sample       210721832960.000                    ns/op
PSTest.TestForeach:TestForeach·p0.95    sample       210721832960.000                    ns/op
PSTest.TestForeach:TestForeach·p0.99    sample       210721832960.000                    ns/op
PSTest.TestForeach:TestForeach·p0.999   sample       210721832960.000                    ns/op
PSTest.TestForeach:TestForeach·p0.9999  sample       210721832960.000                    ns/op
PSTest.TestForeach:TestForeach·p1.00    sample       210721832960.000                    ns/op
PSTest.TestForeach                          ss    3  193895067983.333 ± 13765206951.937  ns/op
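As a rough cross-check of the summary, the thrpt and avgt rows should agree in order of magnitude: if each of the 2 benchmark threads completes one operation every 151065620266.667 ns, the aggregate throughput is about 2 / 1.5×10¹¹ ops/ns. The sketch below does that arithmetic; it assumes throughput is summed across threads, and the two modes were measured in separate runs, so only approximate agreement is expected. The class name `ThroughputCrossCheck` is hypothetical.

```java
/** Cross-checks the thrpt row against the avgt row of the summary table. */
final class ThroughputCrossCheck {

    /** Aggregate throughput if each of `threads` threads finishes one op every avgNsPerOp ns. */
    static double opsPerNs(int threads, double avgNsPerOp) {
        return threads / avgNsPerOp;
    }

    public static void main(String[] args) {
        double avgNsPerOp = 151065620266.667; // avgt score from the summary table
        int threads = 2;                      // from "# Threads: 2 threads"
        double thrpt = opsPerNs(threads, avgNsPerOp);
        System.out.printf("about %.1f s per op, about %.2e ops/ns%n", avgNsPerOp / 1e9, thrpt);
    }
}
```

The result lands in the 10⁻¹¹ ops/ns range, matching the thrpt row. It also shows that ops/ns is a poor unit for an operation that takes minutes; a coarser @OutputTimeUnit such as TimeUnit.SECONDS would give more readable numbers here.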
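Visualizing the results
Post-processing the JMH results into charts makes them easier to read. Specifying the output format at run time is enough to get the data in that format: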
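import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.*;
import org.openjdk.jmh.runner.options.*;

public class JMHSampleDemo {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(JMHSampleDemo.class.getSimpleName()) // run the benchmarks in this class
                .resultFormat(ResultFormatType.JSON)          // write the results as JSON
                .build();
        new Runner(opt).run();
    }
}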
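- Result formats supported by JMH
- TEXT: plain text file
- JSON: JSON file
- CSV: comma-separated values file
- SCSV: semicolon-separated values file
- LATEX: file for LaTeX, a TeX-based typesetting system
- Visualization tools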
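Typical use cases
- Finding out exactly how long a method takes to execute, and how the execution time correlates with the input size n;
- Comparing the throughput of different implementations of an interface under given conditions;
- Seeing what percentage of requests complete within a given time.
Official samples
The JMH project provides many samples to learn from; take a look if you are interested:
Official samples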