m5a instance
I had forgotten to capture cpuinfo on the m5a instance. 2.5GHz.
code:/proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD EPYC 7571
stepping : 2
microcode : 0x8001227
cpu MHz : 2585.623
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save
bugs : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips : 4399.99
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:
cpuinfo of the m5 instance
code:/proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
stepping : 4
microcode : 0x2000043
cpu MHz : 3105.473
cache size : 33792 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips : 5000.00
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
pyperformance comparison (m5a.xlarge vs m5.xlarge)
code:shell
$ .local/bin/pyperf compare_to --table m5a-1.json m5-1.json
+-------------------------+---------+------------------------------+
| Benchmark | m5a-1 | m5-1 |
+=========================+=========+==============================+
| 2to3 | 559 ms | 431 ms: 1.30x faster (-23%) |
+-------------------------+---------+------------------------------+
| chameleon | 18.7 ms | 12.4 ms: 1.51x faster (-34%) |
+-------------------------+---------+------------------------------+
| chaos | 206 ms | 143 ms: 1.44x faster (-31%) |
+-------------------------+---------+------------------------------+
| crypto_pyaes | 182 ms | 137 ms: 1.33x faster (-25%) |
+-------------------------+---------+------------------------------+
| deltablue | 12.5 ms | 9.74 ms: 1.28x faster (-22%) |
+-------------------------+---------+------------------------------+
| django_template | 261 ms | 170 ms: 1.53x faster (-35%) |
+-------------------------+---------+------------------------------+
| dulwich_log | 129 ms | 87.5 ms: 1.47x faster (-32%) |
+-------------------------+---------+------------------------------+
| fannkuch | 805 ms | 598 ms: 1.35x faster (-26%) |
+-------------------------+---------+------------------------------+
| float | 183 ms | 143 ms: 1.28x faster (-22%) |
+-------------------------+---------+------------------------------+
| genshi_text | 59.7 ms | 40.3 ms: 1.48x faster (-32%) |
+-------------------------+---------+------------------------------+
| genshi_xml | 131 ms | 83.5 ms: 1.56x faster (-36%) |
+-------------------------+---------+------------------------------+
| go | 457 ms | 324 ms: 1.41x faster (-29%) |
+-------------------------+---------+------------------------------+
| hexiom | 17.7 ms | 13.3 ms: 1.33x faster (-25%) |
+-------------------------+---------+------------------------------+
| html5lib | 180 ms | 120 ms: 1.50x faster (-33%) |
+-------------------------+---------+------------------------------+
| json_dumps | 21.1 ms | 16.0 ms: 1.32x faster (-24%) |
+-------------------------+---------+------------------------------+
| json_loads | 46.2 us | 32.1 us: 1.44x faster (-31%) |
+-------------------------+---------+------------------------------+
| logging_format | 20.5 us | 13.2 us: 1.55x faster (-36%) |
+-------------------------+---------+------------------------------+
| logging_silent | 554 ns | 422 ns: 1.31x faster (-24%) |
+-------------------------+---------+------------------------------+
| logging_simple | 17.4 us | 11.3 us: 1.54x faster (-35%) |
+-------------------------+---------+------------------------------+
| mako | 31.0 ms | 24.3 ms: 1.28x faster (-22%) |
+-------------------------+---------+------------------------------+
| meteor_contest | 165 ms | 131 ms: 1.26x faster (-20%) |
+-------------------------+---------+------------------------------+
| nbody | 209 ms | 159 ms: 1.31x faster (-24%) |
+-------------------------+---------+------------------------------+
| nqueens | 172 ms | 125 ms: 1.37x faster (-27%) |
+-------------------------+---------+------------------------------+
| pathlib | 31.5 ms | 26.0 ms: 1.21x faster (-17%) |
+-------------------------+---------+------------------------------+
| pickle | 17.1 us | 11.6 us: 1.47x faster (-32%) |
+-------------------------+---------+------------------------------+
| pickle_dict | 42.6 us | 32.8 us: 1.30x faster (-23%) |
+-------------------------+---------+------------------------------+
| pickle_list | 5.00 us | 4.18 us: 1.20x faster (-16%) |
+-------------------------+---------+------------------------------+
| pickle_pure_python | 953 us | 627 us: 1.52x faster (-34%) |
+-------------------------+---------+------------------------------+
| pidigits | 239 ms | 226 ms: 1.06x faster (-5%) |
+-------------------------+---------+------------------------------+
| python_startup | 13.9 ms | 11.9 ms: 1.17x faster (-15%) |
+-------------------------+---------+------------------------------+
| python_startup_no_site | 8.47 ms | 7.38 ms: 1.15x faster (-13%) |
+-------------------------+---------+------------------------------+
| raytrace | 963 ms | 654 ms: 1.47x faster (-32%) |
+-------------------------+---------+------------------------------+
| regex_compile | 314 ms | 233 ms: 1.35x faster (-26%) |
+-------------------------+---------+------------------------------+
| regex_dna | 278 ms | 220 ms: 1.26x faster (-21%) |
+-------------------------+---------+------------------------------+
| regex_effbot | 4.33 ms | 3.73 ms: 1.16x faster (-14%) |
+-------------------------+---------+------------------------------+
| regex_v8 | 37.0 ms | 29.5 ms: 1.26x faster (-20%) |
+-------------------------+---------+------------------------------+
| richards | 138 ms | 95.4 ms: 1.45x faster (-31%) |
+-------------------------+---------+------------------------------+
| scimark_fft | 593 ms | 406 ms: 1.46x faster (-31%) |
+-------------------------+---------+------------------------------+
| scimark_lu | 328 ms | 281 ms: 1.16x faster (-14%) |
+-------------------------+---------+------------------------------+
| scimark_monte_carlo | 192 ms | 133 ms: 1.45x faster (-31%) |
+-------------------------+---------+------------------------------+
| scimark_sor | 386 ms | 282 ms: 1.37x faster (-27%) |
+-------------------------+---------+------------------------------+
| scimark_sparse_mat_mult | 6.18 ms | 4.74 ms: 1.30x faster (-23%) |
+-------------------------+---------+------------------------------+
| spectral_norm | 232 ms | 174 ms: 1.33x faster (-25%) |
+-------------------------+---------+------------------------------+
| sqlalchemy_declarative | 269 ms | 218 ms: 1.24x faster (-19%) |
+-------------------------+---------+------------------------------+
| sqlalchemy_imperative | 55.4 ms | 37.8 ms: 1.47x faster (-32%) |
+-------------------------+---------+------------------------------+
| sqlite_synth | 5.27 us | 3.67 us: 1.44x faster (-30%) |
+-------------------------+---------+------------------------------+
| sympy_expand | 809 ms | 595 ms: 1.36x faster (-26%) |
+-------------------------+---------+------------------------------+
| sympy_integrate | 35.5 ms | 26.8 ms: 1.33x faster (-25%) |
+-------------------------+---------+------------------------------+
| sympy_sum | 173 ms | 135 ms: 1.28x faster (-22%) |
+-------------------------+---------+------------------------------+
| sympy_str | 367 ms | 266 ms: 1.38x faster (-27%) |
+-------------------------+---------+------------------------------+
| telco | 12.9 ms | 8.99 ms: 1.44x faster (-30%) |
+-------------------------+---------+------------------------------+
| tornado_http | 300 ms | 228 ms: 1.32x faster (-24%) |
+-------------------------+---------+------------------------------+
| unpack_sequence | 71.8 ns | 66.1 ns: 1.09x faster (-8%) |
+-------------------------+---------+------------------------------+
| unpickle | 22.6 us | 16.1 us: 1.40x faster (-29%) |
+-------------------------+---------+------------------------------+
| unpickle_list | 5.82 us | 4.60 us: 1.27x faster (-21%) |
+-------------------------+---------+------------------------------+
| unpickle_pure_python | 662 us | 454 us: 1.46x faster (-31%) |
+-------------------------+---------+------------------------------+
| xml_etree_parse | 288 ms | 197 ms: 1.46x faster (-32%) |
+-------------------------+---------+------------------------------+
| xml_etree_iterparse | 186 ms | 140 ms: 1.32x faster (-24%) |
+-------------------------+---------+------------------------------+
| xml_etree_generate | 175 ms | 131 ms: 1.34x faster (-25%) |
+-------------------------+---------+------------------------------+
| xml_etree_process | 144 ms | 107 ms: 1.35x faster (-26%) |
+-------------------------+---------+------------------------------+
m5 is faster.
My guess is that m5a is running at 2.5GHz without turbo boost, while m5 is running at about 3GHz with turbo boost.
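As a rough sanity check on that guess, the sketch below compares the assumed clock ratio with a few speedup factors copied from the table. The 2.5GHz and 3.0GHz figures are the guess itself, not measured values, and the six rows are just a hand-picked sample.

```python
# Speedup factors copied from a few rows of the pyperf table (m5 relative to m5a).
speedups = {
    "pidigits": 1.06,
    "unpack_sequence": 1.09,
    "2to3": 1.30,
    "nbody": 1.31,
    "django_template": 1.53,
    "genshi_xml": 1.56,
}

# Assumed clocks taken from the guess: 2.5 GHz (m5a) and 3.0 GHz (m5).
clock_ratio = 3.0 / 2.5

# Benchmarks whose speedup is larger than the clock ratio alone would explain.
above = [name for name, factor in speedups.items() if factor > clock_ratio]

print(f"clock ratio: {clock_ratio:.2f}")
print("above the clock ratio:", ", ".join(above))
```

If the guess about the clocks is right, the rows above the ratio suggest that something other than frequency (cache, memory latency, IPC) also contributes, though this sample is too small to say more.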
code:txt
$ openssl speed -engine aesni -evp aes-256-cbc
m5
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 622584.13k 828964.54k 847607.38k 851930.79k 853221.38k 852961.96k
m5a
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 558660.69k 718688.62k 773872.04k 788498.77k 792685.23k 792974.68k
The AES speed difference is not as large as the clock difference?
Checking /proc/cpuinfo repeatedly on m5a, the clock fluctuates between roughly 2.5GHz and 2.8GHz. So is boost working after all?
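To put numbers on this, here is a small sketch using the 16384-byte column of the openssl output and the `cpu MHz` values from the two cpuinfo dumps. Each cpuinfo value is a single snapshot, so the clock ratio is only indicative.

```python
# openssl speed results, 16384-byte column (unit: 1000 bytes per second).
aes_m5 = 852961.96
aes_m5a = 792974.68

# "cpu MHz" from the two /proc/cpuinfo dumps (one snapshot each).
mhz_m5 = 3105.473
mhz_m5a = 2585.623

aes_ratio = aes_m5 / aes_m5a
clock_ratio = mhz_m5 / mhz_m5a

print(f"AES throughput ratio (m5 / m5a): {aes_ratio:.2f}")
print(f"clock ratio (m5 / m5a): {clock_ratio:.2f}")
```

With these inputs the AES throughput ratio comes out near 1.08 and the clock ratio near 1.20.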
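Instead of rereading /proc/cpuinfo by hand, the fluctuation could be sampled with something like the sketch below. The function names, sample count, and interval are arbitrary choices for illustration, and it assumes a Linux x86 host where cpuinfo has `cpu MHz` lines.

```python
import time


def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of every processor entry as a list of floats."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            values.append(float(value))
    return values


def sample_cpu_mhz(samples=5, interval=0.2, path="/proc/cpuinfo"):
    """Read cpuinfo several times and return every MHz reading that was seen."""
    readings = []
    for _ in range(samples):
        with open(path) as f:
            readings.extend(parse_cpu_mhz(f.read()))
        time.sleep(interval)
    return readings


if __name__ == "__main__":
    try:
        readings = sample_cpu_mhz()
    except OSError:
        readings = []  # no /proc/cpuinfo on this system
    if readings:
        print(f"cpu MHz range: {min(readings):.0f} - {max(readings):.0f}")
    else:
        print("no 'cpu MHz' readings available")
```

Running it while a benchmark is active would show whether the clock stays above 2.5GHz under load, which is closer to what matters for the comparison than an idle reading.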