Summary: This patch adds a benchmarking infrastructure for llvm-libc memory functions. In a nutshell, the code can benchmark small and large buffers for the `memcpy`, `memset` and `memcmp` functions. It also produces graphs of size vs latency by running targets of the form `render-libc-{memcpy|memset|memcmp}-benchmark-{small|big}`. The configurations are provided as JSON files and the benchmark also produces a JSON file. This file is then parsed and rendered as a PNG file via the `render.py` script (make sure to run `pip3 install matplotlib scipy numpy`). The script can take several JSON files as input and will superimpose the curves if they are from the same host.

TODO:
- The code benchmarks whatever is available on the host but should be configured to benchmark the (to be added) llvm-libc memory functions.
- Add a README file with instructions and rationale.
- Produce scores to track the performance of the functions over time to allow for regression detection.

Reviewers: sivachandra, ckennelly

Subscribers: mgorny, MaskRay, libc-commits

Tags: #libc-project

Differential Revision: https://reviews.llvm.org/D72516
Files in `libc/utils/benchmarks`:

- CMakeLists.txt
- configuration_big.json
- configuration_small.json
- JSON.cpp
- JSON.h
- JSONTest.cpp
- LibcBenchmark.cpp
- LibcBenchmark.h
- LibcBenchmarkTest.cpp
- LibcMemoryBenchmark.cpp
- LibcMemoryBenchmark.h
- LibcMemoryBenchmarkMain.cpp
- LibcMemoryBenchmarkMain.h
- LibcMemoryBenchmarkTest.cpp
- Memcmp.cpp
- Memcpy.cpp
- Memset.cpp
- RATIONALE.md
- README.md
- render.py3
# Libc mem* benchmarks
This framework has been designed to evaluate and compare the relative performance of memory function implementations on a particular host.

It will also be used to track the performance of these implementations over time.
## Quick start

### Setup
Python 2 is deprecated, so it is advised to use Python 3.

Make sure that `matplotlib`, `scipy` and `numpy` are set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```
To get good reproducibility it is important to make sure that the system runs in `performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```
### Run and display the `memcpy` benchmark
The following commands will run the benchmark and display a curve with a 95% confidence interval of the time per copied byte. The graph also shows host information and the benchmarking configuration.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```
## Benchmarking regimes
Using a profiler to observe size distributions for calls into libc functions, it was found that most operations act on a small number of bytes.
| Function  | % of calls with size ≤ 128 | % of calls with size ≤ 1024 |
|-----------|----------------------------|-----------------------------|
| `memcpy`  | 96%                        | 99%                         |
| `memset`  | 91%                        | 99.9%                       |
| `memcmp`¹ | 99.5%                      | ~100%                       |
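As a hedged illustration, percentages like the ones in the table can be derived from a list of profiled call sizes. The helper and the sample data below are made up for this example; they are not the actual profiler output behind the table.

```python
def cumulative_fraction(sizes, threshold):
    """Fraction of profiled calls whose size is <= threshold."""
    return sum(1 for s in sizes if s <= threshold) / len(sizes)

# Toy data, NOT the real profile used for the table above.
memcpy_sizes = [8, 16, 24, 32, 64, 100, 128, 256, 512, 2048]
print(cumulative_fraction(memcpy_sizes, 128))   # 0.7 on this toy data
print(cumulative_fraction(memcpy_sizes, 1024))  # 0.9 on this toy data
```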
Benchmarking configurations come in two flavors:

- **small**
  - Exercises sizes up to 1KiB, representative of normal usage
  - The data is kept in the L1 cache to prevent measuring the memory subsystem
- **big**
  - Exercises sizes up to 32MiB to test large operations
  - Caching effects can show up here which prevents comparing different hosts
¹ The size refers to the size of the buffers to compare and not the number of bytes until the first difference.
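The cache reasoning behind the two flavors can be sanity-checked with a quick back-of-the-envelope computation. The 32 KiB L1 data cache size below is an assumption about a typical x86 core, not something the framework guarantees:

```python
# Back-of-the-envelope check of the "small" regime rationale: the whole
# working set fits in a typical L1 data cache, so the memory subsystem
# is not being measured. "big" deliberately exceeds it.
L1_DATA_CACHE_BYTES = 32 * 1024   # typical per-core L1d size (assumption)
SMALL_MAX_BYTES = 1 * 1024        # "small" configuration upper bound
BIG_MAX_BYTES = 32 * 1024 * 1024  # "big" configuration upper bound

fits_in_l1 = SMALL_MAX_BYTES <= L1_DATA_CACHE_BYTES  # True: stays in cache
exceeds_l1 = BIG_MAX_BYTES > L1_DATA_CACHE_BYTES     # True: spills to memory
```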
## Benchmarking targets
The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file
Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

- `action` is one of:
  - `run`, runs the benchmark and writes the `json` file
  - `display`, displays the graph on screen
  - `render`, renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`
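The naming scheme above can be sketched as a small helper. `target_name` is a hypothetical function written for illustration only; it is not part of the framework:

```python
# Hypothetical helper that assembles target names following the
# <action>-libc-<function>-benchmark-<configuration> scheme.
ACTIONS = {"run", "display", "render"}
FUNCTIONS = {"memcpy", "memcmp", "memset"}
CONFIGURATIONS = {"small", "big"}

def target_name(action, function, configuration):
    """Build a benchmark target name, rejecting unknown components."""
    if (action not in ACTIONS or function not in FUNCTIONS
            or configuration not in CONFIGURATIONS):
        raise ValueError("unknown target component")
    return f"{action}-libc-{function}-benchmark-{configuration}"

print(target_name("display", "memcpy", "small"))
# display-libc-memcpy-benchmark-small
```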
## Superposing curves
It is possible to merge several `json` files into a single graph. This is useful to compare implementations.

In the following example we superpose the curves for `memcpy`, `memset` and `memcmp`:
```shell
> make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
> python libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```
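Curves are only superimposed when the input files come from the same host. A minimal sketch of such a check, assuming each benchmark file carries a top-level `"Host"` field (a hypothetical name chosen for this example, not the actual schema):

```python
import json

def same_host(paths):
    """Return True if every benchmark JSON file reports the same host."""
    hosts = set()
    for path in paths:
        with open(path) as f:
            # "Host" is an assumed field name for this illustration.
            hosts.add(json.load(f)["Host"])
    return len(hosts) == 1
```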
### Useful `render.py3` flags

- To save the produced graph: `--output=/tmp/benchmark_curve.png`
- To prevent the graph from appearing on the screen: `--headless`
## Under the hood

To learn more about the design decisions behind the benchmarking framework, have a look at the `RATIONALE.md` file.