SIMD version of the transposition
- The dilation canvas now uses the SIMD-optimized version of the transposition
- New buffer primitives in the `mln::bp` namespace:
  - Allocations: `mln::bp::aligned_alloc_2d`, `mln::bp::aligned_free_2d`
  - Copy: `mln::bp::copy`
  - Swap: `mln::bp::swap`
  - Transpose: `mln::bp::transpose`
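
To give an idea of what an in-place transposition over a 2D buffer involves, here is a minimal, portable sketch of a tiled (cache-blocked) in-place transpose of a square buffer. It is not the SIMD kernel from this change and does not reproduce the `mln::bp::transpose` signature: the function name `transpose_inplace_blocked`, its parameters, and the default tile size are all hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper for illustration only (NOT the mln::bp::transpose API).
// Transposes an n x n matrix in place. `stride` is the distance between two
// consecutive rows, in elements, so padded (aligned) rows are supported.
template <class T>
void transpose_inplace_blocked(T* buf, std::size_t n, std::size_t stride,
                               std::size_t block = 16)
{
  assert(block > 0 && stride >= n);

  // Walk the tiles of the upper triangle (bj >= bi) only.
  for (std::size_t bi = 0; bi < n; bi += block)
  {
    const std::size_t i_end = std::min(bi + block, n);
    for (std::size_t bj = bi; bj < n; bj += block)
    {
      const std::size_t j_end = std::min(bj + block, n);
      for (std::size_t i = bi; i < i_end; ++i)
      {
        // On a diagonal tile, visit only the strict upper triangle so that
        // each (i, j) / (j, i) pair is swapped exactly once.
        const std::size_t j_begin = (bi == bj) ? i + 1 : bj;
        for (std::size_t j = j_begin; j < j_end; ++j)
          std::swap(buf[i * stride + j], buf[j * stride + i]);
      }
    }
  }
}
```

Working tile by tile keeps both the source rows and the destination rows in cache, which is usually the first step before vectorizing the per-tile work. Sizes that are not a multiple of the tile size (such as the 131 case in the benchmarks) are handled by clamping the tile bounds, and elements in the row padding are never touched.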
Performance
Benchmark | Time | CPU | Time Old | Time New |
---|---|---|---|---|
BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/64 | -0.4923 | -0.4923 | 1403 | 713 |
BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/128 | -0.4592 | -0.4592 | 5698 | 3082 |
BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/131 | -0.4474 | -0.4474 | 6091 | 3366 |
BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/256 | -0.7137 | -0.7137 | 46747 | 13383 |
BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/64 | -0.2347 | -0.2347 | 1653 | 1265 |
BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/128 | -0.5331 | -0.5331 | 11546 | 5390 |
BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/131 | -0.5235 | -0.5236 | 12246 | 5836 |
BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/256 | -0.6077 | -0.6078 | 55438 | 21747 |
BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/64 | -0.3227 | -0.3227 | 2822 | 1911 |
BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/128 | -0.4731 | -0.4732 | 14469 | 7624 |
BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/131 | -0.4767 | -0.4767 | 15168 | 7937 |
BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/256 | -0.5011 | -0.5011 | 63214 | 31536 |
BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/64 | -0.7059 | -0.7059 | 11035 | 3246 |
BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/128 | -0.7222 | -0.7222 | 48095 | 13361 |
BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/131 | -0.7245 | -0.7245 | 50621 | 13949 |
BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/256 | -0.7105 | -0.7105 | 202610 | 58661 |
BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/64 | -0.3464 | -0.3466 | 2945 | 1925 |
BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/128 | -0.4881 | -0.4881 | 14801 | 7577 |
BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/131 | -0.4934 | -0.4933 | 15702 | 7955 |
BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/256 | -0.5099 | -0.5100 | 65050 | 31882 |
BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/64 | -0.6916 | -0.6916 | 10478 | 3232 |
BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/128 | -0.7084 | -0.7085 | 46015 | 13417 |
BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/131 | -0.7111 | -0.7111 | 48427 | 13993 |
BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/256 | -0.6985 | -0.6986 | 196441 | 59229 |
BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/64 | -0.0167 | -0.0168 | 3115 | 3063 |
BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/128 | -0.0416 | -0.0411 | 12636 | 12111 |
BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/131 | -0.0455 | -0.0450 | 13461 | 12849 |
BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/256 | -0.0200 | -0.0196 | 62205 | 60963 |
BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/64 | -0.5446 | -0.5447 | 4217 | 1920 |
BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/128 | -0.5853 | -0.5852 | 18559 | 7697 |
BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/131 | -0.5833 | -0.5833 | 19507 | 8128 |
BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/256 | -0.5981 | -0.5981 | 79421 | 31917 |

Benchmark | Time | CPU | Time Old | Time New |
---|---|---|---|---|
BMMorpho/Dilation_ApproximatedDisc/2_mean | -22.23% | -22.23% | 73731949 | 57343963 |
BMMorpho/Dilation_ApproximatedDisc/4_mean | -05.48% | -05.47% | 280135607 | 264795373 |
BMMorpho/Dilation_ApproximatedDisc/8_mean | -02.21% | -02.20% | 531229808 | 519475166 |
BMMorpho/Dilation_ApproximatedDisc/16_mean | -05.33% | -05.33% | 564639932 | 534519001 |
BMMorpho/Dilation_ApproximatedDisc/32_mean | -04.92% | -04.90% | 944361277 | 897853724 |
BMMorpho/Dilation_ApproximatedDisc/64_mean | -07.48% | -07.47% | 1300030125 | 1202781142 |
BMMorpho/Dilation_ApproximatedDisc/128_mean | -04.80% | -04.82% | 2292119667 | 2182019690 |
BMMorpho/Dilation_ApproximatedDisc_parallel/2_mean | -06.39% | -08.97% | 23494457 | 21992599 |
BMMorpho/Dilation_ApproximatedDisc_parallel/4_mean | -02.51% | -03.86% | 70858876 | 69081980 |
BMMorpho/Dilation_ApproximatedDisc_parallel/8_mean | -03.06% | -03.18% | 139158983 | 134895752 |
BMMorpho/Dilation_ApproximatedDisc_parallel/16_mean | -03.62% | -03.52% | 143996170 | 138783845 |
BMMorpho/Dilation_ApproximatedDisc_parallel/32_mean | -03.44% | -02.44% | 243122499 | 234752964 |
BMMorpho/Dilation_ApproximatedDisc_parallel/64_mean | -07.70% | -05.48% | 362112873 | 334222909 |
BMMorpho/Dilation_ApproximatedDisc_parallel/128_mean | -07.42% | -06.82% | 716831490 | 663630938 |
BMMorpho/Dilation_EuclideanDisc_naive/4_mean | -08.89% | -08.92% | 3024343370 | 2755607022 |
BMMorpho/Dilation_EuclideanDisc_naive/16_mean | -04.07% | -04.04% | 19220983258 | 18439328510 |
BMMorpho/Dilation_EuclideanDisc_incremental/4_mean | +04.22% | +04.13% | 2112364117 | 2201433522 |
BMMorpho/Dilation_EuclideanDisc_incremental/8_mean | -13.32% | -13.32% | 3172277202 | 2749743068 |
BMMorpho/Dilation_EuclideanDisc_incremental/16_mean | -10.04% | -10.04% | 4373858593 | 3934788860 |
BMMorpho/Dilation_EuclideanDisc_incremental/32_mean | -14.28% | -14.25% | 9020567816 | 7732003076 |
BMMorpho/Dilation_EuclideanDisc_incremental/128_mean | -01.64% | -01.64% | 48347915089 | 47556850789 |
BMMorpho/Dilation_Square/2_mean | -21.65% | -21.64% | 79123420 | 61993627 |
BMMorpho/Dilation_Square/4_mean | -22.38% | -22.36% | 74385806 | 57740013 |
BMMorpho/Dilation_Square/8_mean | -24.04% | -24.05% | 68613793 | 52121601 |
BMMorpho/Dilation_Square/16_mean | -20.15% | -20.17% | 73740573 | 58878521 |
BMMorpho/Dilation_Square/32_mean | -31.60% | -31.62% | 92353130 | 63166265 |
BMMorpho/Dilation_Square/64_mean | -33.38% | -33.40% | 143492419 | 95588052 |
BMMorpho/Dilation_Square/128_mean | -42.82% | -42.84% | 319995133 | 182963168 |
BMMorpho/Dilation_Square_parallel/2_mean | -08.82% | -08.91% | 23400252 | 21336825 |
BMMorpho/Dilation_Square_parallel/4_mean | -10.28% | -09.97% | 22411414 | 20108000 |
BMMorpho/Dilation_Square_parallel/8_mean | -10.06% | -09.53% | 23230937 | 20894365 |
BMMorpho/Dilation_Square_parallel/16_mean | -10.44% | -10.79% | 25446166 | 22789672 |
BMMorpho/Dilation_Square_parallel/32_mean | -09.54% | -10.44% | 32497370 | 29397984 |
BMMorpho/Dilation_Square_parallel/64_mean | -08.81% | -09.18% | 62154230 | 56675758 |
BMMorpho/Dilation_Square_parallel/128_mean | -10.03% | -07.36% | 211265735 | 190080132 |
BMMorpho/Opening_Disc_mean | -02.92% | -02.93% | 1904755125 | 1849192892 |
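
For reading the tables, the sketch below assumes that the Time and CPU columns hold the relative change `(new - old) / old` (a fraction in the first table, a percentage in the second), which is consistent with the Time Old and Time New values shown. The helper names `relative_change` and `speedup` are hypothetical and only serve to turn a row into a speedup factor.

```cpp
#include <cassert>

// Hypothetical helpers for interpreting a benchmark row.
// Relative change of the new timing with respect to the old one:
// negative means the new version is faster.
inline double relative_change(double time_old, double time_new)
{
  assert(time_old > 0.0);
  return (time_new - time_old) / time_old;
}

// Speedup factor: how many times faster the new version is.
inline double speedup(double time_old, double time_new)
{
  assert(time_new > 0.0);
  return time_old / time_new;
}
```

For example, the `BMMorpho/Dilation_Square/128_mean` row (319995133 old, 182963168 new) gives a relative change of about -0.428, matching the reported -42.82%, which corresponds to a speedup of roughly 1.75x.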