SIMD library
The SIMD library provides portable types for explicitly stating data-parallelism and structuring data for more efficient SIMD access.
An object of type simd<T> behaves analogously to objects of type T. But while T stores and manipulates one value, simd<T> stores and manipulates multiple values. The number of values is called the width, but is identified as size for consistency with the rest of the standard library (cf. simd_size).
Every operator and operation on simd<T> acts element-wise (except for horizontal operations, which are clearly marked as such). This simple rule expresses data-parallelism and will be used by the compiler to generate SIMD instructions and/or independent execution streams.
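The element-wise rule and its horizontal exception can be sketched as follows. This is a minimal illustration, assuming an implementation that ships <experimental/simd> (for example libstdc++ in GCC 11 or later) and C++17; the helper names axpy and lane_sum are hypothetical and not part of the library.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: computes a * x + y for every element at once.
// The scalar a is broadcast explicitly to all elements of a V.
template <class V>
V axpy(typename V::value_type a, const V& x, const V& y)
{
    return V(a) * x + y; // operator* and operator+ act element-wise
}

// Hypothetical helper: a horizontal operation that combines the
// elements of one object into a single value (their sum).
template <class V>
typename V::value_type lane_sum(const V& v)
{
    return stdx::reduce(v);
}
```

Calling axpy(2.0f, x, y) on two stdx::native_simd<float> objects processes as many elements per call as the target's native width allows, without an explicit loop over the elements.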
The width of the types simd<T> and native_simd<T> is determined by the implementation at compile time. In contrast, the width of the type fixed_size_simd<T, N> is fixed by the developer to N elements.
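The difference between implementation-chosen and developer-chosen widths can be checked at compile time. The sketch below assumes an implementation that provides <experimental/simd>; the constant and alias names (native_float_width, default_float_width, float8) are illustrative only.

```cpp
#include <experimental/simd>
#include <cstddef>

namespace stdx = std::experimental;

// Widths chosen by the implementation; the values depend on the
// target architecture and on the compiler flags in effect.
constexpr std::size_t native_float_width  = stdx::native_simd<float>::size();
constexpr std::size_t default_float_width = stdx::simd<float>::size();

// Width chosen by the developer.
using float8 = stdx::fixed_size_simd<float, 8>;
static_assert(float8::size() == 8);

// simd_size_v reports the same number from an element type and an ABI tag.
static_assert(stdx::simd_size_v<float, stdx::simd_abi::native<float>>
              == native_float_width);
static_assert(stdx::simd_size_v<float, stdx::simd_abi::fixed_size<8>> == 8);
```

Because size() is a constant expression, it can be used for array bounds and loop strides without hard-coding a particular width.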
A recommended pattern for mixing different SIMD types efficiently is based on native_simd and rebind_simd:
#include <experimental/simd>
namespace stdx = std::experimental;
using floatv  = stdx::native_simd<float>;
using doublev = stdx::rebind_simd_t<double, floatv>;
using intv    = stdx::rebind_simd_t<int, floatv>;
This ensures that all of these types have the same width and can therefore be converted into one another. A conversion between mismatching widths is not defined because it would either have to drop values or invent values. For resizing operations, the SIMD library provides the split and concat functions.
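A possible use of this pattern, together with an explicit resizing step, is sketched below. It assumes an implementation that provides <experimental/simd> including rebind_simd_t; the helper names half_of, fold_halves and join, and the aliases int4 and int8, are hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

using floatv  = stdx::native_simd<float>;
using doublev = stdx::rebind_simd_t<double, floatv>;
using intv    = stdx::rebind_simd_t<int, floatv>;

// All three types have the same number of elements.
static_assert(floatv::size() == doublev::size()
              && floatv::size() == intv::size());

// Hypothetical helper: widens to double, halves each element, then
// converts to int (truncating like static_cast does).
inline intv half_of(const floatv& x)
{
    const doublev wide = stdx::static_simd_cast<doublev>(x);
    return stdx::static_simd_cast<intv>(wide * doublev(0.5));
}

using int8 = stdx::fixed_size_simd<int, 8>;
using int4 = stdx::fixed_size_simd<int, 4>;

// Resizing is explicit: split one 8-element object into two
// 4-element halves and add them element-wise.
inline int4 fold_halves(const int8& v)
{
    const auto halves = stdx::split<int4>(v); // std::array<int4, 2>
    return halves[0] + halves[1];
}

// The reverse direction: concat joins two 4-element objects into one
// object with 8 elements (its ABI tag is deduced by the library).
inline auto join(const int4& lo, const int4& hi)
{
    return stdx::concat(lo, hi);
}
```

The casts never change the number of elements; only split and concat do, which keeps every change of width visible in the source.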
Defined in header <experimental/simd>
Main classes
simd (parallelism TS v2) | data-parallel vector type (class template)
simd_mask (parallelism TS v2) | data-parallel type with the element type bool (class template)
ABI tags
Defined in namespace std::experimental::simd_abi
scalar (parallelism TS v2) | tag type for storing a single element (typedef)
fixed_size (parallelism TS v2) | tag type for storing a specified number of elements (alias template)
compatible (parallelism TS v2) | tag type that ensures ABI compatibility (alias template)
native (parallelism TS v2) | tag type that is most efficient (alias template)
max_fixed_size (parallelism TS v2) | the maximum number of elements guaranteed to be supported by fixed_size (constant)
deduce (parallelism TS v2) | obtains an ABI type for a given element type and number of elements (class template)
Alignment tags
element_aligned_tag, element_aligned (parallelism TS v2) | flag indicating alignment of the load/store address to element alignment (class)
vector_aligned_tag, vector_aligned (parallelism TS v2) | flag indicating alignment of the load/store address to vector alignment (class)
overaligned_tag, overaligned (parallelism TS v2) | flag indicating alignment of the load/store address to the specified alignment (class template)
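The alignment flags are passed to loads and stores to state what the library may assume about the address. The sketch below is one way to use them, assuming an implementation that provides <experimental/simd>; the helpers scale and sum_block are hypothetical.

```cpp
#include <experimental/simd>
#include <cstddef>

namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

// Hypothetical helper: multiplies n floats in place by factor.
// element_aligned only assumes the alignment of a single float, so any
// position inside an ordinary float array is a valid address.
inline void scale(float* data, std::size_t n, float factor)
{
    std::size_t i = 0;
    for (; i + floatv::size() <= n; i += floatv::size()) {
        floatv v(data + i, stdx::element_aligned); // load
        v *= floatv(factor);
        v.copy_to(data + i, stdx::element_aligned); // store
    }
    for (; i < n; ++i) // scalar loop for the remaining elements
        data[i] *= factor;
}

// Hypothetical helper: sums one block of floatv::size() floats.
// vector_aligned is a promise by the caller that block is aligned to
// stdx::memory_alignment_v<floatv>; breaking it is undefined behavior.
inline float sum_block(const float* block)
{
    const floatv v(block, stdx::vector_aligned);
    return stdx::reduce(v);
}
```

A buffer suitable for sum_block can be declared as alignas(stdx::memory_alignment_v<floatv>) float block[floatv::size()];.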
Where expression
const_where_expression (parallelism TS v2) | selected elements with non-mutating operations (class template)
where_expression (parallelism TS v2) | selected elements with mutating operations (class template)
where (parallelism TS v2) | produces const_where_expression and where_expression (function template)
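A where expression restricts an assignment to the elements selected by a mask. A minimal sketch, assuming an implementation that provides <experimental/simd>; the helper name relu is hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: replaces every negative element with zero and
// leaves all other elements unchanged.
template <class T, class Abi>
stdx::simd<T, Abi> relu(stdx::simd<T, Abi> x)
{
    const stdx::simd<T, Abi> zero = T(0);
    // x < zero yields a simd_mask; the assignment only writes the
    // elements for which the mask is true.
    stdx::where(x < zero, x) = zero;
    return x;
}
```

This replaces a per-element if statement: all elements are compared at once, and the mask decides which ones are written.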
Casts
simd_cast, static_simd_cast (parallelism TS v2) | element-wise static_cast (function template)
to_fixed_size, to_compatible, to_native (parallelism TS v2) | element-wise ABI cast (function template)
split (parallelism TS v2) | splits a single simd object into multiple ones (function template)
concat (parallelism TS v2) | concatenates multiple simd objects into a single one (function template)
Algorithms
min (parallelism TS v2) | element-wise min operation (function template)
max (parallelism TS v2) | element-wise max operation (function template)
minmax (parallelism TS v2) | element-wise minmax operation (function template)
clamp (parallelism TS v2) | element-wise clamp operation (function template)
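The algorithms mirror their scalar counterparts but take simd arguments for the bounds as well. A small sketch, assuming an implementation that provides <experimental/simd>; the helpers saturate and larger_of are hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: limits every element of v to the range [lo, hi].
// The scalar bounds are broadcast so that all three arguments of
// stdx::clamp have the same simd type.
template <class V>
V saturate(const V& v, typename V::value_type lo, typename V::value_type hi)
{
    return stdx::clamp(v, V(lo), V(hi));
}

// Hypothetical helper: element i of the result is the larger of a[i]
// and b[i].
template <class V>
V larger_of(const V& a, const V& b)
{
    return stdx::max(a, b);
}
```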
Reduction
reduce (parallelism TS v2) | reduces the vector to a single element (function template)
Mask reduction
all_of, any_of, none_of, some_of (parallelism TS v2) | reductions of simd_mask to bool (function template)
popcount (parallelism TS v2) | reduction of simd_mask to the number of true values (function template)
find_first_set, find_last_set (parallelism TS v2) | reductions of simd_mask to the index of the first or last true value (function template)
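Mask reductions turn the result of an element-wise comparison back into a scalar answer. The sketch below assumes an implementation that provides <experimental/simd>; the helpers count_negative and first_negative are hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: number of elements of v that are negative.
template <class T, class Abi>
int count_negative(const stdx::simd<T, Abi>& v)
{
    return stdx::popcount(v < stdx::simd<T, Abi>(T(0)));
}

// Hypothetical helper: index of the first negative element, or -1 if
// there is none. find_first_set requires at least one true value, so
// the empty case is handled before calling it.
template <class T, class Abi>
int first_negative(const stdx::simd<T, Abi>& v)
{
    const auto negative = v < stdx::simd<T, Abi>(T(0)); // a simd_mask
    if (stdx::none_of(negative))
        return -1;
    return stdx::find_first_set(negative);
}
```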
Traits
is_simd, is_simd_mask (parallelism TS v2) | checks if a type is a simd or simd_mask type (class template)
is_abi_tag (parallelism TS v2) | checks if a type is an ABI tag type (class template)
is_simd_flag_type (parallelism TS v2) | checks if a type is a simd flag type (class template)
simd_size (parallelism TS v2) | obtains the number of elements of a given element type and ABI tag (class template)
memory_alignment (parallelism TS v2) | obtains an appropriate alignment for vector_aligned (class template)
rebind_simd, resize_simd (parallelism TS v2) | changes the element type or the number of elements of simd or simd_mask (class template)
Math functions
All functions in <cmath>, except for the special math functions, are overloaded for simd.
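As an illustration of the math overloads, the following sketch applies sqrt to every element at once. It assumes an implementation that provides <experimental/simd>; the helper name hypotenuse is hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: element i of the result is
// sqrt(a[i] * a[i] + b[i] * b[i]).
template <class Abi>
stdx::simd<float, Abi> hypotenuse(const stdx::simd<float, Abi>& a,
                                  const stdx::simd<float, Abi>& b)
{
    return stdx::sqrt(a * a + b * b); // simd overload of sqrt
}
```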
Example
#include <experimental/simd>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;

// Prints the name followed by every element of a.
void println(std::string_view name, auto const& a)
{
    std::cout << name << ": ";
    for (std::size_t i{}; i != std::size(a); ++i)
        std::cout << a[i] << ' ';
    std::cout << '\n';
}

template<class A>
stdx::simd<int, A> my_abs(stdx::simd<int, A> x)
{
    where(x < 0, x) = -x; // negate only the elements that are negative
    return x;
}

int main()
{
    const stdx::native_simd<int> a = 1; // broadcast: every element is 1
    println("a", a);

    // generator: element i is initialized with i - 2
    const stdx::native_simd<int> b([](int i) { return i - 2; });
    println("b", b);

    const auto c = a + b; // element-wise addition
    println("c", c);

    const auto d = my_abs(c);
    println("d", d);

    const auto e = d * d; // element-wise multiplication
    println("e", e);

    const auto inner_product = stdx::reduce(e); // horizontal: sum of all elements
    std::cout << "inner product: " << inner_product << '\n';

    const stdx::fixed_size_simd<long double, 16> x([](int i) { return i; });
    println("x", x);
    println("cos²(x) + sin²(x)", stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2));
}
Possible output (a through e depend on the native width, which is 4 here):
a: 1 1 1 1
b: -2 -1 0 1
c: -1 0 1 2
d: 1 0 1 2
e: 1 0 1 4
inner product: 6
x: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
cos²(x) + sin²(x): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
See also
valarray | numeric arrays, array masks and array slices (class template)
External links
1. | The implementation of ISO/IEC TS 19570:2018 Section 9 "Data-Parallel Types" (github.com) |
2. | TS implementation for GCC/libstdc++ (std::experimental::simd ships with GCC 11) (gcc.gnu.org) |