<html lang="en">
<head>
<title>Vector Extensions - Using the GNU Compiler Collection (GCC)</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="Using the GNU Compiler Collection (GCC)">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="C-Extensions.html#C-Extensions" title="C Extensions">
<link rel="prev" href="Return-Address.html#Return-Address" title="Return Address">
<link rel="next" href="Offsetof.html#Offsetof" title="Offsetof">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1988-2013 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.
Copies published by the Free Software Foundation raise
     funds for GNU development.-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp { font-size:smaller }
  span.sc { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="Vector-Extensions"></a>
<p>
Next: <a rel="next" accesskey="n" href="Offsetof.html#Offsetof">Offsetof</a>,
Previous: <a rel="previous" accesskey="p" href="Return-Address.html#Return-Address">Return Address</a>,
Up: <a rel="up" accesskey="u" href="C-Extensions.html#C-Extensions">C Extensions</a>
<hr>
</div>

<h3 class="section">6.49 Using Vector Instructions through Built-in Functions</h3>

<p>On some targets, the instruction set contains SIMD vector instructions which
operate on multiple values contained in one large register at the same time.
For example, on the i386 the MMX, 3DNow! and SSE extensions can be used
this way.

   <p>The first step in using these extensions is to provide the necessary data
types. This should be done using an appropriate <code>typedef</code>:

<pre class="smallexample">          typedef int v4si __attribute__ ((vector_size (16)));
</pre>
   <p class="noindent">The <code>int</code> type specifies the base type, while the attribute specifies
the vector size for the variable, measured in bytes. For example, the
declaration above causes the compiler to set the mode for the <code>v4si</code>
type to be 16 bytes wide and divided into <code>int</code> sized units.
For a 32-bit <code>int</code> this means a vector of 4 units of 4 bytes, and the
corresponding mode of <code>v4si</code> is <acronym>V4SI</acronym>.

   <p>The <code>vector_size</code> attribute is only applicable to integral and
floating-point scalars, although arrays, pointers, and function return values
are allowed in conjunction with this construct. Only sizes that are
a power of two are currently allowed.

   <p>All the basic integer types can be used as base types, both as signed
and as unsigned: <code>char</code>, <code>short</code>, <code>int</code>, <code>long</code>,
<code>long long</code>. In addition, <code>float</code> and <code>double</code> can be
used to build floating-point vector types.

   <p>Specifying a combination that is not valid for the current architecture
causes GCC to synthesize the instructions using a narrower mode.
For example, if you specify a variable of type <code>V4SI</code> and your
architecture does not allow for this specific SIMD type, GCC
produces code that uses 4 <code>SIs</code>.

   <p>The types defined in this manner can be used with a subset of normal C
operations. Currently, GCC allows using the following operators
on these types: <code>+, -, *, /, unary minus, ^, |, &amp;, ~, %</code>.

   <p>The operations behave like C++ <code>valarrays</code>. Addition is defined as
the addition of the corresponding elements of the operands. For
example, in the code below, each of the 4 elements in <var>a</var> is
added to the corresponding element in <var>b</var> and the resulting
vector is stored in <var>c</var>.

<pre class="smallexample">          typedef int v4si __attribute__ ((vector_size (16)));

          v4si a, b, c;

          c = a + b;
</pre>
   <p>Subtraction, multiplication, division, and the bitwise logical operations
operate in a similar manner.
Likewise, the result of using the unary
minus or complement operators on a vector type is a vector whose
elements are the negative or complemented values of the corresponding
elements in the operand.

   <p>It is possible to use shifting operators <code>&lt;&lt;</code>, <code>&gt;&gt;</code> on
integer-type vectors. The operation is defined as follows: <code>{a0,
a1, ..., an} &gt;&gt; {b0, b1, ..., bn} == {a0 &gt;&gt; b0, a1 &gt;&gt; b1,
..., an &gt;&gt; bn}</code>. Vector operands must have the same number of
elements.

   <p>For convenience, a binary vector operation may take a scalar as one
operand. In that case the compiler transforms
the scalar operand into a vector where each element is the scalar from
the operation. The transformation happens only if the scalar can be
safely converted to the vector-element type.
Consider the following code.

<pre class="smallexample">          typedef int v4si __attribute__ ((vector_size (16)));

          v4si a, b, c;
          long l;

          a = b + 1;    /* a = b + {1,1,1,1}; */
          a = 2 * b;    /* a = {2,2,2,2} * b; */

          a = l + a;    /* Error, cannot convert long to int. */
</pre>
   <p>Vectors can be subscripted as if the vector were an array with
the same number of elements and base type. Out-of-bounds accesses
invoke undefined behavior at run time. Warnings for out-of-bounds
accesses in vector subscripting can be enabled with
<samp><span class="option">-Warray-bounds</span></samp>.

   <p>Vector comparison is supported with standard comparison
operators: <code>==, !=, &lt;, &lt;=, &gt;, &gt;=</code>. Comparison operands can be
vector expressions of integer-type or real-type. Comparison between
integer-type vectors and real-type vectors is not supported. The
result of the comparison is a vector of the same width and number of
elements as the comparison operands with a signed integral element
type.
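
   <p>As a minimal sketch of the shift, subscript, and comparison rules
described above, the following code defines a few small helpers. The names
<code>v4sf</code>, <code>shift_right</code>, <code>sum_elements</code>,
<code>less_than</code>, and <code>check_examples</code> are hypothetical and
chosen only for illustration, and the sketch assumes a 32-bit <code>int</code>
and a 32-bit <code>float</code>, so that both vector types hold 4 elements.

```c
#include <assert.h>

typedef int   v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Hypothetical helper: shift each element of a right by the
   corresponding element of b.  */
static v4si
shift_right (v4si a, v4si b)
{
  return a >> b;
}

/* Hypothetical helper: add up the elements by subscripting the
   vector as if it were an array of 4 ints.  */
static int
sum_elements (v4si v)
{
  int i, sum = 0;

  for (i = 0; i < 4; i++)
    sum += v[i];
  return sum;
}

/* Hypothetical helper: compare two float vectors.  The result has a
   signed integral element type of the same width, here int.  */
static v4si
less_than (v4sf x, v4sf y)
{
  return x < y;
}

/* Hypothetical self-check exercising the helpers above.  */
static void
check_examples (void)
{
  v4si a = {16, 32, 64, 128};
  v4si b = {1, 2, 3, 4};
  v4sf x = {1.0f, 5.0f, 3.0f, 0.0f};
  v4sf y = {2.0f, 4.0f, 3.0f, 1.0f};
  v4si r = shift_right (a, b);
  v4si m = less_than (x, y);

  /* 16 >> 1, 32 >> 2, 64 >> 3 and 128 >> 4 are all 8.  */
  assert (r[0] == 8 && r[1] == 8 && r[2] == 8 && r[3] == 8);
  assert (sum_elements (b) == 10);
  /* Elements where x < y holds are -1, the others are 0.  */
  assert (m[0] == -1 && m[1] == 0 && m[2] == 0 && m[3] == -1);
}
```

   <p>Calling <code>check_examples</code> from a program's <code>main</code>
function runs the assertions. Passing vectors by value, as these helpers do,
relies on the function argument and return value support that this section
describes for vector types.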

   <p>Vectors are compared element-wise, producing 0 when the comparison is false
and -1 (a constant of the appropriate type where all bits are set)
otherwise. Consider the following example.

<pre class="smallexample">          typedef int v4si __attribute__ ((vector_size (16)));

          v4si a = {1,2,3,4};
          v4si b = {3,2,1,4};
          v4si c;

          c = a &gt;  b;     /* The result would be {0, 0,-1, 0}  */
          c = a == b;     /* The result would be {0,-1, 0,-1}  */
</pre>
   <p>Vector shuffling is available using functions
<code>__builtin_shuffle (vec, mask)</code> and
<code>__builtin_shuffle (vec0, vec1, mask)</code>.
Both functions construct a permutation of elements from one or two
vectors and return a vector of the same type as the input vector(s).
The <var>mask</var> is an integral vector with the same width (<var>W</var>)
and element count (<var>N</var>) as the output vector.

   <p>The elements of the input vectors are numbered in memory ordering of
<var>vec0</var> beginning at 0 and <var>vec1</var> beginning at <var>N</var>. The
elements of <var>mask</var> are considered modulo <var>N</var> in the single-operand
case and modulo 2*<var>N</var> in the two-operand case.

   <p>Consider the following example:

<pre class="smallexample">          typedef int v4si __attribute__ ((vector_size (16)));

          v4si a = {1,2,3,4};
          v4si b = {5,6,7,8};
          v4si mask1 = {0,1,1,3};
          v4si mask2 = {0,4,2,5};
          v4si res;

          res = __builtin_shuffle (a, mask1);       /* res is {1,2,2,4}  */
          res = __builtin_shuffle (a, b, mask2);    /* res is {1,5,3,6}  */
</pre>
   <p>Note that <code>__builtin_shuffle</code> is intentionally semantically
compatible with the OpenCL <code>shuffle</code> and <code>shuffle2</code> functions.

   <p>You can declare variables and use them in function calls and returns, as
well as in assignments and some casts. You can specify a vector type as
a return type for a function.
Vector types can also be used as function
arguments. It is possible to cast from one vector type to another,
provided they are of the same size (in fact, you can also cast vectors
to and from other data types of the same size).

   <p>You cannot operate between vectors of different lengths or different
signedness without a cast.

   </body></html>