Knowledge

SSE4


-based FX processors. With SSE4a the misaligned SSE feature was also introduced, which meant unaligned load instructions were as fast as aligned versions on aligned addresses. It also allowed disabling the alignment check on non-load SSE operations accessing memory. Intel later introduced similar speed improvements to unaligned SSE in their Nehalem processors, but did not introduce misaligned access by non-load SSE instructions until
addition/multiplication and vector-scalar addition/multiplication, process multiple bytes of data in a single CPU instruction. This parallelism yields noticeable performance gains. SSE4.2 introduced new SIMD string operations, including an instruction to compare two string fragments of up to 16 bytes each. SSE4.2 is a subset of SSE4 and was released a few years after the initial release of SSE4.
processor line, was referred to as SSE4 by some media until Intel came up with the SSSE3 moniker. Internally dubbed Merom New Instructions, Intel originally did not plan to assign a special name to them, which was criticized by some journalists. Intel eventually cleared up the confusion and reserved
Like previous-generation CPU SIMD instruction sets, SSE4 supports up to 16 registers, each 128 bits wide, which can hold four 32-bit integers, four 32-bit single-precision floating-point numbers, or two 64-bit double-precision floating-point numbers. SIMD operations, such as vector element-wise
instruction set, which was released in early 2004. All software using previous Intel SIMD instructions (e.g. SSE3) is compatible with modern microprocessors supporting SSE4 instructions. All existing software continues to run correctly without modification on microprocessors that incorporate SSE4, as
for AOS (Array of Structs) data. This takes an immediate operand consisting of four (or two for DPPD) bits to select which of the entries in the input to multiply and accumulate, and another four (or two for DPPD) to select whether to put 0 or the dot-product in the appropriate field of the output.
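The two halves of the DPPS immediate can be illustrated with a scalar model. This is a minimal sketch in Python, not the instruction itself: the function name `dpps` is hypothetical, and it assumes the high nibble of the immediate selects the input pairs and the low nibble selects the output fields. It ignores single-precision rounding and the hardware's order of summation.

```python
def dpps(a, b, imm8):
    """Scalar model of the DPPS selection logic for two four-element vectors."""
    # High nibble: bit 4+i set means the product a[i]*b[i] takes part in the sum.
    total = sum(a[i] * b[i] for i in range(4) if imm8 & (1 << (4 + i)))
    # Low nibble: bit i set means output field i receives the dot product, else 0.
    return [total if imm8 & (1 << i) else 0.0 for i in range(4)]


a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]
full = dpps(a, b, 0xF1)  # all four products, result written to field 0 only
xyz = dpps(a, b, 0x7F)   # first three products, result broadcast to all fields
```

With the values above, `full` holds the four-element dot product in its lowest field, while `xyz` broadcasts a three-element dot product, which is the usual pattern for x/y/z vectors stored with a padding lane.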
These instructions operate on integer rather than SSE registers, because they are not SIMD instructions, but they appeared at the same time. Although AMD introduced them with the SSE4a instruction set, they are counted as separate extensions with their own dedicated CPUID bits to indicate support. Intel
The INSERTPS and PINSR instructions read 8, 16 or 32 bits from an x86 register or memory location and insert them into a field in the destination register given by an immediate operand. EXTRACTPS and PEXTR read a field from the source register and insert it into an x86 register or memory location.
Unlike all previous iterations of SSE, SSE4 contains instructions that execute operations which are not specific to multimedia applications. It features a number of instructions whose action is determined by a constant field and a set of instructions that take XMM0 as an implicit third operand.
instruction set, which has four SSE4 instructions and four new SSE instructions. These instructions are not found in Intel's processors supporting SSE4.1 and AMD processors only started supporting Intel's SSE4.1 and SSE4.2 (the full SSE4 instruction set) in the
SSE4.2 added STTNI (String and Text New Instructions), several new instructions that perform character searches and comparison on two operands of 16 bytes at a time. These were designed (among other things) to speed up the parsing of
Sets the bottom unsigned 16-bit word of the destination to the smallest unsigned 16-bit word in the source, and the next-from-bottom to the index of that word in the source.
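As a rough illustration of this description, the following Python sketch models PHMINPOSUW on a list of eight words, lowest lane first. The function name and the list representation are assumptions made for the example. It also assumes that ties go to the lowest index and that the remaining lanes are zeroed.

```python
def phminposuw(words):
    """Scalar model: `words` holds eight unsigned 16-bit values, lowest lane first."""
    assert len(words) == 8 and all(0 <= w <= 0xFFFF for w in words)
    smallest = min(words)
    index = words.index(smallest)  # first (lowest-index) occurrence wins on ties
    # Lane 0 holds the minimum, lane 1 its index, the remaining lanes are zero.
    return [smallest, index, 0, 0, 0, 0, 0, 0]


result = phminposuw([9, 4, 7, 4, 100, 65535, 5, 6])
```

Here the minimum 4 occurs at indices 1 and 3, and the model reports index 1.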
Conditional copying of elements in one location with another, based (for non-V form) on the bits in an immediate operand, and (for V form) on the bits in register XMM0.
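The two selection mechanisms can be sketched as follows. This is a hedged Python model with hypothetical function names. It assumes that bit i of the immediate controls element i in the non-V forms, and that the V forms look at the most significant bit of each mask element (the role played by XMM0).

```python
def blend_imm(dest, src, imm8):
    """Non-V form: bit i of the immediate picks src[i], otherwise dest[i] is kept."""
    return [s if imm8 & (1 << i) else d
            for i, (d, s) in enumerate(zip(dest, src))]


def blend_var(dest, src, mask, lane_bits=32):
    """V form: the top bit of each mask lane (XMM0 in hardware) picks src[i]."""
    top = 1 << (lane_bits - 1)
    return [s if m & top else d for d, s, m in zip(dest, src, mask)]


dest = [1, 2, 3, 4]
src = [10, 20, 30, 40]
by_immediate = blend_imm(dest, src, 0b0101)
by_register = blend_var(dest, src, [0x80000000, 0, 0xFFFFFFFF, 0x7FFFFFFF])
```

Both calls select elements 0 and 2 from `src`. The last mask lane, `0x7FFFFFFF`, does not select because only its top bit is examined.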
to the result of an AND between its operands: ZF is set, if DEST AND SRC is equal to 0. Additionally it sets the C flag if (NOT DEST) AND SRC equals zero.
Efficient read from write-combining memory area into SSE register; this is useful for retrieving results from peripherals attached to the memory bus.
Several of these instructions are enabled by the single-cycle shuffle engine in Penryn. (Shuffle operations reorder bytes within a register.)
This is equivalent to setting the Z flag if none of the bits masked by SRC are set, and the C flag if all of the bits masked by SRC are set.
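The flag rules for PTEST can be checked with a small model. The sketch below treats the two 128-bit operands as Python integers. The function name `ptest` and the tuple return value are assumptions made for illustration.

```python
MASK128 = (1 << 128) - 1


def ptest(dest, src):
    """Return (ZF, CF) for two 128-bit values given as non-negative integers."""
    zf = int((dest & src) == 0)               # no bit selected by src is set in dest
    cf = int(((~dest & MASK128) & src) == 0)  # every bit selected by src is set in dest
    return zf, cf


none_set = ptest(0x0F, 0xF0)  # masked bits all clear
all_set = ptest(0xFF, 0x0F)   # masked bits all set
mixed = ptest(0x05, 0x0F)     # some masked bits set, some clear
```

A mixed result leaves both flags clear, so one PTEST distinguishes the all-clear, all-set and mixed cases.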
, and allows an 8×8 block difference to be computed in fewer than seven cycles. One bit of a three-bit immediate operand indicates whether y
Packed 32-bit signed "long" multiplication, two (1st and 3rd) out of four packed integers multiplied giving two packed 64-bit results.
For example, PEXTRD eax, , 1; EXTRACTPS , xmm1, 1 stores the first field of xmm1 in the address given by the first field of xmm0.
. These instructions are not available in Intel processors. Support is indicated via the CPUID.80000001H:ECX.SSE4A flag.
Round values in a floating-point register to integers, using one of four rounding modes specified by an immediate operand
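The four rounding modes can be illustrated per lane. This is a minimal Python sketch that assumes the low two bits of the immediate encode the mode as 0 = nearest (ties to even), 1 = toward negative infinity, 2 = toward positive infinity, 3 = toward zero. The name `round_lane` is hypothetical, inputs are assumed to be finite, and the model does not preserve the sign of a zero result.

```python
import math

# Low two immediate bits -> rounding function (assumed encoding, see lead-in).
ROUNDERS = {0b00: round, 0b01: math.floor, 0b10: math.ceil, 0b11: math.trunc}


def round_lane(x, mode):
    """Round one finite floating-point lane to an integral value, returned as a float."""
    return float(ROUNDERS[mode & 0b11](x))


positive = [round_lane(2.5, mode) for mode in range(4)]
negative = [round_lane(-2.5, mode) for mode in range(4)]
```

The value 2.5 shows the tie-to-even behaviour of the nearest mode, and -2.5 shows where rounding down and truncation differ.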
Packed 32-bit signed "low" multiplication, four packed sets of integers multiplied giving four packed 32-bit results.
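The difference between the "long" (PMULDQ) and "low" (PMULLD) multiplications can be sketched with scalar arithmetic. The helper names below are hypothetical and each vector is modelled as a Python list of four integers. PMULDQ is assumed to use elements 0 and 2, PMULLD to keep the low 32 bits of each product.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pmuldq(a, b):
    """Multiply the 1st and 3rd signed 32-bit elements, giving two 64-bit results."""
    return [to_signed(a[i], 32) * to_signed(b[i], 32) for i in (0, 2)]


def pmulld(a, b):
    """Multiply all four element pairs, keeping only the low 32 bits of each product."""
    return [to_signed(a[i] * b[i], 32) for i in range(4)]


a = [100000, 1, -3, 7]
b = [100000, 2, 4, 9]
wide = pmuldq(a, b)
low = pmulld(a, b)
```

The first element shows the trade-off: PMULDQ keeps the full product 10,000,000,000, while PMULLD wraps it to 32 bits.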
product line, and complete the SSE4 instruction set. AMD on the other hand first added support starting with the
. Intel credits feedback from developers as playing an important role in the development of the instruction set.
called on some CPUs not supporting it, such as Intel CPUs prior to Haswell, may incorrectly execute the
; more precise details of 47 instructions became available at the Spring 2007 Intel Developer Forum in
Intel SSE4 consists of 54 instructions. A subset consisting of 47 instructions, referred to as
as used in certain data transfer protocols. These instructions were first implemented in the
(count number of bits set to 1). Support is indicated via the CPUID.01H:ECX.POPCNT flag.
, a second subset consisting of the seven remaining instructions, is first available in
C value using the polynomial 0x11EDC6F41 (or, without the high order bit, 0x1EDC6F41).
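For readers who want to see the polynomial in action, here is a bit-at-a-time software model of CRC-32C accumulation in Python. It is a sketch, not the hardware algorithm. It assumes the common reflected formulation, in which 0x1EDC6F41 appears bit-reversed as 0x82F63B78, and the usual convention of starting from 0xFFFFFFFF and inverting the final value. The function names are hypothetical.

```python
POLY_REFLECTED = 0x82F63B78  # bit-reversed form of 0x1EDC6F41


def crc32c_update(crc, data):
    """Fold bytes into a running CRC value, one bit at a time."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc


def crc32c(data):
    """CRC-32C of a byte string using the conventional initial value and final inversion."""
    return crc32c_update(0xFFFFFFFF, data) ^ 0xFFFFFFFF


check = crc32c(b"123456789")
```

Because the running value is passed in and returned, a buffer can be processed in chunks. This mirrors the way the instruction accumulates into its destination operand.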
requires the CPU to support SSE4.2, otherwise the Windows kernel is unbootable.
Compute eight offset sums of absolute differences, four at a time (i.e., |x
well as in the presence of existing and new applications that incorporate SSE4.
should be used from the destination operand, the other two whether x
1324: 1318: 1312: 1018: 2103: 1986: 1969: 1952: 1332: 1123: 1096: 1092: 1078: 1074: 1056: 146: 109: 97: 2206: 2029: 1698: 853:(bit scan reverse) instruction. This results in an issue where 929:. Support is indicated via the CPUID.80000001H:ECX.ABM flag. 2115: 2053: 1993: 1291: 1277: 1263: 1249: 1235: 741: 348: 182: 90: 2097: 2079: 2065: 2047: 2041: 1947: 1321:
QuadCore C4000-series processors (SSE4.1, SSE4.2 supported)
microarchitecture. AMD implements both, beginning with the
Convert signed DWORDs into unsigned WORDs with saturation.
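Saturation here means clamping rather than wrapping. The sketch below models PACKUSDW on Python lists. The function name is hypothetical, and it assumes the four words derived from the destination operand occupy the low half of the result, followed by the four from the source.

```python
def packusdw(dest, src):
    """Pack eight signed 32-bit values into unsigned 16-bit words with saturation."""
    def clamp(value):
        # Negative inputs saturate to 0, inputs above 65535 saturate to 0xFFFF.
        return max(0, min(0xFFFF, value))

    return [clamp(v) for v in dest] + [clamp(v) for v in src]


packed = packusdw([-5, 0, 65535, 70000], [1, 65536, -1, 42])
```

Values already inside the range 0 to 65535 pass through unchanged, which is why the operation is a natural fit for converting 32-bit intermediate results back to 16-bit pixel or sample data.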
Packed minimum/maximum for different integer operand types
. It was announced on September 27, 2006, at the Fall 2006
. Support is indicated via the CPUID.01H:ECX.SSE42 flag.
. Support is indicated via the CPUID.01H:ECX.SSE41 flag.
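The CPUID flags named in this article can be decoded from a raw ECX value. The following Python sketch assumes the ECX register contents have already been obtained by some other means, since Python has no built-in CPUID access. It also assumes the bit positions 19 (SSE4.1), 20 (SSE4.2) and 23 (POPCNT) for leaf 01H, and 5 (ABM) and 6 (SSE4A) for leaf 80000001H. The table and function names are hypothetical.

```python
# Assumed bit positions within ECX for each CPUID leaf.
LEAF_01H_ECX = {"SSE4.1": 19, "SSE4.2": 20, "POPCNT": 23}
LEAF_80000001H_ECX = {"ABM": 5, "SSE4A": 6}


def decode_features(ecx, table):
    """Map each feature name to True if its bit is set in the given ECX value."""
    return {name: bool((ecx >> bit) & 1) for name, bit in table.items()}


# Example ECX value with only the SSE4.1 and POPCNT bits set.
sample = decode_features((1 << 19) | (1 << 23), LEAF_01H_ECX)
```

Keeping the two leaves in separate tables reflects the point made in the text: POPCNT and LZCNT (ABM) have their own bits and are reported independently of SSE4.1, SSE4.2 and SSE4a.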
the SSE4 name for their next instruction set extension.
"Heavy Equipment" processors (SSE4a, SSE4.1, SSE4.2,
Packed Compare Implicit Length Strings, Return Index
Packed Compare Explicit Length Strings, Return Index
ZX-C processors and newer (SSE4.1, SSE4.2 supported)
The SSE4a instruction group was introduced in AMD's
exception. This is an issue as the result values of
takes the same encoding path as the encoding of the
Packed Compare Implicit Length Strings, Return Mask
Packed Compare Explicit Length Strings, Return Mask
Compare Packed Signed 64-bit data For Greater Than
SIMD CPU instruction set
SSE4 (Streaming SIMD Extensions 4) is a SIMD CPU instruction set used in the Intel Core microarchitecture and
Intel Developer Forum, with vague details in a white paper
Beijing, in the presentation. SSE4 extended the SSE3
SSE4 subsets
SSE4.1 in some Intel documentation, is available in Penryn. Additionally, SSE4.2
Nehalem-based Core i7
Starting with Barcelona-based processors, AMD introduced the SSE4a
Bulldozer
AVX.
Name confusion
What is now known as SSSE3 (Supplemental Streaming SIMD Extensions 3), introduced in the Intel Core 2
Intel is using the marketing term HD Boost to refer to SSE4.
New instructions
SSE4.1
These instructions were introduced with Penryn microarchitecture, the 45 nm shrink of Intel's Core microarchitecture
Instruction, Description
MPSADBW
₀−y₀|+|x₁−y₁|+|x₂−y₂|+|x₃−y₃|, |x₀−y₁|+|x₁−y₂|+|x₂−y₃|+|x₃−y₄|, ..., |x₀−y₇|+|x₁−y₈|+|x₂−y₉|+|x₃−y₁₀|); this operation is important for some HD codecs
₀ .. y₁₀ or y₄ .. y₁₄
₀..x₃, x₄..x₇, x₈..x₁₁ or x₁₂..x₁₅ should be used from the source.
PHMINPOSUW
PMULDQ
PMULLD
DPPS, DPPD: Dot product
BLENDPS, BLENDPD, BLENDVPS, BLENDVPD, PBLENDVB, PBLENDW
PMINSB, PMAXSB, PMINUW, PMAXUW, PMINUD, PMAXUD, PMINSD, PMAXSD
ROUNDPS, ROUNDSS, ROUNDPD, ROUNDSD
INSERTPS, PINSRB, PINSRD/PINSRQ, EXTRACTPS, PEXTRB, PEXTRD/PEXTRQ
PMOVSXBW, PMOVZXBW, PMOVSXBD, PMOVZXBD, PMOVSXBQ, PMOVZXBQ, PMOVSXWD, PMOVZXWD, PMOVSXWQ, PMOVZXWQ, PMOVSXDQ, PMOVZXDQ: Packed sign/zero extension to wider types
PTEST: This is similar to the TEST instruction, in that it sets the Z flag
PCMPEQQ: Quadword (64 bits) compare for equality
PACKUSDW
MOVNTDQA
SSE4.2
XML documents. It also added a CRC32 instruction to compute cyclic redundancy checks
Nehalem-based Intel Core i7
Bulldozer microarchitecture
Windows 11 24H2
Instruction, Description
CRC32: Accumulate CRC32
PCMPESTRI
PCMPESTRM
PCMPISTRI
PCMPISTRM
PCMPGTQ
POPCNT and LZCNT
implements POPCNT beginning with the Nehalem microarchitecture and LZCNT beginning with the Haswell
Barcelona microarchitecture.
AMD calls this pair of instructions Advanced Bit Manipulation (ABM).
The encoding of LZCNT
BSR
LZCNT
BSR operation instead of raising an invalid instruction
LZCNT and BSR are different.
Trailing zeros can be counted using the BSF (bit scan forward) or TZCNT instructions.
Windows 11 24H2 requires the CPU to support POPCNT, otherwise the Windows kernel is unbootable.
Instruction, Description
POPCNT: Population count
LZCNT: Leading zero count
SSE4a
Barcelona microarchitecture
Instruction, Description
EXTRQ/INSERTQ: Combined mask-shift instructions.
MOVNTSD/MOVNTSS: Scalar streaming store instructions.
Supporting CPUs
X86-64 v2 CPUs:
Intel
Silvermont processors (SSE4.1, SSE4.2 and POPCNT supported)
Goldmont processors (SSE4.1, SSE4.2 and POPCNT supported)
Goldmont Plus processors (SSE4.1, SSE4.2 and POPCNT supported)
Tremont processors (SSE4.1, SSE4.2 and POPCNT supported)
Penryn processors (SSE4.1 supported, except Pentium Dual-Core and Celeron)
Nehalem processors and Westmere processors (SSE4.1, SSE4.2 and POPCNT supported, except Pentium and Celeron)
Sandy Bridge processors and newer (SSE4.1, SSE4.2 and POPCNT supported, include Pentium and Celeron)
Haswell processors and newer (SSE4.1, SSE4.2, POPCNT and LZCNT supported)
AMD
K10-based processors (SSE4a, POPCNT and LZCNT supported)
"Cat" low-power processors
Bobcat-based processors (SSE4a, POPCNT and LZCNT supported)
Jaguar-based processors and newer (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Puma-based processors and newer (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
POPCNT and LZCNT supported)
Bulldozer-based processors
Piledriver-based processors
Steamroller-based processors
Excavator-based processors and newer
Zen-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Zen+-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Zen2-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Zen3-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Zen4-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
Zen5-based processors (SSE4a, SSE4.1, SSE4.2, POPCNT and LZCNT supported)
VIA
Nano 3000, X2, QuadCore processors (SSE4.1 supported)
Nano
Eden X4 processors (SSE4.1, SSE4.2 supported)
Zhaoxin

Index

SIMD
instruction set
Intel
Core microarchitecture
AMD K10 (K8L)
Intel Developer Forum
white paper
Beijing
SSE3
Penryn
Nehalem
Core i7
Barcelona
AMD
Bulldozer
AVX
SSSE3
SIMD
Intel Core 2
Penryn microarchitecture
Core microarchitecture
HD
codecs
Dot product
Z flag
XML
cyclic redundancy checks

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.