Knowledge

Tomasulo's algorithm

Source 📝

899: 99: 137: 237:, where the specific instruction that took the exception can be determined, can restart or re-execute at the point of the exception. However, those that experience imprecise exceptions generally cannot restart or re-execute, as the system cannot determine the specific instruction that took the exception. 205:
to correctly perform out-of-order execution. All general-purpose and reservation station registers hold either a real value or a placeholder value. If a real value is unavailable to a destination register during the issue stage, a placeholder value is initially used. The placeholder value is a tag
1373:
Reservation stations take on the responsibility of waiting for operands in the presence of data dependencies and other inconsistencies such as varying storage access time and circuit speeds, thus freeing up the functional units. This improvement overcomes long floating point delays and memory
209:
Each functional unit has a single reservation station. Reservation stations hold information needed to execute a single instruction, including the operation and the operands. The functional unit begins processing when it is free and when all source operands needed for an instruction are real.
1374:
accesses. In particular the algorithm is more tolerant of cache misses. Additionally, programmers are freed from implementing optimized code. This is a result of the common data bus and reservation station working together to preserve dependencies as well as encouraging concurrency.
911:
In the execute stage, the instruction operations are carried out. Instructions are delayed in this step until all of their operands are available, eliminating RAW hazards. Program correctness is maintained through effective address calculation to prevent hazards through memory.
1421:
Proliferation of mass-market software meant that programmers would not want to compile for a specific pipeline structure. The algorithm can function with any pipeline architecture and thus software requires few architecture-specific modifications.
118:
Functional units can access the result of any operation without involving a floating-point-register, allowing multiple units waiting on a result to proceed without waiting to resolve contention for access to register file read
1558: 334:
Otherwise, we can assume the operands are not in the registers, and so use virtual values. The functional unit must calculate the real value to keep track of the functional units that produce the operand.
318:
In the issue stage, instructions are issued for execution if all operands and reservation stations are ready or else they are stalled. Registers are renamed in this step, eliminating WAR and WAW hazards.
1411:
Tomasulo's algorithm was implemented in the System/360 Model 91 architecture. Outside of IBM, it went unused for several years. However, it saw a vast increase in usage during the 1990s for 3 reasons:
114:
The Common Data Bus (CDB) connects reservation stations directly to functional units. According to Tomasulo it "preserves precedence while encouraging concurrency". This has two important effects:
218:
Practically speaking, there may be exceptions for which not enough status information about an exception is available, in which case the processor may raise a special exception, called an
1370:
The concepts of reservation stations, register renaming, and the common data bus in Tomasulo's algorithm presents significant advancements in the design of high-performance computers.
193:
raised by these instructions, occur in the same order as they would on an in-order processor, regardless of the fact that they are being executed out-of-order (i.e. non-sequentially).
206:
indicating which reservation station will produce the real value. When the unit finishes and broadcasts the result on the CDB, the placeholder will be replaced with the real value.
1399:
An equally important improvement in the algorithm is the design is not limited to a specific pipeline structure. This improvement allows the algorithm to be more widely adopted by
72:
for all execution units, and a common data bus (CDB) on which computed values broadcast to all reservation stations that may need them. These developments allow for improved
1415:
Once caches became commonplace, the algorithm's ability to maintain concurrency during unpredictable load times caused by cache misses became valuable in processors.
122:
Hazard Detection and control execution are distributed. The reservation stations control when an instruction can execute, rather than a single dedicated hazard unit.
1617: 1699: 245:
The three stages listed below are the stages through which each instruction passes from the time it is issued to the time its execution is complete.
1024:
If the result is available, then: write it on the CDB and from there into the registers and any reservation stations waiting for this result
1418:
Dynamic scheduling and branch speculation from the algorithm enables improved performance as processors issued more and more instructions.
323:
Retrieve the next instruction from the head of the instruction queue. If the instruction operands are currently in the registers, then
309:
Qi - the reservation station whose result should be stored in this register (if blank or 0, no values are destined for this register)
1382: 1426:
Many modern processors implement dynamic scheduling schemes that are variants of Tomasulo's original algorithm, including popular
1017:
In the write Result stage, ALU operations results are written back to registers and store operations are written back to memory.
1386: 154: 1377:
By tracking operands for instructions in the reservation stations and register renaming in hardware the algorithm minimizes
1572: 1640: 176: 1678: 290:
Qj, Qk - the reservation station that will produce the relevant source operand (0 indicates the value is in Vj, Vk)
1729: 1393: 1588: 928:
Else, if the instruction is a store then: wait for the value to be stored before sending it to the memory unit
1448: 916:
If one or more of the operands is not yet available then: wait for operand to become available on the CDB.
922:
Compute the effective address when the base register is available, and place it in the load/store buffer
329:
Else, as there is no available functional unit, stall the instruction until a station or buffer is free.
227: 158: 147: 1378: 87: 17: 1641:"Differences between Tomasulo's algorithm and dynamic scheduling in Intel Core microarchitecture" 1396:. This improves performance by reducing wasted time that would otherwise be required for stalls. 54: 189:
Instructions are issued sequentially so that the effects of a sequence of instructions, such as
1709: 223: 42: 1476: 1390: 936: 190: 34: 1522: 1403:
processors. Additionally, the algorithm is easily extended to enable branch speculation.
8: 1724: 69: 58: 106:
The following are the concepts necessary to the implementation of Tomasulo's algorithm:
1504: 73: 1568: 1496: 202: 65: 1029:
Else, if the instruction was a store then: write the data to memory during this step
898: 1508: 1488: 939:(ALU) operation then: execute the instruction at the corresponding functional unit 925:
If the instruction is a load then: execute as soon as the memory unit is available
1703: 1442: 83: 46: 45:
and enables more efficient use of multiple execution units. It was developed by
1479:(Jan 1967). "An Efficient Algorithm for Exploiting Multiple Arithmetic Units". 1400: 1718: 1500: 919:
When all operands are available, then: if the instruction is a load or store
77: 226:
implementations, as processor state is changed only in program order (see
1670: 257:
RegisterStat - Register Status; contains information about the registers.
1492: 98: 277:
r - reservation station or buffer that the instruction is assigned to
38: 1564: 296:
A - used to hold the memory address information for a load or store
326:
If a matching functional unit is available, issue the instruction.
1430: 1427: 1618:
Intel 64 and IA-32 Architectures Software Developer's Manual
76:
of instructions that would otherwise stall under the use of
287:
Op - represents the operation being performed on operands
50: 1710:
HASE Java applet simulation of the Tomasulo's algorithm
64:
The major innovations of Tomasulo’s algorithm include
157:. Please help to ensure that disputed statements are 41:for dynamic scheduling of instructions that allows 1556: 1716: 1557:Hennessy, John L.; Patterson, David A. (2012). 1560:Computer Architecture: A Quantitative Approach 281: 1552: 1550: 1548: 1546: 1544: 1471: 1469: 1467: 1465: 1406: 971:Compute result: operands are in Vj and Vk 93: 1700:Dynamic Scheduling - Tomasulo's Algorithm 1365: 303: 293:Vj, Vk - the value of the source operands 177:Learn how and when to remove this message 53:in 1967 and was first implemented in the 1475: 1021:If the instruction was an ALU operation 1012: 897: 240: 153:Relevant discussion may be found on the 97: 27:Computer architecture hardware algorithm 1541: 1481:IBM Journal of Research and Development 1462: 299:Busy - 1 if occupied, 0 if not occupied 228:Classic RISC pipeline § Exceptions 222:. Imprecise exceptions cannot occur in 90:in 1997 for his work on the algorithm. 14: 1717: 1671:"Pipelined and Out-of-Order Execution" 1668: 1632: 1610: 983:& r is head of load-store queue 271:rs, rt - source registration numbers 196: 130: 126: 906: 274:imm - sign extended immediate field 24: 1662: 263:Mem - Value of memory at address A 109: 25: 1741: 1706: (archived December 25, 2017) 1693: 313: 1638: 1523:"Robert Tomasulo – Award Winner" 268:rd - destination register number 135: 1681:from the original on 2018-07-03 1620:(Report). Intel. September 2014 902:Example of Tomasulo's algorithm 1581: 1515: 102:Tomasulo's floating point unit 13: 1: 1598:. Washington University. 2006 1455: 1449:Instruction-level parallelism 213: 80:or other earlier algorithms. 1669:Savard, John J. G. (2018) . 965:(RS.Qj = 0) and (RS.Qk = 0) 935:Else, the instruction is an 7: 1436: 10: 1746: 282:Reservation Station Fields 260:regs - Value of register x 201:Tomasulo's algorithm uses 248: 233:Programs that experience 1311: 1060: 986: 963: 793: 762: 614: 365: 1407:Applications and legacy 254:RS - Reservation Status 94:Implementation concepts 55:IBM System/360 Model 91 1730:Instruction processing 1477:Tomasulo, Robert Marco 1366:Algorithm improvements 1305:Execution complete at 1053:Execution complete at 1045:Action or bookkeeping 955:Action or bookkeeping 903: 350:Action or bookkeeping 304:Register Status Fields 103: 43:out-of-order execution 1589:"CSE P548 - Tomasulo" 1391:computer architecture 1381:(RAW) and eliminates 1050:FP operation or load 1013:Stage 3: write result 998:Load step 1 complete 988:RS.A ← RS.Vj + RS.A; 937:arithmetic logic unit 901: 241:Instruction lifecycle 101: 35:computer architecture 1057:& CDB available 146:factual accuracy is 88:Eckert–Mauchly Award 70:reservation stations 31:Tomasulo's algorithm 1493:10.1147/rd.111.0025 1035: 945: 340: 220:imprecise exception 59:floating point unit 1039:Instruction state 1033: 977:Load/store step 1 949:Instruction state 943: 904: 344:Instruction state 338: 235:precise exceptions 104: 74:parallel execution 1487:(1). IBM: 25–33. 1383:write-after-write 1363: 1362: 1307:r & RS.Qk = 0 1010: 1009: 896: 895: 203:register renaming 197:Register renaming 187: 186: 179: 127:Instruction order 66:register renaming 16:(Redirected from 1737: 1689: 1687: 1686: 1656: 1655: 1653: 1651: 1636: 1630: 1629: 1627: 1625: 1614: 1608: 1607: 1605: 1603: 1593: 1585: 1579: 1578: 1554: 1539: 1538: 1536: 1534: 1519: 1513: 1512: 1473: 1387:Write-after-Read 1379:read-after-write 1357: 1354: 1351: 1348: 1345: 1342: 1339: 1336: 1333: 1330: 1327: 1324: 1321: 1318: 1315: 1308: 1295: 1292: 1289: 1286: 1283: 1280: 1277: 1274: 1271: 1268: 1265: 1262: 1259: 1256: 1253: 1250: 1247: 1244: 1241: 1238: 1235: 1232: 1229: 1226: 1223: 1220: 1217: 1214: 1211: 1208: 1205: 1202: 1199: 1196: 1193: 1190: 1187: 1184: 1181: 1178: 1175: 1172: 1169: 1166: 1163: 1160: 1157: 1154: 1151: 1148: 1145: 1142: 1139: 1136: 1133: 1130: 1127: 1124: 1121: 1118: 1115: 1112: 1109: 1106: 1103: 1100: 1097: 1094: 1091: 1088: 1085: 1082: 1079: 1076: 1073: 1070: 1067: 1064: 1056: 1036: 1032: 1005: 982: 946: 942: 907:Stage 2: execute 890: 887: 884: 881: 878: 875: 872: 869: 866: 863: 860: 857: 854: 851: 848: 845: 842: 839: 836: 833: 830: 827: 824: 821: 818: 815: 812: 809: 806: 803: 800: 797: 781: 778: 775: 772: 769: 766: 750: 747: 744: 741: 738: 735: 732: 729: 726: 723: 720: 717: 714: 711: 708: 705: 702: 699: 696: 693: 690: 687: 684: 681: 678: 675: 672: 669: 666: 663: 660: 657: 654: 651: 648: 645: 642: 639: 636: 633: 630: 627: 624: 621: 618: 610: 597: 594: 591: 588: 585: 582: 579: 576: 573: 570: 567: 564: 561: 558: 555: 552: 549: 546: 543: 540: 537: 534: 531: 528: 525: 522: 519: 516: 513: 510: 507: 504: 501: 498: 495: 492: 489: 486: 483: 480: 477: 474: 471: 468: 465: 462: 459: 456: 453: 450: 447: 444: 441: 438: 435: 432: 429: 426: 423: 420: 417: 414: 411: 408: 405: 402: 399: 396: 393: 390: 387: 384: 381: 378: 375: 372: 369: 361: 341: 337: 182: 175: 171: 168: 162: 159:reliably sourced 139: 138: 131: 21: 1745: 1744: 1740: 1739: 1738: 1736: 1735: 1734: 1715: 1714: 1704:Wayback Machine 1696: 1684: 1682: 1665: 1663:Further reading 1660: 1659: 1649: 1647: 1637: 1633: 1623: 1621: 1616: 1615: 1611: 1601: 1599: 1591: 1587: 1586: 1582: 1575: 1563:. Waltham, MA: 1555: 1542: 1532: 1530: 1521: 1520: 1516: 1474: 1463: 1458: 1443:Re-order buffer 1439: 1409: 1368: 1359: 1358: 1355: 1352: 1349: 1346: 1343: 1340: 1337: 1334: 1331: 1328: 1325: 1322: 1319: 1316: 1313: 1306: 1297: 1296: 1293: 1290: 1287: 1284: 1281: 1278: 1275: 1272: 1269: 1266: 1263: 1260: 1257: 1254: 1251: 1248: 1245: 1242: 1239: 1236: 1233: 1230: 1227: 1224: 1221: 1218: 1215: 1212: 1209: 1206: 1203: 1200: 1197: 1194: 1191: 1188: 1185: 1182: 1179: 1176: 1173: 1170: 1167: 1164: 1161: 1158: 1155: 1152: 1149: 1146: 1143: 1140: 1137: 1134: 1131: 1128: 1125: 1122: 1119: 1116: 1113: 1110: 1107: 1104: 1101: 1098: 1095: 1092: 1089: 1086: 1083: 1080: 1077: 1074: 1071: 1068: 1065: 1062: 1054: 1015: 1003: 990: 989: 980: 967: 966: 909: 892: 891: 888: 885: 882: 879: 876: 873: 870: 867: 864: 861: 858: 855: 852: 849: 846: 843: 840: 837: 834: 831: 828: 825: 822: 819: 816: 813: 810: 807: 804: 801: 798: 795: 783: 782: 779: 776: 773: 770: 767: 764: 752: 751: 748: 745: 742: 739: 736: 733: 730: 727: 724: 721: 718: 715: 712: 709: 706: 703: 700: 697: 694: 691: 688: 685: 682: 679: 676: 673: 670: 667: 664: 661: 658: 655: 652: 649: 646: 643: 640: 637: 634: 631: 628: 625: 622: 619: 616: 608: 599: 598: 595: 592: 589: 586: 583: 580: 577: 574: 571: 568: 565: 562: 559: 556: 553: 550: 547: 544: 541: 538: 535: 532: 529: 526: 523: 520: 517: 514: 511: 508: 505: 502: 499: 496: 493: 490: 487: 484: 481: 478: 475: 472: 469: 466: 463: 460: 457: 454: 451: 448: 445: 442: 439: 436: 433: 430: 427: 424: 421: 418: 415: 412: 409: 406: 403: 400: 397: 394: 391: 388: 385: 382: 379: 376: 373: 370: 367: 359: 316: 306: 284: 251: 243: 216: 199: 183: 172: 166: 163: 152: 144:This section's 140: 136: 129: 112: 110:Common data bus 96: 84:Robert Tomasulo 47:Robert Tomasulo 28: 23: 22: 15: 12: 11: 5: 1743: 1733: 1732: 1727: 1713: 1712: 1707: 1695: 1694:External links 1692: 1691: 1690: 1664: 1661: 1658: 1657: 1639:Yoga, Adarsh. 1631: 1609: 1596:washington.edu 1580: 1574:978-0123838728 1573: 1540: 1514: 1460: 1459: 1457: 1454: 1453: 1452: 1446: 1438: 1435: 1424: 1423: 1419: 1416: 1408: 1405: 1401:multiple-issue 1367: 1364: 1361: 1360: 1312: 1309: 1303: 1299: 1298: 1061: 1058: 1051: 1047: 1046: 1043: 1040: 1031: 1030: 1027: 1026: 1025: 1014: 1011: 1008: 1007: 999: 996: 992: 991: 987: 984: 978: 974: 973: 968: 964: 961: 957: 956: 953: 950: 941: 940: 933: 932: 931: 930: 929: 926: 917: 908: 905: 894: 893: 794: 791: 789: 785: 784: 763: 760: 758: 754: 753: 615: 612: 605: 604:Load or Store 601: 600: 366: 363: 356: 352: 351: 348: 345: 336: 335: 332: 331: 330: 327: 315: 314:Stage 1: issue 312: 311: 310: 305: 302: 301: 300: 297: 294: 291: 288: 283: 280: 279: 278: 275: 272: 269: 265: 264: 261: 258: 255: 250: 247: 242: 239: 215: 212: 198: 195: 185: 184: 143: 141: 134: 128: 125: 124: 123: 120: 111: 108: 95: 92: 26: 9: 6: 4: 3: 2: 1742: 1731: 1728: 1726: 1723: 1722: 1720: 1711: 1708: 1705: 1701: 1698: 1697: 1680: 1676: 1672: 1667: 1666: 1646: 1642: 1635: 1619: 1613: 1597: 1590: 1584: 1576: 1570: 1566: 1562: 1561: 1553: 1551: 1549: 1547: 1545: 1528: 1524: 1518: 1510: 1506: 1502: 1498: 1494: 1490: 1486: 1482: 1478: 1472: 1470: 1468: 1466: 1461: 1450: 1447: 1444: 1441: 1440: 1434: 1432: 1429: 1420: 1417: 1414: 1413: 1412: 1404: 1402: 1397: 1395: 1392: 1388: 1384: 1380: 1375: 1371: 1310: 1304: 1301: 1300: 1059: 1052: 1049: 1048: 1044: 1041: 1038: 1037: 1028: 1023: 1022: 1020: 1019: 1018: 1006: 1000: 997: 994: 993: 985: 979: 976: 975: 972: 969: 962: 960:FP operation 959: 958: 954: 951: 948: 947: 938: 934: 927: 924: 923: 921: 920: 918: 915: 914: 913: 900: 792: 790: 787: 786: 761: 759: 756: 755: 613: 606: 603: 602: 364: 357: 355:FP operation 354: 353: 349: 346: 343: 342: 333: 328: 325: 324: 322: 321: 320: 308: 307: 298: 295: 292: 289: 286: 285: 276: 273: 270: 267: 266: 262: 259: 256: 253: 252: 246: 238: 236: 231: 229: 225: 221: 211: 207: 204: 194: 192: 181: 178: 170: 167:December 2023 160: 156: 150: 149: 142: 133: 132: 121: 117: 116: 115: 107: 100: 91: 89: 86:received the 85: 81: 79: 78:scoreboarding 75: 71: 68:in hardware, 67: 62: 60: 56: 52: 48: 44: 40: 36: 32: 19: 1683:. Retrieved 1674: 1648:. Retrieved 1644: 1634: 1622:. Retrieved 1612: 1600:. Retrieved 1595: 1583: 1559: 1531:. Retrieved 1526: 1517: 1484: 1480: 1425: 1410: 1398: 1376: 1372: 1369: 1111:RegisterStat 1078:RegisterStat 1034:Pseudocode 1016: 1001: 995:Load step 2 970: 944:Pseudocode 910: 835:RegisterStat 802:RegisterStat 765:RegisterStat 656:RegisterStat 623:RegisterStat 581:RegisterStat 503:RegisterStat 470:RegisterStat 407:RegisterStat 374:RegisterStat 317: 244: 234: 232: 219: 217: 208: 200: 188: 173: 164: 145: 113: 105: 82: 63: 30: 29: 1645:The boozier 1042:Wait until 952:Wait until 788:Store only 347:Wait until 339:Pseudocode 1725:Algorithms 1719:Categories 1685:2018-07-16 1624:8 December 1602:8 December 1533:8 December 1527:ACM Awards 1456:References 1385:(WAW) and 1002:Read from 757:Load only 214:Exceptions 191:exceptions 1675:quadibloc 1501:0018-8646 981:RS.Qj = 0 155:talk page 39:algorithm 37:hardware 1679:Archived 1565:Elsevier 1437:See also 358:Station 224:in-order 148:disputed 18:Tomasulo 1702:at the 1650:4 April 1509:8445049 1433:chips. 1394:hazards 607:Buffer 1571:  1507:  1499:  1431:x86-64 1389:(WAR) 1302:Store 1252:result 1177:result 1105:result 1004:Mem.A] 611:empty 362:empty 249:Legend 119:ports. 1592:(PDF) 1529:. ACM 1505:S2CID 1451:(ILP) 1445:(ROB) 1428:Intel 33:is a 1652:2016 1626:2014 1604:2014 1569:ISBN 1535:2014 1497:ISSN 1347:Busy 1285:Busy 1099:regs 868:Regs 850:else 740:Busy 689:Regs 671:else 569:Busy 536:Regs 518:else 437:Regs 419:else 1489:doi 1314:Mem 1276:}); 1201:}); 1126:}); 746:yes 728:imm 575:yes 230:). 57:’s 51:IBM 49:at 1721:: 1677:. 1673:. 1643:. 1594:. 1567:. 1543:^ 1525:. 1503:. 1495:. 1485:11 1483:. 1464:^ 1353:no 1341:RS 1335:Vk 1329:RS 1291:no 1279:RS 1264:Qk 1258:RS 1246:Vk 1240:RS 1225:Qk 1219:RS 1213:if 1189:Qj 1183:RS 1171:Vj 1165:RS 1150:Qj 1144:RS 1138:if 1117:Qi 1084:Qi 1072:if 889:}; 880:Qk 874:RS 862:Vk 856:RS 841:Qi 829:Qk 823:RS 808:Qi 796:if 771:Qi 734:RS 716:RS 701:Qj 695:RS 683:Vj 677:RS 662:Qi 650:Qj 644:RS 629:Qi 617:if 587:Qi 563:RS 548:Qk 542:RS 530:Vk 524:RS 509:Qi 497:Qk 491:RS 476:Qi 464:if 449:Qj 443:RS 431:Vj 425:RS 413:Qi 401:Qj 395:RS 380:Qi 368:if 61:. 1688:. 1654:. 1628:. 1606:. 1577:. 1537:. 1511:. 1491:: 1356:; 1350:← 1344:. 1338:; 1332:. 1326:← 1323:] 1320:A 1317:. 1294:; 1288:← 1282:. 1273:; 1270:0 1267:← 1261:. 1255:; 1249:← 1243:. 1237:{ 1234:) 1231:r 1228:= 1222:. 1216:( 1210:( 1207:x 1204:∀ 1198:; 1195:0 1192:← 1186:. 1180:; 1174:← 1168:. 1162:{ 1159:) 1156:r 1153:= 1147:. 1141:( 1135:( 1132:x 1129:∀ 1123:0 1120:= 1114:. 1108:; 1102:← 1096:{ 1093:) 1090:r 1087:= 1081:. 1075:( 1069:( 1066:x 1063:∀ 1055:r 886:0 883:← 877:. 871:; 865:← 859:. 853:{ 847:} 844:; 838:. 832:← 826:. 820:{ 817:) 814:0 811:¦ 805:. 799:( 780:; 777:r 774:← 768:. 749:; 743:← 737:. 731:; 725:← 722:A 719:. 713:} 710:; 707:0 704:← 698:. 692:; 686:← 680:. 674:{ 668:} 665:; 659:. 653:← 647:. 641:{ 638:) 635:0 632:¦ 626:. 620:( 609:r 596:; 593:r 590:← 584:. 578:; 572:← 566:. 560:} 557:; 554:0 551:← 545:. 539:; 533:← 527:. 521:{ 515:} 512:; 506:. 500:← 494:. 488:{ 485:) 482:0 479:¦ 473:. 467:( 461:} 458:; 455:0 452:← 446:. 440:; 434:← 428:. 422:{ 416:} 410:. 404:← 398:. 392:{ 389:) 386:0 383:¦ 377:. 371:( 360:r 180:) 174:( 169:) 165:( 161:. 151:. 20:)

Index

Tomasulo
computer architecture
algorithm
out-of-order execution
Robert Tomasulo
IBM
IBM System/360 Model 91
floating point unit
register renaming
reservation stations
parallel execution
scoreboarding
Robert Tomasulo
Eckert–Mauchly Award

disputed
talk page
reliably sourced
Learn how and when to remove this message
exceptions
register renaming
in-order
Classic RISC pipeline § Exceptions

arithmetic logic unit
read-after-write
write-after-write
Write-after-Read
computer architecture
hazards

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.