ARM big.LITTLE

ARM big.LITTLE is a heterogeneous computing architecture developed by Arm Holdings, coupling relatively battery-saving and slower processor cores (LITTLE) with relatively more powerful and power-hungry ones (big). The intention is to create a multi-core processor that can adjust better to dynamic computing needs and use less power than clock scaling alone. ARM's marketing material promises up to a 75% savings in power usage for some activities. Most commonly, ARM big.LITTLE architectures are used to create a multi-processor system-on-chip (MPSoC).

[Figure: Cortex A57/A53 MPCore big.LITTLE CPU chip]

In October 2011, big.LITTLE was announced along with the Cortex-A7, which was designed to be architecturally compatible with the Cortex-A15. In October 2012 ARM announced the Cortex-A53 and Cortex-A57 (ARMv8-A) cores, which are also intercompatible to allow their use in a big.LITTLE chip. ARM later announced the Cortex-A12 at Computex 2013 followed by the Cortex-A17 in February 2014. Both the Cortex-A12 and the Cortex-A17 can also be paired in a big.LITTLE configuration with the Cortex-A7.

The problem that big.LITTLE solves

For a given library of CMOS logic, active power increases as the logic switches more per second, while leakage increases with the number of transistors. So, CPUs designed to run fast are different from CPUs designed to save power. When a very fast out-of-order CPU is idling at very low speeds, a CPU with much less leakage (fewer transistors) could do the same work. For example, it might use a smaller (fewer transistors) memory cache, or a simpler microarchitecture such as removing out-of-order execution. big.LITTLE is a way to optimize for both cases, power and speed, in the same system.

In practice, a big.LITTLE system can be surprisingly inflexible. One issue is the number and types of power and clock domains that the IC provides. These may not match the standard power management features offered by an operating system. Another is that the CPUs no longer have equivalent abilities, and matching the right software task to the right CPU becomes more difficult. Most of these problems are being solved by making the electronics and software more flexible.

Run-state migration

There are three ways for the different processor cores to be arranged in a big.LITTLE design, depending on the scheduler implemented in the kernel.

Clustered switching

[Figure: big.LITTLE clustered switching]

The clustered model approach is the first and simplest implementation, arranging the processor into identically sized clusters of "big" or "LITTLE" cores. The operating system scheduler can only see one cluster at a time; when the load on the whole processor changes between low and high, the system transitions to the other cluster. All relevant data are then passed through the common L2 cache, the active core cluster is powered off and the other one is activated. A Cache Coherent Interconnect (CCI) is used. This model has been implemented in the Samsung Exynos 5 Octa (5410).

In-kernel switcher (CPU migration)

[Figure: big.LITTLE in-kernel switcher]

CPU migration via the in-kernel switcher (IKS) involves pairing up a 'big' core with a 'LITTLE' core, with possibly many identical pairs in one chip. Each pair operates as one so-termed virtual core, and only one real core is (fully) powered up and running at a time. The 'big' core is used when the demand is high and the 'LITTLE' core is employed when demand is low. When demand on the virtual core changes (between high and low), the incoming core is powered up, running state is transferred, the outgoing is shut down, and processing continues on the new core. Switching is done via the cpufreq framework. A complete big.LITTLE IKS implementation was added in Linux 3.11. big.LITTLE IKS is an improvement of cluster migration (see Clustered switching), the main difference being that each pair is visible to the scheduler.

A more complex arrangement involves a non-symmetric grouping of 'big' and 'LITTLE' cores. A single chip could have one or two 'big' cores and many more 'LITTLE' cores, or vice versa. Nvidia created something similar to this with the low-power 'companion core' in their Tegra 3 System-on-Chip.

Heterogeneous multi-processing (global task scheduling)

[Figure: big.LITTLE heterogeneous multi-processing]

The most powerful use model of big.LITTLE architecture is heterogeneous multi-processing (HMP), which enables the use of all physical cores at the same time. Threads with high priority or computational intensity can in this case be allocated to the "big" cores while threads with less priority or less computational intensity, such as background tasks, can be performed by the "LITTLE" cores.

This model has been implemented in the Samsung Exynos starting with the Exynos 5 Octa series (5420, 5422, 5430), and Apple A series processors starting with the Apple A11.

Scheduling

The paired arrangement allows for switching to be done transparently to the operating system using the existing dynamic voltage and frequency scaling (DVFS) facility. The existing DVFS support in the kernel (e.g. cpufreq in Linux) will simply see a list of frequencies/voltages and will switch between them as it sees fit, just like it does on the existing hardware. However, the low-end slots will activate the 'LITTLE' core and the high-end slots will activate the 'big' core. This is the early solution provided by Linux's "deadline" CPU scheduler (not to be confused with the I/O scheduler with the same name) since 2012.

Alternatively, all the cores may be exposed to the kernel scheduler, which will decide where each process/thread is executed. This will be required for the non-paired arrangement but could possibly also be used on the paired cores. It poses unique problems for the kernel scheduler, which, at least with modern commodity hardware, has been able to assume all cores in a SMP system are equal rather than heterogeneous. A 2019 addition to Linux 5.0 called Energy Aware Scheduling is an example of a scheduler that considers cores differently.

Advantages of global task scheduling

Finer-grained control of workloads that are migrated between cores. Because the scheduler is directly migrating tasks between cores, kernel overhead is reduced and power savings can be correspondingly increased.
Implementation in the scheduler also makes switching decisions faster than in the cpufreq framework implemented in IKS.
The ability to easily support non-symmetrical clusters (e.g. with 2 Cortex-A15 cores and 4 Cortex-A7 cores).
The ability to use all cores simultaneously to provide improved peak performance throughput of the SoC compared to IKS.

Successor

In May 2017, ARM announced DynamIQ as the successor to big.LITTLE. DynamIQ is expected to allow for more flexibility and scalability when designing multi-core processors. In contrast to big.LITTLE, it increases the maximum number of cores in a cluster to 8 for Armv8.2 CPUs, 12 for Armv9 and 14 for Armv9.2 and allows for varying core designs within a single cluster, and up to 32 total clusters. The technology also offers more fine-grained per-core voltage control and faster L2 cache speeds. However, DynamIQ is incompatible with previous ARM designs and is initially only supported by the Cortex-A75 and Cortex-A55 CPU cores and their successors.
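
The in-kernel switcher model, in which a DVFS governor sees one list of frequencies while the low-end slots run on the 'LITTLE' core and the high-end slots on the 'big' core, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the frequency values, the `OperatingPoint` and `VirtualCore` names, and the "lowest slot that meets demand" policy are all hypothetical and are not taken from the Linux cpufreq code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One slot in the frequency list exposed for a virtual core."""
    freq_mhz: int
    core: str  # which physical core backs this slot: "LITTLE" or "big"


# Hypothetical table: low-end slots map to LITTLE, high-end slots to big.
VIRTUAL_OPPS = [
    OperatingPoint(250, "LITTLE"),
    OperatingPoint(500, "LITTLE"),
    OperatingPoint(800, "LITTLE"),
    OperatingPoint(1000, "big"),
    OperatingPoint(1400, "big"),
    OperatingPoint(1800, "big"),
]


class VirtualCore:
    """A big/LITTLE pair presented as a single core (illustrative only)."""

    def __init__(self, opps):
        self.opps = sorted(opps, key=lambda p: p.freq_mhz)
        self.current = self.opps[0]  # start on the slowest slot
        self.migrations = 0          # number of big<->LITTLE switches so far

    def set_target(self, demand_mhz):
        """Pick the lowest slot that meets demand, else the fastest slot."""
        target = next(
            (p for p in self.opps if p.freq_mhz >= demand_mhz),
            self.opps[-1],
        )
        if target.core != self.current.core:
            # In hardware: power up the incoming core, transfer running
            # state, then shut down the outgoing core. Here we only count it.
            self.migrations += 1
        self.current = target
        return target
```

The caller only ever asks for a frequency; whether that request causes a core migration is hidden inside `set_target`, which is the sense in which the switch is transparent to the rest of the system.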

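Heterogeneous multi-processing, where every physical core is visible and threads with high priority or computational intensity go to "big" cores while lighter threads go to "LITTLE" cores, can be sketched as a simple placement function. This is a toy model, not how the Linux scheduler or Energy Aware Scheduling works: the `place_threads` name, the 0.6 load threshold, and the least-loaded-core rule are assumptions made for illustration, and both core lists are assumed to be non-empty.

```python
def place_threads(threads, big_cores, little_cores, load_threshold=0.6):
    """Assign each thread to a core name.

    threads: iterable of (name, load, high_priority), with load in 0..1.
    High-priority or heavy threads use the big pool; the rest use LITTLE.
    Within a pool, the currently least-loaded core is chosen.
    """
    core_load = {core: 0.0 for core in (*big_cores, *little_cores)}
    placement = {}
    # Place the heaviest threads first so they spread across cores.
    for name, load, high_priority in sorted(threads, key=lambda t: -t[1]):
        wants_big = high_priority or load >= load_threshold
        pool = big_cores if wants_big else little_cores
        core = min(pool, key=lambda c: core_load[c])
        core_load[core] += load
        placement[name] = core
    return placement
```

Because both pools are in use at once, a heavy thread and several background threads can run in parallel, which is the peak-throughput advantage over the in-kernel switcher, where only one core of each pair is active.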
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.