ARM big.LITTLE

ARM big.LITTLE is a heterogeneous computing architecture developed by Arm Holdings, coupling relatively battery-saving and slower processor cores (LITTLE) with relatively more powerful and power-hungry ones (big). The intention is to create a multi-core processor that can adjust better to dynamic computing needs and use less power than clock scaling alone. ARM's marketing material promises up to a 75% savings in power usage for some activities. Most commonly, ARM big.LITTLE architectures are used to create a multi-processor system-on-chip (MPSoC).

Figure: Cortex A57/A53 MPCore big.LITTLE CPU chip

In October 2011, big.LITTLE was announced along with the Cortex-A7, which was designed to be architecturally compatible with the Cortex-A15. In October 2012 ARM announced the Cortex-A53 and Cortex-A57 (ARMv8-A) cores, which are also intercompatible to allow their use in a big.LITTLE chip. ARM later announced the Cortex-A12 at Computex 2013 followed by the Cortex-A17 in February 2014. Both the Cortex-A12 and the Cortex-A17 can also be paired in a big.LITTLE configuration with the Cortex-A7.

The problem that big.LITTLE solves

For a given library of CMOS logic, active power increases as the logic switches more per second, while leakage increases with the number of transistors. So, CPUs designed to run fast are different from CPUs designed to save power. When a very fast out-of-order CPU is idling at very low speeds, a CPU with much less leakage (fewer transistors) could do the same work. For example, it might use a smaller (fewer transistors) memory cache, or a simpler microarchitecture such as removing out-of-order execution. big.LITTLE is a way to optimize for both cases: power and speed, in the same system.

In practice, a big.LITTLE system can be surprisingly inflexible. One issue is the number and types of power and clock domains that the IC provides. These may not match the standard power management features offered by an operating system. Another is that the CPUs no longer have equivalent abilities, and matching the right software task to the right CPU becomes more difficult. Most of these problems are being solved by making the electronics and software more flexible.

Run-state migration

There are three ways for the different processor cores to be arranged in a big.LITTLE design, depending on the scheduler implemented in the kernel.

Clustered switching

Figure: big.LITTLE clustered switching

The clustered model approach is the first and simplest implementation, arranging the processor into identically sized clusters of "big" or "LITTLE" cores. The operating system scheduler can only see one cluster at a time; when the load on the whole processor changes between low and high, the system transitions to the other cluster. All relevant data are then passed through the common L2 cache, the active core cluster is powered off and the other one is activated. A Cache Coherent Interconnect (CCI) is used. This model has been implemented in the Samsung Exynos 5 Octa (5410).

In-kernel switcher (CPU migration)

Figure: big.LITTLE in-kernel switcher

CPU migration via the in-kernel switcher (IKS) involves pairing up a 'big' core with a 'LITTLE' core, with possibly many identical pairs in one chip. Each pair operates as one so-termed virtual core, and only one real core is (fully) powered up and running at a time. The 'big' core is used when the demand is high and the 'LITTLE' core is employed when demand is low. When demand on the virtual core changes (between high and low), the incoming core is powered up, running state is transferred, the outgoing core is shut down, and processing continues on the new core. Switching is done via the cpufreq framework. A complete big.LITTLE IKS implementation was added in Linux 3.11. big.LITTLE IKS is an improvement of cluster migration (see Clustered switching), the main difference being that each pair is visible to the scheduler.

A more complex arrangement involves a non-symmetric grouping of 'big' and 'LITTLE' cores. A single chip could have one or two 'big' cores and many more 'LITTLE' cores, or vice versa. Nvidia created something similar to this with the low-power 'companion core' in their Tegra 3 System-on-Chip.

Heterogeneous multi-processing (global task scheduling)

Figure: big.LITTLE heterogeneous multi-processing

The most powerful use model of big.LITTLE architecture is heterogeneous multi-processing (HMP), which enables the use of all physical cores at the same time. Threads with high priority or computational intensity can in this case be allocated to the "big" cores while threads with less priority or less computational intensity, such as background tasks, can be performed by the "LITTLE" cores.

This model has been implemented in the Samsung Exynos starting with the Exynos 5 Octa series (5420, 5422, 5430), and Apple A series processors starting with the Apple A11.

Scheduling

The paired arrangement allows for switching to be done transparently to the operating system using the existing dynamic voltage and frequency scaling (DVFS) facility. The existing DVFS support in the kernel (e.g. cpufreq in Linux) will simply see a list of frequencies/voltages and will switch between them as it sees fit, just like it does on the existing hardware. However, the low-end slots will activate the 'LITTLE' core and the high-end slots will activate the 'big' core. This is the early solution provided by Linux's "deadline" CPU scheduler (not to be confused with the I/O scheduler with the same name) since 2012.

Alternatively, all the cores may be exposed to the kernel scheduler, which will decide where each process/thread is executed. This will be required for the non-paired arrangement but could possibly also be used on the paired cores. It poses unique problems for the kernel scheduler, which, at least with modern commodity hardware, has been able to assume all cores in an SMP system are equal rather than heterogeneous. A 2019 addition to Linux 5.0 called Energy Aware Scheduling is an example of a scheduler that considers cores differently.

Advantages of global task scheduling

- Finer-grained control of workloads that are migrated between cores. Because the scheduler is directly migrating tasks between cores, kernel overhead is reduced and power savings can be correspondingly increased.
- Implementation in the scheduler also makes switching decisions faster than in the cpufreq framework implemented in IKS.
- The ability to easily support non-symmetrical clusters (e.g. with 2 Cortex-A15 cores and 4 Cortex-A7 cores).
- The ability to use all cores simultaneously to provide improved peak performance throughput of the SoC compared to IKS.

Successor

In May 2017, ARM announced DynamIQ as the successor to big.LITTLE. DynamIQ is expected to allow for more flexibility and scalability when designing multi-core processors. In contrast to big.LITTLE, it increases the maximum number of cores in a cluster to 8 for Armv8.2 CPUs, 12 for Armv9 and 14 for Armv9.2 and allows for varying core designs within a single cluster, and up to 32 total clusters. The technology also offers more fine-grained per-core voltage control and faster L2 cache speeds. However, DynamIQ is incompatible with previous ARM designs and is initially only supported by the Cortex-A75 and Cortex-A55 CPU cores and their successors.
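
As a rough illustration of the two scheduling approaches described in this article, the Python sketch below contrasts them. The first part models an in-kernel-switcher virtual core whose frequency list is split into low-end slots (backed by the LITTLE core) and high-end slots (backed by the big core). The second part models heterogeneous multi-processing, where each thread is placed on a core type by priority or computational intensity. Everything here is an assumption made for illustration: the frequency values, the linear load-to-frequency rule, the thresholds, and the names `iks_pick_slot`, `iks_core_for_frequency`, `Thread`, and `hmp_place` are hypothetical and do not correspond to any Linux kernel interface.

```python
from dataclasses import dataclass

# Hypothetical frequency slots (MHz) exposed by one IKS virtual core.
# Assumed values: the low-end slots belong to the LITTLE core and the
# high-end slots belong to the big core.
LITTLE_FREQS_MHZ = [250, 400, 600]
BIG_FREQS_MHZ = [800, 1200, 1600]
VIRTUAL_FREQS_MHZ = LITTLE_FREQS_MHZ + BIG_FREQS_MHZ


def iks_pick_slot(load: float) -> int:
    """Pick the lowest slot whose share of the top frequency covers the load.

    `load` is a relative demand between 0.0 and 1.0. Treating capacity as
    proportional to frequency is a simplifying assumption of this sketch.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    top = VIRTUAL_FREQS_MHZ[-1]
    for freq in VIRTUAL_FREQS_MHZ:
        if freq / top >= load:
            return freq
    return top


def iks_core_for_frequency(freq_mhz: int) -> str:
    """Return which physical core backs the given virtual-core slot."""
    if freq_mhz not in VIRTUAL_FREQS_MHZ:
        raise ValueError(f"{freq_mhz} MHz is not a known slot")
    return "LITTLE" if freq_mhz in LITTLE_FREQS_MHZ else "big"


@dataclass
class Thread:
    name: str
    priority: int      # assumed convention: higher means more important
    intensity: float   # assumed 0.0-1.0 estimate of computational intensity


def hmp_place(thread: Thread,
              priority_threshold: int = 5,
              intensity_threshold: float = 0.6) -> str:
    """Place a thread on a core type, as a global task scheduler might.

    High-priority or compute-intensive threads go to a big core; everything
    else (for example background tasks) goes to a LITTLE core.
    """
    if (thread.priority >= priority_threshold
            or thread.intensity >= intensity_threshold):
        return "big"
    return "LITTLE"


if __name__ == "__main__":
    for load in (0.2, 0.9):
        slot = iks_pick_slot(load)
        print(f"load {load}: slot {slot} MHz on {iks_core_for_frequency(slot)}")
    for t in (Thread("ui", 8, 0.1), Thread("sync", 1, 0.2)):
        print(f"thread {t.name}: {hmp_place(t)}")
```

In the IKS part, the frequency governor only ever chooses a slot, and the core switch is a side effect of which slot was chosen. In the HMP part, the decision is made per thread, so both core types can be busy at the same time.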
