
Sawzall (programming language)


Developer: Google
First appeared: 2003
License: Apache License 2.0
Website: code.google.com/archive/p/szl/

Sawzall is a procedural domain-specific programming language, used by Google to process large numbers of individual log records. Sawzall was first described in 2003, and the szl runtime was open-sourced in August 2010. However, since the MapReduce table aggregators have not been released, the open-sourced runtime is not useful for large-scale data analysis of multiple log files off the shelf. Sawzall has been replaced by Lingo (logs in Go) for most purposes within Google.

Motivation

Google's server logs are stored as large collections of records (Protocol Buffers) that are partitioned over many disks within GFS. In order to perform calculations involving the logs, engineers can write MapReduce programs in C++ or Java. MapReduce programs need to be compiled and may be more verbose than necessary, so writing a program to analyze the logs can be time-consuming. To make it easier to write quick scripts, Rob Pike et al. developed the Sawzall language. A Sawzall script runs within the Map phase of a MapReduce and "emits" values to tables. Then the Reduce phase (which the script writer does not have to be concerned about) aggregates the tables from multiple runs into a single set of tables.

Currently, only the language runtime (which runs a Sawzall script once over a single input) has been open-sourced; the supporting program built on MapReduce has not been released.

Features

Some interesting features include:

- A Sawzall script has a single input (a log record) and can output only by emitting to tables. The script can have no other side-effects.
- A script can define any number of output tables. Table types include:
  - collection saves every value emitted
  - sum saves the sum of every emitted value
  - maximum(n) saves only the highest n values on a given weight.
- In addition, there are several statistical table types that give inexact results. The higher the parameter n, the more accurate the estimates are.
  - sample(n) gives a random sample of n values from all the emitted values
  - quantile(n) calculates a cumulative probability distribution of the given numbers.
  - top(n) gives n values that are probably the most frequent of the emitted values.
  - unique(n) estimates the number of unique values emitted.

Sawzall's design favors efficiency and engine simplicity over power:

- Sawzall is statically typed, and the engine compiles the script to x86 before running it.
- Sawzall supports the compound data types lists, maps, and structs. However, there are no references or pointers. All assignments and function arguments create copies. This means that recursive data structures and cycles are impossible.
- Like C, functions can modify global variables and local variables but are not closures.

Sawzall code

This complete Sawzall program will read the input and produce three results: the number of records, the sum of the values, and the sum of the squares of the values.

    count: table sum of int;
    total: table sum of float;
    sum_of_squares: table sum of float;
    x: float = input;
    emit count <- 1;
    emit total <- x;
    emit sum_of_squares <- x * x;

See also

- Pig: similar tool and language for use with Apache Hadoop
- Sawmill (software)

References

1. Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan. Interpreting the Data: Parallel Analysis with Sawzall
2. Sawzall's open source project at Google Code
3. Discussion on which parts of Sawzall are open-source
4. "Replacing Sawzall". 2015-12-04. Retrieved 2018-06-18.

Further reading

- S. Ghemawat, H. Gobioff, S.-T. Leung, The Google file system, in: 19th ACM Symposium on Operating Systems Principles, Proceedings, 17 ACM Press, 2003, pp. 29–43.

External links

- Google Code Archive - Long-term storage for Google Code Project Hosting.
- MapReduce
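To make the division of labor between the script and the aggregators more concrete, the following is a minimal Python sketch of the three-table Sawzall program described in this article (count, total, sum_of_squares). It is not the szl runtime and does not use any Sawzall API: the function names `run_script` and `aggregate` are hypothetical, and the in-memory merge only stands in for the unreleased MapReduce table aggregators.

```python
from collections import Counter


def run_script(record: float) -> Counter:
    """Map-phase analogue: handle ONE input record and emit to sum tables."""
    tables = Counter()
    x = float(record)                    # x: float = input;
    tables["count"] += 1                 # emit count <- 1;
    tables["total"] += x                 # emit total <- x;
    tables["sum_of_squares"] += x * x    # emit sum_of_squares <- x * x;
    return tables


def aggregate(per_record_tables) -> Counter:
    """Reduce-phase analogue: merge the tables of many runs into one set."""
    merged = Counter()
    for tables in per_record_tables:
        merged.update(tables)            # Counter.update adds values per key
    return merged


# Hypothetical input: three log records, each holding a single float.
records = [1.0, 2.0, 3.0]
result = aggregate(run_script(r) for r in records)

# The three sums are enough to derive the mean and the population variance.
mean = result["total"] / result["count"]
variance = result["sum_of_squares"] / result["count"] - mean ** 2
```

Because a sum table only needs addition, the per-record results can be merged in any order, which is what lets the script writer ignore the Reduce phase entirely.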

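The article states that Sawzall has no references or pointers, that all assignments and function arguments create copies, and that recursive data structures and cycles are therefore impossible. The Python sketch below illustrates why copy-on-assignment rules out cycles. It only models the semantics: the helper `assign` is hypothetical, and Python's `copy.deepcopy` is used here as a stand-in for Sawzall's value semantics, not as a description of how the engine is implemented.

```python
import copy


def assign(value):
    """Model a copy-on-assignment language: the target gets a deep copy."""
    return copy.deepcopy(value)


# With copies, mutating the new variable never affects the original.
a = {"hits": [1, 2, 3]}
b = assign(a)
b["hits"].append(4)

# Trying to make a structure point at itself only stores a snapshot of it,
# so no cycle can ever be formed.
node = {"next": None}
node["next"] = assign(node)
```

After this runs, `a["hits"]` is still `[1, 2, 3]`, and `node["next"]` is a separate object whose own `next` field is `None`, so the structure has finite depth.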

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.