Knowledge:AutoWikiBrowser/Database Scanner


This is the Database scanner subsection of the user manual for AutoWikiBrowser.

It is community-maintained outside of the development team.
It may contain information that is out of date with the latest AutoWikiBrowser releases.
Feel free to edit, add, or remove content to improve its comprehensibility and quality.

Chapters: Core · Database scanner · Find and replace · Regular expressions · General fixes

Start: searches the selected database dump based on the settings set in the other option boxes.
Pause
Reset

Parameters

Database

Database file: use the Browse button to specify where on your machine the database dump you have downloaded (an XML file) is located. (likely from here)

The following are automatically read from the header of the XML file specified.

Site name: Example: "Knowledge".
Base: homepage of the site. Example: "https://en.wikipedia.org/Main_Page".
Generator: software version that created the dump file. Example: "MediaWiki 1.43.0-wmf.24 (72fea51)".
Case: casing configuration of the site. Example: "first-letter".

Namespaces

Select the namespaces you want to search within. If none are selected, the search will include all available namespaces. Please note that your dump file might not contain data for every namespace available on your wiki.

Title matching

Title does contain: restrict the search to titles containing the text, or matching the text if the Regex option is used.
Title does not contain: restrict the search to titles NOT containing the text, or NOT matching the text if the Regex option is used.
Regex: see AWB Regex help.
Case sensitive: whether the text/matching pattern should be case sensitive.

Revision

Last edited date

Search date: tick to restrict the search to pages whose revision (last edited) date falls within a range.
From: start date of the range.
To: end date of the range.

Text

Text searching

Contains: %%title%%, %%key%%, %%titlename%% and %%namespace%% work if the search is not a regex.
Not contains: %%title%%, %%key%%, %%titlename%% and %%namespace%% work if the search is not a regex.
Regex: see AWB Regex help. The following options apply:
Singleline: changes the meaning of "." so it matches all characters, as opposed to all characters apart from newlines.
Case sensitive: enables case sensitivity.
Multiline: changes the meaning of "^" and "$" so they represent the beginning and end respectively of every line, rather than just of the entire string.
Ignore <!-- comments -->

Page text properties

Characters
Links
Words

Searching

AWB specific

None: lists all the pages in the database dump (that match the other scan filter criteria).
Has title AWB will embolden
Has links AWB will simplify: allows you to search a DB dump for links that can be simplified, e.g.:
Simplifies links like ] to ]
Simplifies links like ] to ]s
Has bad links AWB will fix
Has HTML entries
Section error
Unbulleted links: searches a database dump for any pages that have external links which are not bullet pointed.
Typo: allows you to search a database dump for spelling mistakes, in the same way that AWB can when RegexTypoFix is enabled.
Missing {{defaultsort}}

Other options

Start from page: starts from an entered page name. The dump is scanned until the specified page is found, then the scan continues as normal using the other search settings. Scanning until a page is found is faster than scanning using the full settings; however, the dump file up to that page still has to be read, so this will still take time (approximately 30 seconds per gigabyte of XML data, depending on your system's CPU speed).
Limit results to: limits the number of results that will be found and displayed from the database dump. If the limit is reached, the scan will stop early.

Restriction

Allows pages with edit restrictions (semi-protected, fully protected, etc.) to be searched for.

Help

Some URL links to relevant dump help pages.

Output

Performance

The speed of the database scanner mainly depends on two factors of the system it is run on:
CPU single-threaded performance
hard disk read speed

Example performance: Intel Core i5 520M mobile CPU: maximum CPU usage and ~30 MB/s disk sequential read.

So, with a reasonable 2010-era or later CPU, AWB will read the database XML dump file at around 30 MB/s and be CPU-limited. Therefore, if the database file is read from networked storage, database scan performance will be reduced if the network transfer speed is below this speed. When reading the database XML dump file from a local disk, modern mechanical hard disks can normally provide sequential read speeds well above 30 MB/s, so the database scan speed will be CPU-limited.

The database scanner is multi-threaded: it uses the main thread to read the database XML file from disk, and additional thread(s) to search the articles based on the user's search criteria, with the total number of threads equalling the number of CPU cores (e.g. a quad-core CPU without hyperthreading gives 1 main and 3 secondary threads). The main thread will pause XML reading and contribute to article searching if the secondary threads get too far behind. This happens if searching an article based on the search criteria is slower than reading the article from the XML file; typically this is the case. In the example of the Core i5 520M this does occur: database scanner performance is limited by how fast all the threads can search the articles, so overall performance is limited by the multi-threaded performance of the CPU.

A CPU with more cores, and/or better performance from each core, would improve database scanner performance.

Results

Filter: allows you to filter the results found from the DB dump. The options are the same as for the normal AWB list filter.
Save: saves the list as a text document.
Clear: clears the list of pages.

Convert

Add headings every: adds a heading every x lines.
Alphabetised headings
#: makes a list with # before each page name; if placed on a wiki page, this will number the lines.
*: makes a list with * before each page name; if placed on a wiki page, this will bullet point the lines.
A B C... headings: adds headings (== heading ==) for page names beginning with that letter.
Make: makes the list.
Copy: copies the list to the user's clipboard for pasting into another document.
Save: saves the list as a text document.
Clear: removes all pages from the page list.

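The Singleline, Multiline and Case sensitive regex options under Text searching can be illustrated with a small example. AWB uses the .NET regular expression engine; the sketch below uses Python's `re` module instead, on the assumption that its `DOTALL`, `MULTILINE` and `IGNORECASE` flags behave analogously for these simple patterns. The sample text and variable names are made up for illustration.

```python
import re

text = "First line\nsecond LINE"

# Singleline analogue (re.DOTALL): "." also matches newline characters.
dot_default = re.search(r"First.*second", text) is not None
dot_singleline = re.search(r"First.*second", text, re.DOTALL) is not None

# Multiline analogue (re.MULTILINE): "^" and "$" match at every line
# boundary rather than only at the start and end of the whole string.
starts_default = re.findall(r"^\w+", text)
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Case sensitivity: without IGNORECASE, "line" does not match "LINE".
case_sensitive = re.search(r"line$", text) is not None
case_insensitive = re.search(r"line$", text, re.IGNORECASE) is not None

if __name__ == "__main__":
    print(dot_default, dot_singleline)
    print(starts_default, starts_multiline)
    print(case_sensitive, case_insensitive)
```

When a search pattern unexpectedly fails to match across line breaks in article text, the Singleline and Multiline options are usually the first things to check.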

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
