
Watchdog timer


For example, the block diagram below shows a three-stage watchdog. In a multistage watchdog, only the first stage is kicked by the processor. Upon first stage timeout, a corrective action is initiated and the next stage in the cascade is started. As each subsequent stage times out, it triggers a corrective action and starts the next stage. Upon final stage timeout, a corrective action is initiated, but no other stage is started because the end of the cascade has been reached. Typically, single-stage watchdog timers are used to simply restart the computer, whereas multistage watchdog timers will sequentially trigger a series of corrective actions, with the final stage triggering a computer restart.

These corrective actions may also be combined. Depending on its architecture, the type of corrective action or actions that a watchdog can trigger may be fixed or programmable. Some computers (e.g., PC compatibles) require a pulsed signal to invoke a hardware reset. In such cases, the watchdog typically triggers a hardware reset by activating an internal or external pulse generator, which in turn creates the required reset pulses.
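The cascade described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the class names `Stage` and `MultistageWatchdog`, the use of simulated time, and the rule that kicks are ignored once the first stage has timed out are all hypothetical choices, not taken from any particular watchdog device.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One timer stage: a timeout interval and the action taken on timeout."""
    name: str
    interval: float
    action: Callable[[], None]


@dataclass
class MultistageWatchdog:
    """Hypothetical cascade of timer stages driven by simulated time."""
    stages: List[Stage]
    active: int = 0          # index of the stage currently counting
    elapsed: float = 0.0     # time counted so far by the active stage
    finished: bool = False   # True once the final stage has timed out

    def kick(self) -> None:
        """Restart the first stage; later stages cannot be kicked (assumption)."""
        if self.active == 0 and not self.finished:
            self.elapsed = 0.0

    def advance(self, dt: float) -> None:
        """Advance simulated time, firing each stage as it times out."""
        self.elapsed += dt
        while not self.finished and self.elapsed >= self.stages[self.active].interval:
            stage = self.stages[self.active]
            self.elapsed -= stage.interval
            stage.action()                      # corrective action for this stage
            if self.active + 1 < len(self.stages):
                self.active += 1                # start the next stage in the cascade
            else:
                self.finished = True            # end of the cascade reached


log: List[str] = []
wd = MultistageWatchdog([
    Stage("stage1", 2.0, lambda: log.append("interrupt")),
    Stage("stage2", 1.0, lambda: log.append("fail-safe")),
    Stage("stage3", 1.0, lambda: log.append("reset")),
])
wd.advance(1.5)
wd.kick()          # kicked in time, so stage1 restarts
wd.advance(1.5)    # still below the stage1 interval, nothing fires
wd.advance(5.0)    # no further kicks: the whole cascade runs
print(log)
```

Each stage has its own interval, and only the final stage's action would correspond to a restart in a real system.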
In effect, such an arrangement is a multistage watchdog timer in which the software comprises the first and intermediate timer stages and the hardware reset the final stage. In a Linux system, for example, the watchdog daemon could attempt to perform a software-initiated restart, which can be preferable to a hardware reset as the file systems will be safely unmounted.
This delay allows time for the computer to boot before the watchdog is enabled. Without this delay, the watchdog would time out and invoke a subsequent reset before the computer can run its application software (the software which kicks the watchdog), and the system would become stuck in an endless cycle of incomplete reboots.
The Stage1 timeout event will start the Stage2 timer and, simultaneously, notify the computer (by means of a non-maskable interrupt) that a reset is imminent. Until Stage2 times out, the computer may attempt to record state information, debug information, or both. As a last resort, the computer will be reset upon Stage2 timeout.
A watchdog timer (or computer operating properly (COP) timer) is a computer hardware or software timer that triggers a system reset or other corrective action if the main program, due to some fault condition such as a hang, neglects to regularly service the watchdog by writing a "service pulse" to it.
For example, the above diagram shows a likely configuration for a two-stage watchdog timer. During normal operation the computer regularly kicks Stage1 to prevent a timeout. If the computer fails to kick Stage1 (e.g., due to a hardware fault or programming error), Stage1 will eventually time out.
When automatically generated, the enabling signal is typically derived from the computer reset signal. In some systems the reset signal is directly used to enable the watchdog. In others, the reset signal is delayed so that the watchdog will become enabled at some later time following the reset.
A watchdog timer provides automatic detection of catastrophic malfunctions that prevent the computer from kicking it. However, computers often have other, less-severe types of faults which do not interfere with kicking, but which still require watchdog oversight.
When a watchdog is used to trigger data recording, a second timer (which is started when the first timer elapses) is typically used to reset the computer later, after allowing sufficient time for data recording to complete. This allows time for the information to be saved, but ensures that the computer will be reset even if the recording process fails.
Watchdog timers may have either fixed or programmable time intervals. Some watchdog timers allow the time interval to be programmed by selecting from a few discrete values. In others, the interval can be programmed to arbitrary values.
Upon discovery of a failed test, the computer may attempt to perform a sequence of corrective actions under software control, culminating with a software-initiated reboot. If the software fails to invoke a reboot, the watchdog timer will time out and invoke a hardware reset.
Some watchdog timers will only allow kicks during a specific time window. The window timing is usually relative to the previous kick or, if the watchdog has not yet been kicked, to the moment the watchdog was enabled. The window begins after a delay following the previous kick, and ends after a further delay.
During normal operation, the computer regularly restarts the watchdog timer to prevent it from elapsing, or "timing out". If, due to a hardware fault or program error, the computer fails to restart the watchdog, the timer will elapse and generate a timeout signal.
On computers that are running an operating system and multiple processes, a single, simple test might be insufficient to guarantee normal operation, as it could fail to detect a subtle fault condition and therefore allow the watchdog to be kicked even though a fault condition exists.

Forcing control outputs to safe states prevents injuries and equipment damage while the fault persists. In a two-stage watchdog, the first timer is often used to activate fail-safe outputs and start the second timer stage; the second stage will reset the computer if the fault cannot be corrected before the timer elapses.
To cover such faults, a computer system is typically designed so that its watchdog timer will be kicked only if the computer deems the system functional. The computer determines whether the system is functional by conducting one or more fault detection tests and will kick the watchdog only if all tests have passed.
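The "kick only if all tests pass" policy can be sketched in a few lines. The function name `kick_if_healthy` and the two example tests are hypothetical, chosen only to illustrate the gating; a real system would substitute its own fault detection tests and its own kick mechanism.

```python
from typing import Callable, Iterable, List


def kick_if_healthy(tests: Iterable[Callable[[], bool]],
                    kick: Callable[[], None]) -> bool:
    """Kick the watchdog only if every fault detection test passes."""
    for test in tests:
        if not test():
            # Withhold the kick; if the fault persists, the watchdog times out.
            return False
    kick()
    return True


kicks: List[str] = []


def memory_ok() -> bool:      # hypothetical test that passes
    return True


def sensor_ok() -> bool:      # hypothetical test that fails
    return False


healthy = kick_if_healthy([memory_ok], lambda: kicks.append("kick"))
faulty = kick_if_healthy([memory_ok, sensor_ok], lambda: kicks.append("kick"))
print(healthy, faulty, kicks)
```

Withholding the kick, rather than resetting directly, leaves the corrective action to the watchdog itself.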
A watchdog timer is said to be enabled when operating and disabled when idle. Upon power-up, a watchdog may be unconditionally enabled or it may be initially disabled and require an external signal to enable it. In the latter case, the enabling signal may be automatically generated by hardware or it may be generated under software control.
Such tests can also cover reasonable CPU time, evidence of expected process activity (e.g., system daemons running, specific files being present or updated), overheating, and network activity. System-specific test scripts or programs can also be run.
In robots and other automated machines, a fault in the control computer could cause equipment damage or injuries before a human could react, even if the computer is easily accessed. A watchdog timer is usually employed in cases like these.
Watchdog timers are commonly found in embedded systems and other computer-controlled equipment where humans cannot easily access the equipment or would be unable to react to faults in a timely manner. In such systems, the computer cannot depend on a human to invoke a reboot if it hangs; it must be self-reliant.
If the computer attempts to kick the watchdog before or after the window, the watchdog will not be restarted, and in some implementations this will be treated as a fault and trigger corrective action.
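A windowed watchdog of this kind can be modelled as follows. This is a sketch under assumptions: the class name `WindowedWatchdog`, the explicit `now` timestamps, and the strict variant in which an out-of-window kick latches a fault are all illustrative choices rather than the behaviour of a specific device.

```python
class WindowedWatchdog:
    """Hypothetical windowed watchdog using caller-supplied timestamps."""

    def __init__(self, window_open: float, window_close: float) -> None:
        if not 0 <= window_open < window_close:
            raise ValueError("window must satisfy 0 <= open < close")
        self.window_open = window_open    # delay after the previous kick
        self.window_close = window_close  # end of the window, same reference
        self.last_restart = 0.0           # enable time or last accepted kick
        self.faulted = False

    def kick(self, now: float) -> bool:
        """Accept the kick only inside the window; otherwise latch a fault."""
        since = now - self.last_restart
        in_window = self.window_open <= since <= self.window_close
        if self.faulted or not in_window:
            self.faulted = True           # strict variant: treated as a fault
            return False
        self.last_restart = now           # timer restarted, window re-based
        return True


wd = WindowedWatchdog(window_open=1.0, window_close=2.0)
first = wd.kick(1.5)     # 1.5 s after enable: inside the window
early = wd.kick(2.0)     # only 0.5 s after the last kick: too early
print(first, early, wd.faulted)
```

The window is measured from the moment of enabling until the first accepted kick, and from the previous accepted kick afterwards.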
Servicing the watchdog is also referred to as "kicking the dog", "petting the dog", "feeding the watchdog" or "triggering the watchdog". The intention is to bring the system back from the nonresponsive state into normal operation.
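A watchdog timer (WDT, or simply a watchdog), sometimes called a computer operating properly timer (COP timer), is an electronic or software timer that is used to detect and recover from computer malfunctions. Watchdog timers are widely used in computers to facilitate automatic correction of temporary hardware faults, and to prevent errant or malevolent software from disrupting system operation.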
In real-time operating systems, a watchdog timer may be used to monitor a time-critical task to ensure it completes within its maximum allotted time and, if it fails to do so, to terminate the task and report the failure.
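The idea of bounding a task's execution time can be illustrated with a software-only sketch. It assumes cooperative `asyncio` tasks, which is far weaker than the preemptive enforcement a real-time operating system would provide; the names `run_with_deadline`, `fast_task` and `slow_task` are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_deadline(task_factory: Callable[[], Awaitable[None]],
                            deadline_s: float) -> str:
    """Run a task; cancel it and report failure if it overruns its deadline."""
    try:
        await asyncio.wait_for(task_factory(), timeout=deadline_s)
        return "completed"
    except asyncio.TimeoutError:
        # wait_for has already cancelled the overrunning task at this point.
        return "deadline missed"


async def fast_task() -> None:
    await asyncio.sleep(0.01)   # finishes well inside its allotted time


async def slow_task() -> None:
    await asyncio.sleep(1.0)    # overruns its allotted time


fast_result = asyncio.run(run_with_deadline(fast_task, 0.5))
slow_result = asyncio.run(run_with_deadline(slow_task, 0.05))
print(fast_result, slow_result)
```

Cancellation here only takes effect at an `await` point, so a task stuck in a tight loop would not be stopped; that case is where a hardware timer is needed.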
With a software-initiated restart, fault information will also be logged. It is essential, however, to have the insurance provided by a hardware timer, since a software restart can fail under a number of fault conditions.
Watchdog timers are also used to monitor and limit software execution time on a normally functioning computer. For example, a watchdog timer may be used when running untrusted code in a sandbox.
Alternatively, the watchdog and CPU may have independent clock signals. A basic watchdog timer has a single timer stage which, upon timeout, typically will reset the CPU.
For example, in the case of the Linux operating system, a user-space watchdog daemon may simply kick the watchdog periodically without performing any tests. As long as the daemon runs normally, the system will be protected against serious system crashes such as a kernel panic.
The device driver, which serves to abstract the watchdog hardware from user space programs, may also be used to configure the time-out period and start and stop the timer.
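As a rough illustration of kicking through a device driver, the sketch below writes a zero byte to a watchdog device node. It assumes a Linux-style node at `/dev/watchdog`; the helper names `open_watchdog` and `kick_watchdog` are hypothetical. On real hardware, opening the device typically starts the timer, so the demonstration runs against a scratch file instead.

```python
import os
import tempfile


def open_watchdog(path: str = "/dev/watchdog") -> int:
    """Open the watchdog device node for writing and return its descriptor."""
    return os.open(path, os.O_WRONLY)


def kick_watchdog(fd: int) -> None:
    """Kick the watchdog by writing a single zero byte to the device."""
    os.write(fd, b"\0")


# Demonstration against a scratch file, so no watchdog hardware is touched.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch_path = scratch.name

fd = open_watchdog(scratch_path)
kick_watchdog(fd)
kick_watchdog(fd)
os.close(fd)

with open(scratch_path, "rb") as f:
    written = f.read()
os.remove(scratch_path)
print(written)
```

A real daemon would call `kick_watchdog` periodically, at an interval comfortably shorter than the configured time-out period.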
The timeout signal is used to initiate corrective actions. The corrective actions typically include placing the computer and associated hardware in a safe state and invoking a computer reboot.
Some watchdog timers only allow kicks during a time window. Kicks occurring outside the window have no effect on the timer and may be treated as faults.
For example, remote embedded systems such as space probes are not physically accessible to human operators; these could become permanently disabled if they were unable to autonomously recover from faults.
Watchdog timers come in many configurations, and many allow their configurations to be altered. For example, the watchdog and CPU may share a common clock signal, as shown in the block diagram below.
A watchdog timer integrated circuit (Texas Instruments TPS3823). One pin receives the timer restart ("kick") signal from the computer; another pin outputs the timeout signal.
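Microcontrollers often include an integrated, on-chip watchdog. In other computers the watchdog may reside in a nearby chip that connects directly to the CPU, or it may be located on an external expansion card in the computer's chassis.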
When activated, the fail-safe circuitry forces all control outputs to safe states (e.g., turns off motors, heaters, and high voltages).
Watchdog timers are sometimes used to trigger the recording of system state information, which may be useful during fault recovery.
To detect less severe faults, the daemon can be configured to perform tests that cover resource availability (e.g., sufficient memory and file handles).
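In computers that are running operating systems, watchdog restarts are usually invoked through a device driver. For example, in the Linux operating system, a user space program will kick the watchdog by interacting with the watchdog device driver, typically by writing a zero character to /dev/watchdog or by calling a KEEPALIVE ioctl.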
An example of this is the CLRWDT (clear watchdog timer) instruction found in the instruction set of some PIC microcontrollers.
Typically, watchdog time intervals range from ten milliseconds to a minute or more. In a multistage watchdog, each timer may have its own, unique time interval.
A tightly coupled watchdog timer is effectively a built-in extension of the processor and, as such, may be accessed by special machine language instructions which are specific to it.
In embedded systems and control systems, watchdog timers are often used to activate fail-safe circuitry.
A watchdog timer may initiate any of several types of corrective action, including maskable interrupt, non-maskable interrupt, hardware reset, fail-safe state activation, and power cycling.
Watchdog timers may also trigger the recording of debug information (which may be useful for determining the cause of the fault) onto a persistent medium.
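Various terms are used for the act of restarting a watchdog timer. Some (e.g. kick, pet, feed, tickle) draw a connection to guard dogs, whereas others (e.g. tag, ping, reset) do not. This article uses kick for consistency.

Limiting the CPU time available to untrusted code with a watchdog timer can prevent some types of denial-of-service attacks.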
Kicking is typically done by writing to a watchdog control port or by setting a particular bit in a register. Alternatively, some tightly coupled watchdog timers are kicked by executing a special machine language instruction.
Watchdog timers are essential in remote, automated systems such as this Mars Exploration Rover.
Electronic timer used to detect and recover from computer malfunctions
474: 435: 328: 317: 19: 350: 214: 56: 155:
The act of restarting a watchdog timer is commonly referred to as kicking the watchdog.
In computers that are running an operating system and
Two or more timers are sometimes cascaded to form a multistage watchdog timer, where each timer is referred to as a timer stage, or simply a stage.

See also

Command Loss Timer Reset, a related method to keep a spacecraft commandable
Safe mode (spacecraft)
Deadman timer (DMT)
Power-up timer (PWRT)
Heartbeat (computing)
Keepalive


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
