Source–filter model

The source–filter model represents speech as a combination of a sound source, such as the vocal cords, and a linear acoustic filter, the vocal tract. While only an approximation, the model is widely used in a number of applications, such as speech synthesis and speech analysis, because of its relative simplicity. It is also related to linear prediction. The development of the model is due, in large part, to the early work of Gunnar Fant, although others, notably Ken Stevens, have also contributed substantially to the models underlying acoustic analysis of speech and speech synthesis. Fant built on the work of Tsutomu Chiba and Masato Kajiyama, who first showed the relationship between a vowel's acoustic properties and the shape of the vocal tract.

An important assumption that is often made in the use of the source–filter model is the independence of source and filter. In such cases, the model should more accurately be referred to as the "independent source–filter model".

History

In 1942, Chiba and Kajiyama published their research on vowel acoustics and the vocal tract in their book, The Vowel: Its Nature and Structure. By creating models of the vocal tract using X-ray photography, they were able to predict the formant frequencies of different vowels, establishing a relationship between the two. Gunnar Fant, a pioneering speech scientist, used Chiba and Kajiyama's research involving X-ray photography of the vocal tract to interpret his own data of Russian speech sounds in Acoustic Theory of Speech Production, which established the source–filter model.

Applications

To varying degrees, different phonemes can be distinguished by the properties of their source(s) and their spectral shape. Voiced sounds (e.g., vowels) have at least one source, due to mostly periodic glottal excitation, which can be approximated by an impulse train in the time domain and by harmonics in the frequency domain, and a filter that depends on, for example, tongue position and lip protrusion. Fricatives, on the other hand, have at least one source, due to turbulent noise produced at a constriction in the oral cavity or pharynx. So-called voiced fricatives have two sources: one at the glottis and one at the supra-glottal constriction.

Speech synthesis

Main article: Speech synthesis

In implementations of the source–filter model of speech production, the sound source, or excitation signal, is often modelled as a periodic impulse train for voiced speech, or as white noise for unvoiced speech. The vocal tract filter is, in the simplest case, approximated by an all-pole filter, where the coefficients are obtained by performing linear prediction to minimize the mean-squared error in the speech signal to be reproduced. Convolution of the excitation signal with the filter response then produces the synthesised speech.
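
As a rough illustration of this pipeline, the sketch below (using NumPy and SciPy) estimates an all-pole filter from a synthetic, vowel-like frame by linear prediction and then drives it with either a periodic impulse train or white noise. The sample rate, fundamental frequency, LPC order, formant values, and all function and variable names are illustrative assumptions rather than values prescribed by the model.

    # Minimal source-filter synthesis sketch: an all-pole filter estimated by
    # linear prediction (autocorrelation method), excited by an impulse train
    # or by white noise. All numeric parameters below are illustrative choices.
    import numpy as np
    from scipy.signal import lfilter

    def lpc(x, order):
        """Coefficients of A(z) = 1 + a1*z^-1 + ... + ap*z^-p minimising the
        mean-squared prediction error of x (Levinson-Durbin recursion)."""
        n = len(x)
        r = np.array([np.dot(x[:n - i], x[i:]) for i in range(order + 1)])
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0]
        for i in range(1, order + 1):
            k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
            a_prev = a.copy()
            for j in range(1, i):
                a[j] = a_prev[j] + k * a_prev[i - j]
            a[i] = k
            err *= 1.0 - k * k
        return a, err

    fs, f0, order, n = 8000, 100, 10, 2048   # sample rate (Hz), pitch (Hz), LPC order, frame length
    rng = np.random.default_rng(0)

    # Stand-in for a recorded vowel frame: white noise shaped by two resonances.
    a_true = np.array([1.0])
    for fc, bw in [(700, 130), (1200, 70)]:  # formant frequency and bandwidth (Hz)
        pole_r = np.exp(-np.pi * bw / fs)
        a_true = np.convolve(a_true, [1.0, -2 * pole_r * np.cos(2 * np.pi * fc / fs), pole_r ** 2])
    frame = lfilter([1.0], a_true, rng.standard_normal(n)) * np.hamming(n)

    # Linear prediction gives the all-pole vocal-tract filter 1/A(z) and a gain.
    a, err = lpc(frame, order)
    gain = np.sqrt(err / n)

    # Voiced excitation: periodic impulse train at f0; unvoiced excitation: white noise.
    voiced = np.zeros(n)
    voiced[::fs // f0] = 1.0
    unvoiced = rng.standard_normal(n)

    # Filtering the excitation through 1/A(z), i.e. convolving it with the
    # filter's impulse response, produces the synthesised output.
    vowel_like = lfilter([gain], a, voiced)
    fricative_like = lfilter([gain], a, unvoiced)

In practice the analysis is typically run on short, overlapping frames of recorded speech, with the excitation switched between the impulse train and noise according to a per-frame voiced/unvoiced decision.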

Modeling human speech production

Main article: Articulatory phonetics

[Figure: One possible combination of source and filter in the human vocal tract.]

In human speech production, the sound source is the vocal folds, which can produce a periodic sound when constricted or an aperiodic (white noise) sound when relaxed. The filter is the rest of the vocal tract, which can change shape through manipulation of the pharynx, mouth, and nasal cavity. Fant roughly compares the source and filter to phonation and articulation, respectively. The source produces a number of harmonics of varying amplitudes, which travel through the vocal tract and are either amplified or attenuated to produce a speech sound.

See also

Inverse filter

References

Arai, Takayuki (2004). "History of Chiba and Kajiyama and their influence in modern speech science". From Sound to Sense: 50+ Years of Discoveries in Speech Communication. pp. 115–120.
Chiba, T.; Kajiyama, M. (1942). The Vowel: Its Nature and Structure. Tokyo: Tokyo-Kaiseikan Pub. Co., Ltd. (Reprinted in 1952; Japanese translated edition in 2003, ISBN 4-00-002107-9.)
Fant, Gunnar (1970). Acoustic Theory of Speech Production with Calculations Based on X-ray Studies of Russian Articulations. De Gruyter.
Fant, Gunnar (2001). "T. Chiba and M. Kajiyama, Pioneers in Speech Acoustics". Journal of the Phonetic Society of Japan 5 (2). doi:10.24467/onseikenkyu.5.2_4. Retrieved 3 July 2020.
Stevens, K. N. (1998). Acoustic Phonetics. Cambridge, MA: MIT Press. ISBN 978-0-262-19404-4 (hardcover in 1999 / paperback in 2000).
Stevens, K. N. (2001). "The Chiba and Kajiyama book as a precursor to the acoustic theory of speech production". Journal of the Phonetic Society of Japan 5 (2): 6–7.
Zsiga, Elizabeth C. (2012). The Sounds of Language: An Introduction to Phonetics and Phonology. John Wiley & Sons. ISBN 978-1-118-34060-8.