Z-variant

This article contains special characters. Without proper rendering support, you may see question marks, boxes, or other symbols.

In Unicode, two glyphs are said to be Z-variants (often spelled zVariants) if they share the same etymology but have slightly different appearances and different Unicode code points. For example, the Unicode characters U+8AAA 說 and U+8AAC 説 are Z-variants. The notion of Z-variance applies only to the "CJKV scripts" (Chinese, Japanese, Korean and Vietnamese) and is a subtopic of Han unification.

Differences on the Z-axis

The Unicode philosophy of code point allocation for CJK languages is organized along three "axes." The X-axis represents differences in semantics; for example, the Latin capital A (U+0041 A) and the Greek capital alpha (U+0391 Α) are represented by two distinct code points in Unicode, and might be termed "X-variants" (though this term is not common). The Y-axis represents significant differences in appearance though not in semantics; for example, the traditional Chinese character māo "cat" (U+8C93 貓) and the simplified Chinese character (U+732B 猫) are Y-variants.

The Z-axis represents minor typographical differences. For example, the Chinese characters (U+838A 莊) and (U+8358 荘) are Z-variants, as are (U+8AAA 說) and (U+8AAC 説). The glossary at Unicode.org defines "Z-variant" as "Two CJK unified ideographs with identical semantics and unifiable shapes," where "unifiable" is taken in the sense of Han unification.

Thus, were Han unification perfectly successful, Z-variants would not exist. They exist in Unicode because it was deemed useful to be able to "round-trip" documents between Unicode and other CJK encodings such as Big5 and CCCII. For example, the character 莊 has CCCII encoding 21552D, while its Z-variant 荘 has CCCII encoding 2D552D. Therefore, these two variants were given distinct Unicode code points, so that converting a CCCII document to Unicode and back would be a lossless operation.

Confusion

There is some confusion over the exact definition of "Z-variant." For example, in an Internet Draft (of RFC 3743) dated 2002, one finds bù "no" (U+4E0D 不) and (U+F967 不︀) described as "font variants," the term "Z-variant" being apparently reserved for interlanguage pairs such as the Mandarin Chinese tù "rabbit" (U+5154 兔) and the Japanese to "rabbit" (U+514E 兎). However, the Unicode Consortium's Unihan database treats both pairs as Z-variants.
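The round-trip rationale for giving 莊 and 荘 separate code points can be sketched in a few lines of Python. This is only a toy illustration: the two-entry tables are built by hand from the CCCII codes quoted in this article, the "unified" tables describe a hypothetical allocation that Unicode did not adopt, and the names `round_trip`, `CCCII_TO_UNICODE` and so on are invented for this example rather than taken from any real codec.

```python
# Toy tables built from the two CCCII codes quoted in the article.
# This is NOT a real CCCII codec; it only illustrates the round-trip argument.
CCCII_TO_UNICODE = {
    "21552D": "\u838A",  # 莊
    "2D552D": "\u8358",  # 荘, the Z-variant of 莊
}
UNICODE_TO_CCCII = {char: code for code, char in CCCII_TO_UNICODE.items()}

# Hypothetical alternative (an assumption for illustration): both CCCII
# codes are unified onto the single code point U+838A.
UNIFIED_TO_UNICODE = {"21552D": "\u838A", "2D552D": "\u838A"}
UNIFIED_TO_CCCII = {"\u838A": "21552D"}


def round_trip(codes, to_unicode, to_cccii):
    """Convert a list of CCCII codes to Unicode text, then back to CCCII."""
    text = "".join(to_unicode[code] for code in codes)
    return [to_cccii[char] for char in text]


document = ["21552D", "2D552D"]

# With distinct code points, the document survives the round trip unchanged.
lossless = round_trip(document, CCCII_TO_UNICODE, UNICODE_TO_CCCII)

# With a unified code point, the second code comes back as the first.
lossy = round_trip(document, UNIFIED_TO_UNICODE, UNIFIED_TO_CCCII)

print(lossless == document)  # True
print(lossy == document)     # False
```

In the hypothetical unified table, the distinction between the two CCCII codes cannot be recovered from the Unicode text, which is the loss that separate code points for Z-variants avoid.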
See also

Backward compatibility

References

"Glossary". www.unicode.org.
Huang, K.; Ko, Y.; Konishi, K.; Qian, H. (April 2004). "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean". tools.ietf.org.
"Unihan Database Lookup". www.unicode.org.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.