The Chinese Character Wiki is a free and open-source dictionary of Chinese characters, including stroke orders, pronunciations, definitions, examples, origins, and component breakdowns.
The dictionary currently contains manually verified information about:
- The 1000 most common characters in movie subtitles
- The 1000 most common characters in books
- Characters from HSK 1-4
- Characters from the first 1000 Dong Chinese difficulty levels (a balanced list based on graded readers, frequency, and standardized tests)
- All the components of the characters in the above lists
Most Chinese characters are built from a combination of components. This wiki sorts components into the following eight categories:
- Meaning component
- Sound component
- Iconic component
- Remnant component
- Simplified component
- Deleted component
- Distinguishing component
- Unknown component
These categories are different from the traditional 六书 (six categories) system. Many characters do not fit neatly into the traditional six categories.
Note that components are different from radicals. Radicals are traditionally used for organizing Chinese dictionaries, but are not always useful for understanding how characters are actually built.
A meaning or semantic component hints at the meaning of the character.
Meaning components are color-coded as red.
Historical shifts in meaning
For example, the character 错 originally meant to decorate something by inlaying it with gold or silver, which is why it contains the 金 (metal) component. Later this character expanded to include other meanings:
- interlocking pattern
- stagger / crossing
- complex / chaotic
- incorrect / mistake
- bad / wrong
(orig.) inlay with gold
A sound or phonetic component hints at how the character is pronounced.
Sound components are color-coded as blue.
Historical sound changes
Sometimes a character does not sound similar to its sound component. Most Chinese characters were invented thousands of years ago, and the spoken language has changed a great deal since then. For that reason, the sound components of some characters are leftovers from Old Chinese pronunciation and do not reflect modern pronunciation.
For example, in Old Chinese, 他 was pronounced /*l̥ʰaːl/ and 也 was pronounced /*laːlʔ/, so 也 was used as a sound component in 他. These two characters no longer sound similar.
Audio courtesy of the AllSet Learning Chinese Pronunciation Wiki, used with permission.
An iconic or form component is a direct visual representation of an object or idea (also known as a pictograph or ideograph).
Iconic components are color-coded as green.
A remnant component is a component that is derived from a part of another character.
For example, the character 孝 (filial piety) is built from a remnant of 老 (old) combined with 子 (child).
Remnant components are color-coded as chartreuse.
A simplified component is a component that was changed during character simplification to reduce the number of strokes.
Simplified components are color-coded as teal.
A deleted component is a component that was removed during character simplification to reduce the number of strokes.
A distinguishing component is a component that was added to distinguish one character from another character.
For example, the characters 王 (king) and 玉 (jade) were written similarly in seal script, so a dot was added to 玉 to distinguish them.
Distinguishing components are color-coded as purple.
An unknown component is a component whose purpose is unclear. Unfortunately, not all Chinese characters have a clear explanation.
For example, nobody really knows for certain what the top component of 是 was originally supposed to represent.
Unknown components are color-coded as gray.
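The eight categories and their color codes can be summarized as a small lookup table. The following is only an illustrative sketch: the names `ComponentType`, `COMPONENT_COLORS`, and `color_for` are hypothetical and not part of the wiki's software, and it assumes that green belongs to iconic components and purple to distinguishing components, as the order of the sections suggests. No color is listed for deleted components, so that entry is left empty.

```python
from enum import Enum
from typing import Dict, Optional


class ComponentType(Enum):
    """The eight component categories described on this page."""
    MEANING = "meaning"
    SOUND = "sound"
    ICONIC = "iconic"
    REMNANT = "remnant"
    SIMPLIFIED = "simplified"
    DELETED = "deleted"
    DISTINGUISHING = "distinguishing"
    UNKNOWN = "unknown"


# Color names as given on this page; None where the page lists no color.
COMPONENT_COLORS: Dict[ComponentType, Optional[str]] = {
    ComponentType.MEANING: "red",
    ComponentType.SOUND: "blue",
    ComponentType.ICONIC: "green",           # assumed, see lead-in
    ComponentType.REMNANT: "chartreuse",
    ComponentType.SIMPLIFIED: "teal",
    ComponentType.DELETED: None,             # no color stated
    ComponentType.DISTINGUISHING: "purple",  # assumed, see lead-in
    ComponentType.UNKNOWN: "gray",
}


def color_for(component_type: ComponentType) -> Optional[str]:
    """Return the display color for a component category, if one is listed."""
    return COMPONENT_COLORS[component_type]


# Two component roles taken from the examples on this page:
# 金 is a meaning component in 错, and 也 is a sound component in 他.
examples = [
    ("错", "金", ComponentType.MEANING),
    ("他", "也", ComponentType.SOUND),
]

for character, component, role in examples:
    print(f"{component} in {character}: {role.value} ({color_for(role)})")
```

Using an enumeration rather than bare strings keeps the set of categories closed, so a misspelled category name fails immediately instead of silently producing an uncolored component.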
Sources of information
It is difficult to find reliable information about the origins of Chinese characters. Misinformation about Chinese characters is unfortunately very common, even from Chinese teachers, and it can be frustrating to wade through all of the conflicting information out there.
This is an update to the traditional Shuowen dictionary, with insights from modern analysis of recently discovered oracle bone fragments that were unknown to ancient lexicographers.
- Outlier Dictionary of Chinese Characters
The authors of this dictionary are academic experts in Chinese paleography and have in-depth knowledge about the history of Chinese characters.
Usually pretty good
- 漢語多功能字庫 (Multi-function Chinese Character Database)
Free online dictionary provided by the University of Hong Kong, with explanations of character origins.
Dictionary of character origins from mainland China scholarship.
Useful for specific purposes
- Chinese Text Project
Free online database of ancient Chinese texts, useful for finding out how characters have been used historically, and finding references to more obscure characters.
- 小學堂 - Academia Sinica
Free online database of historical character forms.
Unreliable but occasionally useful
This is the traditional character dictionary that scholars have relied on for nearly two thousand years. The information is often inaccurate, but it does provide valuable insight into how characters were written and understood at that point in history.
Wiktionary usually works decently for looking up the meaning of characters or historical/dialectal pronunciations, but is not always useful for finding out character origins.
The character builder is a tool for generating stroke data for obscure characters by combining strokes from other characters.
For example, if the database didn't already have stroke data for 犸, you could generate it from the first three strokes of 狼 and the last three strokes of 妈. If it doesn't line up quite right, you can move and stretch the components.
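The idea behind the 犸 example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the wiki's actual implementation: it assumes each stroke is stored as a list of (x, y) points, the helper names `transform` and `build_character` are hypothetical, and the stroke coordinates are placeholders rather than real stroke data.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # assumed representation: a stroke is a polyline of points


def transform(strokes: List[Stroke], dx: float = 0.0, dy: float = 0.0,
              sx: float = 1.0, sy: float = 1.0) -> List[Stroke]:
    """Stretch strokes by (sx, sy) about the origin, then move them by (dx, dy)."""
    return [[(x * sx + dx, y * sy + dy) for x, y in stroke] for stroke in strokes]


def build_character(parts: List[List[Stroke]]) -> List[Stroke]:
    """Concatenate groups of strokes, keeping them in writing order."""
    result: List[Stroke] = []
    for strokes in parts:
        result.extend(strokes)
    return result


# Placeholder stroke data: 狼 has 10 strokes and 妈 has 6 strokes.
# Real data would hold the actual outline of each stroke.
lang_strokes = [[(float(i), 0.0), (float(i), 1.0)] for i in range(10)]
ma_strokes = [[(float(i), 0.0), (float(i), 1.0)] for i in range(6)]

# 犸 = first three strokes of 狼 + last three strokes of 妈,
# with the second part stretched and moved so that it lines up.
new_strokes = build_character([
    lang_strokes[:3],
    transform(ma_strokes[-3:], dx=1.0, sx=2.0),
])

print(len(new_strokes))
```

Keeping the move-and-stretch step as a separate function means each borrowed component can be adjusted independently before the parts are joined.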
Characters marked with a green checkmark in the list are verified, which means a human has manually checked that the information is correct.
Pages for characters that have not been manually verified yet will show a warning message at the top to indicate that the information may not be reliable.
If you see a mistake in the dictionary or want to help add more data, feel free to suggest edits or post on the talk page for a character.
Your edits must be approved before they show up. If you have a track record of positive contributions, you will gain permission to edit without approval and to approve or reject edits from other people.
Dumps of the dictionary data are generated every month and are free to download.