This tool makes learning to read Chinese easier by automatically marking up the words in a simplified Chinese text with their pronunciations and dictionary definitions. You can type or paste in GB-encoded text or the address of a Chinese web page. You have several choices of how the text will be annotated:
You can add your own definitions in this format:
Chinese [pinyin] /English definition/
That is, the Chinese (no internal spaces), followed by one space (not a wide Chinese space), followed by the pinyin surrounded by square brackets (with a space between each pinyin syllable), followed by another space, followed by the English definition or explanation surrounded by slashes (this is the CEDICT format). Enter one word or definition per line.
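As an illustration of the format described above, here is a minimal Python sketch that checks and splits one such line. The function name parse_entry and the pattern ENTRY_RE are made up for this example, and the sample entries are illustrative; this is not the tool's own parser.

```python
import re

# One entry: word (no spaces), one ASCII space, [pinyin], one ASCII space, /definition/
# \S does not match the wide Chinese space (U+3000), so such lines are rejected.
ENTRY_RE = re.compile(r"^(\S+) \[([^\]]*)\] /(.*)/$")

def parse_entry(line):
    """Return (word, pinyin_syllables, definitions), or None if the line is malformed."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    word, pinyin, body = match.groups()
    # Pinyin syllables are separated by single spaces; definitions by slashes.
    return word, pinyin.split(" "), body.split("/")

entry = parse_entry("中国 [Zhong1 guo2] /China/")
```

A line that uses a wide Chinese space after the word, or that lacks the brackets or slashes, makes parse_entry return None, which is a simple way to catch formatting mistakes before submitting an entry.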
Users can use this to override the CEDICT definitions in the other modes if the definitions or romanizations have mistakes.
With "Add to Margins", a word appears in bold the first time it occurs, and its definition appears roughly to its right. The tool currently tries to match words to paragraphs, so be sure to leave at least one blank line between paragraphs.
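To show why the blank lines matter, here is a rough Python sketch of splitting text into paragraphs and finding the paragraph in which each word first occurs. The names split_paragraphs and first_occurrences are hypothetical, and the tool's actual matching logic is not documented here, so treat this only as an assumption about the general idea.

```python
import re

def split_paragraphs(text):
    """Split text on one or more blank lines, dropping empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def first_occurrences(paragraphs, words):
    """Map each word to the index of the first paragraph that contains it."""
    seen = {}
    for index, paragraph in enumerate(paragraphs):
        for word in words:
            if word not in seen and word in paragraph:
                seen[word] = index
    return seen

paragraphs = split_paragraphs("我学中文。\n\n中文很难。")
positions = first_occurrences(paragraphs, ["中文", "很难"])
```

Without the blank line, the sample text would be treated as a single paragraph, and both words would be matched to it.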
Users can add words or definitions to the "Words to Annotate" section to supplement or override the existing dictionary. Enter each entry in the CEDICT format.
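The supplement-or-override behavior can be pictured as a dictionary merge in which user entries take precedence. The following Python sketch assumes entries are stored as a mapping from word to (pinyin, definitions); the function name merge_entries and the sample data are invented for illustration and do not come from the tool.

```python
def merge_entries(base, user):
    """Return a new dict in which user entries supplement or replace base entries."""
    merged = dict(base)   # copy, so the base dictionary is left unchanged
    merged.update(user)   # user entries win when a word appears in both
    return merged

# Illustrative data only
base = {"中文": ("zhong1 wen2", ["Chinese language"])}
user = {
    "中文": ("Zhong1 wen2", ["Chinese (written) language"]),  # overrides
    "网页": ("wang3 ye4", ["web page"]),                      # supplements
}
combined = merge_entries(base, user)
```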
The tool currently handles only GB-encoded text. Dictionary definitions are drawn from Paul Denisowski's CEDICT Chinese-English dictionary. If you find a word that does not have a definition, consider contributing it to the CEDICT project. The segmentation algorithm is still under development, and just what constitutes a "proper" Chinese word is itself a good research topic. You can download the segmenter code (in Perl) and run it yourself.
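For readers curious about what segmentation involves, here is a small Python sketch of forward maximum matching, a common dictionary-based baseline. This page does not say which algorithm the Perl segmenter uses, so this is a swapped-in technique for illustration only; the function name segment, the max_len parameter, and the one-word lexicon are all assumptions. The sketch also shows decoding GB-encoded bytes with Python's standard gb2312 codec.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest lexicon word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words

# GB-encoded bytes must be decoded before segmenting.
raw = "我学中文".encode("gb2312")
text = raw.decode("gb2312")
result = segment(text, {"中文"})
```

Characters not covered by the lexicon fall through as single-character words, which is one reason the question of what counts as a "proper" word affects the output.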
This page is a mirror of the annotator formerly available at Erik Peterson's On-line Chinese Tools site.
This page was last updated on 01/05/03.