Introduction to Chinese natural language processing

This book introduces Chinese language-processing issues and techniques to readers who already have a basic background in natural language processing (NLP). Since the major difference between Chinese and Western languages is at the word level, the book primarily focuses on Chinese morphological analy...

Bibliographic Details
Other Authors: Wong, Kam-Fai.
Format: Electronic
Language: English
Published: San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool Publishers, c2010.
Series: Synthesis lectures on human language technologies (Online), # 4.
Table of Contents:
  • 1. Introduction
  • What is Chinese NLP
  • About this book
  • 2. Words in Chinese
  • Introduction
  • Characters, morphemes, and words
  • Characters
  • Morphemes
  • Words
  • Word formation in Chinese
  • Disyllabic compounds
  • Trisyllabic compounds
  • Quadrasyllabic compounds
  • Other morphological processes in Chinese
  • Ionization
  • Word identification and segmentation
  • Summary
  • 3. Challenges in Chinese morphological processing
  • Introduction
  • Chinese characters
  • Large number of characters
  • Simplified and traditional characters
  • Variant characters
  • Dialect characters and dialectal use of standard characters
  • Multiple character encoding standards
  • Textual conventions
  • Printing format
  • Punctuation practice
  • Linguistic characteristics
  • Few formal morphological markings
  • Parts of speech
  • Homonyms and homographs
  • Ambiguity
  • OOV words
  • Regional variation
  • Stylistic variation
  • Summary
  • 4. Chinese word segmentation
  • Introduction
  • Two main challenges
  • Algorithms
  • Character-based approach
  • Word-based approach
  • Word segmentation ambiguity
  • Ambiguity definition
  • Disambiguation algorithms
  • Benchmarks
  • Standards
  • Bakeoff evaluation
  • Free tools
  • Chinese lexical analysis system
  • MSRSeg system
  • Summary
  • 5. Unknown word identification
  • Introduction
  • Unknown word detection and recognition
  • Chinese person name identification
  • Chinese organization name identification
  • Chinese place name recognition
  • Summary
  • 6. Word meaning
  • Basic meaning, concept, and relation
  • Frame, collocation, and verb valency
  • Chinese dictionaries/lexicons
  • CILIN
  • WordNets
  • HowNet
  • Chinese concept dictionary
  • Summary
  • 7. Chinese collocations
  • Concept of collocation
  • Definitions
  • Collocations in Chinese
  • Qualitative properties
  • Quantitative features
  • Categorization
  • Linguistic resources
  • Collocation dictionary of modern Chinese lexical words
  • Dictionary of modern Chinese collocations
  • Chinese collocation bank
  • Applications
  • Summary
  • 8. Automatic Chinese collocation extraction
  • Introduction
  • Window-based statistical approach
  • Co-occurrence frequency features
  • Co-occurrence distribution features
  • Context-based features
  • Window size
  • Syntax-based approach
  • POS pattern filter
  • Dependency triple-based extraction
  • Using chunk information
  • Semantic-based approach
  • Synonym substitution test
  • Translation testing
  • Categorization-based approach
  • Benchmarks
  • Dictionary of modern Chinese collocations (electronic version)
  • Collocations for 100 Chinese headwords
  • Summary
  • Appendix
  • References
  • Author biographies.