History

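  • Symbolic NLP (1950s – early 1990s)
    • Based on explicit rules and logical reasoning
    • Highly interpretable; can capture complex linguistic logic
    • Hard to write down every rule; copes poorly with flexible, varied expressions
  • Statistical NLP (1990s–2010s)
    • Relies on large amounts of data and probabilistic computation
    • Well suited to large-scale data
    • Needs a lot of training data; results cannot always be clearly explained
  • Neural NLP (present)
    • Loosely modeled on how the brain works; "learns" language automatically through successive layers of computation
    • Well suited to complex tasks such as translation, chat, and article generation
    • Computationally expensive, slow to train, and hard to interpret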
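
The contrast between the symbolic and statistical approaches above can be sketched on a toy sentiment task. This is only an illustration under stated assumptions: the keyword lists, the four training sentences, and all function names are hypothetical, and the statistical side uses a small naive Bayes classifier with add-one smoothing as one possible stand-in for "probabilistic computation".

```python
import math
from collections import Counter

# Symbolic approach: explicit, hand-written rules (hypothetical keyword lists).
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "awful"}

def rule_based_sentiment(text):
    """Return 'pos', 'neg', or 'unknown' using explicit keyword rules."""
    words = text.lower().split()
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    if pos > neg:
        return "pos"
    if neg > pos:
        return "neg"
    return "unknown"  # no rule fires: the rule set is incomplete

# Statistical approach: word counts estimated from labeled data.
def train_naive_bayes(examples):
    """Count words per label; returns (word_counts, label_counts, vocab)."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for w in text.lower().split():
            counts[w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def predict_naive_bayes(model, text):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for w in text.lower().split():
            if w not in vocab:
                continue  # words never seen in training are ignored
            score += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set.
TRAIN = [
    ("good movie", "pos"),
    ("loved the acting", "pos"),
    ("bad movie", "neg"),
    ("hated the plot", "neg"),
]
MODEL = train_naive_bayes(TRAIN)
```

With these assumptions, rule_based_sentiment("I loved it") returns "unknown" because no hand-written rule mentions "loved", while predict_naive_bayes(MODEL, "I loved it") returns "pos" because the word was seen in a positive training example. The rules are easy to read and explain; the learned counts adapt to data but are harder to inspect.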

Common tasks

  • Text and speech processing
    • Optical character recognition (OCR)
    • Speech recognition
    • Speech segmentation
    • Text-to-speech
    • Word segmentation (tokenization)
  • Morphological analysis
    • Lemmatization
    • Morphological segmentation
    • Part-of-speech tagging
    • Stemming
  • Syntactic analysis
    • Grammar induction
    • Sentence breaking (also known as “sentence boundary disambiguation”)
    • Parsing
  • Lexical semantics (of individual words in context)
    • Lexical semantics
    • Distributional semantics
    • Named entity recognition (NER)
    • Sentiment analysis (see also Multimodal sentiment analysis)
    • Terminology extraction
    • Word-sense disambiguation (WSD)
    • Entity linking
  • Relational semantics (semantics of individual sentences)
    • Relationship extraction
    • Semantic parsing
    • Semantic role labelling (see also implicit semantic role labelling below)
  • Discourse (semantics beyond individual sentences)
    • Coreference resolution
    • Discourse analysis
    • Implicit semantic role labelling
    • Recognizing textual entailment
    • Topic segmentation and recognition
    • Argument mining
  • Higher-level NLP applications
    • Automatic summarization (text summarization)
    • Grammatical error correction
    • Logic translation
    • Machine translation (MT)
    • Natural-language understanding (NLU)
    • Natural-language generation (NLG)
    • Book generation
    • Document AI
    • Dialogue management
    • Question answering
    • Text-to-image generation
    • Text-to-scene generation
    • Text-to-video
    • Large language models (LLMs)
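
A few of the lower-level tasks listed above (sentence breaking, word segmentation, and stemming) can be sketched with nothing but regular expressions. This is a deliberately naive illustration, not how production tools work: the function names are hypothetical, the sentence splitter knows nothing about abbreviations, and the stemmer is a toy suffix stripper rather than an established algorithm such as Porter's.

```python
import re

def split_sentences(text):
    """Naive sentence breaking: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Naive word segmentation: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def crude_stem(word):
    """Toy stemming: strip one common suffix if at least 3 characters remain."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

sentences = split_sentences("The cats jumped. They were running!")
tokens = [tokenize(s) for s in sentences]
stems = [[crude_stem(t.lower()) for t in sent] for sent in tokens]
```

The weaknesses are instructive. split_sentences("Dr. Smith arrived. He sat down.") wrongly breaks after "Dr.", which is why the task is also called sentence boundary disambiguation, and crude_stem("running") yields "runn" rather than "run", which is one reason lemmatization is listed as a separate task from stemming.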
