Interpretable-NLP

Interpretability is a growing area of Natural Language Processing (NLP) research, but few resources are available online. This seminar aims to draw a picture of interpretability in NLP research and to review interesting relevant topics.

Anyone interested in doing Interpretable NLP (e.g., in the SPOC lab) is welcome.

Meeting Time & Location

Tentative Topics

  1. Introduction (Jan 12, 2021)
    • Why should we care about interpretable NLP?
      • We want to build NLP systems with better performance.
      • “Good performance” requires much more than just “high accuracy”.
      • We want to build NLP systems that deploy well in society.
    • What does interpretable NLP research include?
      • Mainly the ACL/EMNLP/NAACL track “Interpretability and analysis of NLP models”. The BlackboxNLP workshop is also relevant.
      • Connections to FAccT, theory, psycholinguistics, and ML4H
  2. Background: language modeling, DNNs in NLP (Jan 19, 2021)
    • A view of NLP: it is a window for understanding knowledge & intelligence.
    • Many popular tasks and models (e.g., neural networks for language modeling) were developed toward this goal: LSA, probabilistic neural LMs, word2vec / GloVe, contextualized LMs, …
  3. Background: Interpretability, explainable AI (Feb 2, 2021)
    • Some principles of model interpretability.
    • Some early methods to interpret models, including:
      • A local, (almost-) linear, post-hoc method: LIME
      • A method based on Shapley values: SHAP
      • Attention-based methods
      • SVCCA
    • Interpretability to humans might be complicated.
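The local-surrogate idea behind LIME can be sketched in a few lines: sample perturbations around one instance, query the black box on them, and fit a proximity-weighted linear model whose coefficients serve as the local explanation. A minimal sketch, assuming a toy black-box scorer over binary word-presence features; the `black_box` function and the bag-of-words encoding below are illustrative stand-ins, not the actual LIME library:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy black box: scores a "sentence" encoded as 4 binary word-presence
# features. Purely illustrative -- it stands in for any opaque model.
def black_box(x):
    w = np.array([2.0, -1.0, 0.5, 0.0])
    return 1 / (1 + np.exp(-(x @ w - 0.5)))

rng = np.random.default_rng(0)
x0 = np.array([1, 1, 1, 1])  # the instance to explain (all words present)

# LIME-style recipe: perturb x0 by masking random words, weight each
# perturbation by its proximity to x0, then fit a linear surrogate.
Z = rng.integers(0, 2, size=(500, 4))         # perturbed neighbors of x0
proximity = np.exp(-np.sum(Z != x0, axis=1))  # closer masks weigh more
surrogate = Ridge(alpha=1.0)
surrogate.fit(Z, black_box(Z), sample_weight=proximity)

print(surrogate.coef_)  # per-word importances near x0
```

The surrogate's coefficients recover the relative influence of each word on the black box near `x0`, which is all a local, post-hoc, (almost-)linear method promises.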
  4. Topic: The geometry of embeddings (Feb 9, 2021)
    • Three viewpoints in interpreting the geometry:
      • Linear Analogy
      • Anisotropy
      • Manifold
    • How can we use this understanding to build better embeddings?
      • New methods could benefit old models!
      • Taking frequency & isotropy into account might help
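A rough anisotropy score for an embedding matrix is the mean cosine similarity between all pairs of vectors: near 0 means the directions are spread uniformly (isotropic), near 1 means they crowd into a narrow cone. A minimal sketch, with random Gaussian vectors standing in for real embeddings and a shared mean offset standing in for one common source of anisotropy:

```python
import numpy as np

def mean_cosine(X):
    """Average pairwise cosine similarity -- a simple anisotropy score."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = Xn @ Xn.T
    n = len(X)
    return (sims.sum() - n) / (n * (n - 1))  # exclude self-similarity

rng = np.random.default_rng(0)
iso = rng.standard_normal((200, 64))  # roughly isotropic cloud
aniso = iso + 5.0                     # shared offset -> a dominant direction

print(mean_cosine(iso), mean_cosine(aniso))
```

Removing the dominant mean direction from `aniso` would push its score back toward zero, which is the intuition behind post-processing tricks that improve old, pre-trained embeddings.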
  5. Topic: Probing (Feb 16, 2021)
    • What is a probe, and how do we probe?
      • Probe as a diagnostic classifier
        • Probe for semantic evidence, syntax, or other aspects in NLP pipeline
        • An information theory framework
      • Extending “probes” to, e.g., parameter-free variants
    • What can probes do?
      • Assess & remove bias
      • Assess the utility of features
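The diagnostic-classifier recipe is simple: freeze the model's representations, train a small classifier to predict a linguistic property from them, and compare against a control to check the probe is not just memorizing. A minimal sketch with synthetic "frozen" representations in which one dimension encodes a binary property; the data and property are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in frozen representations: 300 tokens x 16 dims, where dimension 3
# happens to encode a binary property (say, noun vs. verb).
reps = rng.standard_normal((300, 16))
labels = (reps[:, 3] > 0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(reps, labels, random_state=0)

probe = LogisticRegression().fit(Xtr, ytr)                      # diagnostic classifier
control = LogisticRegression().fit(Xtr, rng.permutation(ytr))   # shuffled-label control

print(probe.score(Xte, yte), control.score(Xte, yte))
```

A large gap between probe and control accuracy is the usual evidence that the property is encoded; information-theoretic framings refine this by asking how much *effort* the probe needs to extract it.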
  6. Topic: Psycholinguistic tests on NLP models (Feb 23, 2021)
    • Syntactic evaluation of LMs.
    • Pragmatic, semantic, and commonsense evaluations.
    • Specifically-designed tests (e.g., invariance tests).
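Targeted syntactic evaluation usually works through minimal pairs: present the LM with a grammatical sentence and an ungrammatical twin, and check that the LM assigns the grammatical one higher probability. A minimal sketch, with a tiny add-alpha-smoothed bigram model standing in for a real LM and an illustrative five-sentence corpus:

```python
import math
from collections import Counter

# Tiny corpus standing in for LM training data (illustrative).
corpus = [
    "the dog barks", "the dogs bark", "a dog barks",
    "the cat sleeps", "the cats sleep",
]

bigrams, unigrams = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split()
    unigrams.update(toks)
    bigrams.update(zip(toks, toks[1:]))

def log_prob(sent, alpha=0.1):
    """Add-alpha smoothed bigram log-probability -- a stand-in LM."""
    toks = ["<s>"] + sent.split()
    V = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * V))
        for a, b in zip(toks, toks[1:])
    )

# Minimal pair: the LM should prefer the grammatical variant.
good, bad = "the dogs bark", "the dogs barks"
print(log_prob(good) > log_prob(bad))
```

The same comparison scheme scales to real LMs and to semantic, pragmatic, and invariance tests: construct controlled pairs, score both sides, and count how often the model prefers the right one.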
  7. Topic: Spurious correlations, shortcut learning (March 2, 2021)
    • The “right for the wrong reasons” problem.
    • Approaches to this problem:
      • Changing dataset distributions.
      • Encouraging models to avoid the bias.
      • Training LMs on larger data.
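Shortcut learning is easy to reproduce in miniature: plant a token that spuriously co-occurs with one label in training, and the classifier will lean on it, so its predictions degrade on a challenge set where the correlation flips. A minimal sketch; the sentences and the spurious token "captain" are illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Training set: "captain" spuriously co-occurs with the POSITIVE label.
train_texts = [
    "the captain movie was great", "captain scene felt wonderful",
    "captain acting was superb", "the plot was dull",
    "the acting felt boring", "a dull tedious film",
]
train_y = [1, 1, 1, 0, 0, 0]

# Challenge set: the spurious token now appears in NEGATIVE reviews.
test_texts = ["the captain movie was dull", "captain acting felt boring"]
test_y = [0, 0]

vec = CountVectorizer().fit(train_texts)
clf = LogisticRegression().fit(vec.transform(train_texts), train_y)

print(clf.score(vec.transform(test_texts), test_y))  # the shortcut hurts here
```

The model is “right for the wrong reasons” in-distribution; dataset rebalancing and debiasing objectives both target exactly this failure mode.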
  8. Topic: Influence of samples, understanding the datasets (March 16, 2021)
    • Perturbing the samples
    • Influence functions
    • Anchors, features, and adversarial examples
    • Studying the datasets
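The simplest sample-perturbation method is occlusion: delete each token in turn and record how much the prediction shifts; tokens whose removal moves the output most are the most influential. A minimal sketch on a toy bag-of-words sentiment classifier; the classifier, corpus, and sentence are illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great fun film", "a great film", "boring slow film", "a slow story"]
y = [1, 1, 0, 0]

vec = CountVectorizer().fit(texts)
clf = LogisticRegression().fit(vec.transform(texts), y)

def occlusion_importance(sentence):
    """Drop each token in turn; importance = drop in P(positive)."""
    toks = sentence.split()
    base = clf.predict_proba(vec.transform([sentence]))[0, 1]
    return {
        t: base - clf.predict_proba(
            vec.transform([" ".join(toks[:i] + toks[i + 1:])])
        )[0, 1]
        for i, t in enumerate(toks)
    }

scores = occlusion_importance("a great film")
print(scores)
```

Influence functions apply the same counterfactual question to *training* samples instead of input tokens: how would the prediction change if this example were removed or upweighted?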

=== We are pausing this seminar. Look forward to resuming in summer 2021 ===

  1. Topic: Rationales in classification
  2. Topic: Causal interpretations
  3. Topic: Theoretical interpretations
  4. Topic: Fairness, Accountability, Transparency (FAccT)
  5. Topic: Adversarial Attacks, Robustness, OOD generalization
    • Adversarial attacks (Ref: Lecture 12 in CS 335)
    • Robustness
    • OOD generalization
      • Robust decision rules (“causal inductive biases”) are those that generalize across domains
      • How can we accelerate progress towards human-like language generalization? (Linzen, 2020)
  6. Topic: Interpretability and Dialogue