Interpretable NLP, Winter 2021

In winter 2021, we ran a seminar discussing interesting papers on interpretability in NLP. Here is the list of topics, with links to slides. This is the first iteration; I expect a refined version in the future.

  1. Introduction (Jan 12, 2021) slides

    • Why should we care about interpretable NLP?
      • We want to build NLP systems with better performance.
      • “Good performance” requires much more than just “high accuracy”.
      • We want to build NLP systems that deploy well in society.
    • What does interpretable NLP research include?
      • Mainly about the ACL/EMNLP/NAACL track “Interpretability and analysis of NLP models”. BlackboxNLP workshop is also relevant.
      • Connection to: FAccT, theory, psycholinguistics, ML4H
  2. Background: language modeling, DNNs in NLP (Jan 19, 2021) slides

    • A view of NLP: it is a window for understanding knowledge & intelligence.
    • Many popular tasks and models (e.g., neural networks for language modeling) are developed along this goal: LSA, probabilistic neural LM, word2vec / GloVe, contextualized LM, …
  3. Background: Interpretability, explainable AI (Feb 2, 2021) slides

    • Some principles of model interpretability.
    • Some early methods to interpret models, including:
      • A local, (almost-) linear, post-hoc method: LIME
      • A method based on Shapley values: SHAP
      • Attention-based methods
      • SVCCA
    • Interpretability to humans might be complicated.
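To make the LIME idea concrete, here is a minimal sketch (the `black_box` scorer and all names are hypothetical, and the real LIME library differs in many details): perturb the input by dropping words, query the black-box model on the perturbations, and fit a proximity-weighted linear surrogate whose coefficients serve as per-word attributions.

```python
import numpy as np

# A toy black-box "sentiment scorer" (hypothetical stand-in for a real model;
# LIME only needs prediction access, not gradients or internals).
def black_box(sentences):
    return np.array([s.count("great") - s.count("bad") for s in sentences], dtype=float)

def lime_explain(sentence, n_samples=500, seed=0):
    """Fit a local linear surrogate over word-presence masks (LIME's core idea)."""
    rng = np.random.default_rng(seed)
    words = sentence.split()
    masks = rng.integers(0, 2, size=(n_samples, len(words)))   # 1 = keep the word
    masks[0] = 1                                               # include the original
    texts = [" ".join(w for w, m in zip(words, row) if m) for row in masks]
    y = black_box(texts)
    # Weight perturbed samples by proximity to the original sentence.
    weights = np.exp(-(len(words) - masks.sum(axis=1)) / len(words))
    X = np.hstack([np.ones((n_samples, 1)), masks])            # intercept + features
    W = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(X * W, y * W[:, 0], rcond=None)
    return dict(zip(words, coef[1:]))                          # per-word attributions

attr = lime_explain("a great movie with a bad ending")
```

Because the toy model is exactly linear in word presence, the surrogate recovers a large positive weight for "great" and a large negative weight for "bad"; with a real model, the weights are only locally faithful.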
  4. Topic: The geometry of embeddings (Feb 9, 2021) slides

    • Three viewpoints for interpreting the geometry:
      • Linear Analogy
      • Anisotropy
      • Manifold
    • How can we use this understanding to build better embeddings?
      • New methods could benefit old models!
      • Considering frequency & isotropy might help
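One way to see anisotropy concretely: the average pairwise cosine similarity of a set of embeddings is near zero when they are isotropic, but high when they crowd into a narrow cone. A minimal sketch with synthetic vectors, using mean-centering as one simple isotropy fix (the data and numbers are made up for illustration):

```python
import numpy as np

def avg_cosine(X):
    """Mean pairwise cosine similarity: ~0 for isotropic embeddings,
    close to 1 if the vectors crowd into a narrow cone (anisotropy)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    n = len(Xn)
    sims = Xn @ Xn.T
    return (sims.sum() - n) / (n * (n - 1))   # exclude self-similarity

rng = np.random.default_rng(0)
# Simulate anisotropic embeddings: isotropic noise plus a large shared offset.
emb = rng.normal(size=(200, 64)) + 5.0 * rng.normal(size=(1, 64))

before = avg_cosine(emb)                  # high: a narrow cone
after = avg_cosine(emb - emb.mean(axis=0))  # near zero after mean-centering
```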
  5. Topic: Probing (Feb 16, 2021) slides

    • What is a probe, and how do we probe?
      • Probe as a diagnostic classifier
        • Probing for semantic evidence, syntax, or other aspects of the NLP pipeline
        • An information theory framework
      • Extending “probe” beyond classifiers, e.g., to parameter-free probes
    • What can probes do?
      • Assess & remove bias
      • Assess the utility of features
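A diagnostic classifier can be sketched in a few lines: freeze the representations, train a small supervised model on top, and read held-out probing accuracy as evidence that the property is (linearly) encoded. The synthetic "representations" below are hypothetical stand-ins for frozen model activations:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical frozen representations: 100-dim vectors in which dimension 3
# linearly encodes a binary property (say, past vs. present tense).
labels = rng.integers(0, 2, size=500)
reps = rng.normal(size=(500, 100))
reps[:, 3] += 2.0 * (2 * labels - 1)        # inject the property as a +/-2 shift

def train_probe(X, y, lr=0.1, steps=300):
    """A diagnostic classifier: logistic regression over frozen representations."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))        # predicted probability of the property
        w -= lr * X.T @ (p - y) / len(y)    # gradient step on the logistic loss
    return w

w = train_probe(reps[:400], labels[:400])
acc = ((reps[400:] @ w > 0) == labels[400:]).mean()   # held-out probing accuracy
```

High probing accuracy here only shows the property is extractable; as the information-theoretic framing in the slides discusses, it does not by itself show the model uses it.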
  6. Topic: Behavioral tests on NLP models (Feb 23, 2021) slides

    • Syntactic evaluation of LMs.
    • Pragmatic, semantic, and commonsense evaluations.
    • Specifically-designed tests (e.g., invariance tests).
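An invariance test, in the spirit of CheckList-style behavioral testing, can be as simple as checking that predictions are unchanged under meaning-preserving edits. A toy sketch (the `sentiment` model and the templates are made up):

```python
# A toy stand-in for a real model under test (purely hypothetical).
def sentiment(text):
    return "pos" if "love" in text else "neg"

def invariance_test(template, fillers, model):
    """Pass iff the model's prediction is identical across all fill-ins."""
    preds = {model(template.format(name)) for name in fillers}
    return len(preds) == 1

# Swapping a person's name should not change the predicted sentiment.
passed = invariance_test("{} loved this film.", ["Alice", "Bob", "Priya"], sentiment)
```

Real behavioral suites scale this up with many templates and perturbation types (typos, negation, paraphrase), each probing a distinct capability.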
  7. Topic: Spurious correlations, shortcut learning (March 2, 2021) slides

    • The “right for the wrong reasons” problem.
    • Solving this problem:
      • Changing dataset distributions.
      • Letting models avoid the bias.
      • Training LMs on larger data.
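The "right for the wrong reasons" failure mode is easy to reproduce synthetically: give a model a noisy true feature plus a nearly noise-free shortcut feature that correlates with the label only at training time. A minimal sketch (all data synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, shortcut_agrees=True):
    """Toy data: feature 1 truly predicts the label (noisily); feature 2 is a
    nearly noise-free shortcut that agrees with the label only at train time."""
    y = rng.integers(0, 2, size=n).astype(float)
    signal = y + 0.5 * rng.normal(size=n)
    shortcut = (y if shortcut_agrees else 1 - y) + 0.05 * rng.normal(size=n)
    return np.stack([np.ones(n), signal, shortcut], axis=1), y

X_tr, y_tr = make_data(400, shortcut_agrees=True)
X_te, y_te = make_data(400, shortcut_agrees=False)   # shortcut flipped at test time

w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)      # least-squares classifier
acc = lambda X, y: ((X @ w > 0.5) == y).mean()
train_acc, test_acc = acc(X_tr, y_tr), acc(X_te, y_te)
# The model leans almost entirely on the cleaner shortcut feature: near-perfect
# on train, near-zero on the anti-correlated test set.
```

This is the pattern behind many NLI and QA artifacts: the shortcut is cheaper to fit than the true signal, so the model prefers it.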
  8. Topic: Influence of samples, understanding the datasets (March 16, 2021) slides

    • Perturbing the samples
    • Influence functions
    • Anchors, features, and adversarial examples
    • Studying the datasets
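The influence of a training sample can be measured brute-force by leave-one-out retraining: remove the sample, refit, and see how a test prediction's loss changes. Influence functions approximate exactly this quantity without retraining. A toy sketch with linear regression (all data synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic regression data; the "model" is ordinary least squares.
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)
x_test = rng.normal(size=3)
y_test = x_test @ true_w

def fit(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)   # OLS via the normal equations

def loo_influence(i):
    """Brute-force influence of sample i: change in test loss when i is removed.
    Influence functions estimate this without the per-sample refits."""
    keep = np.arange(len(X)) != i
    w_full, w_loo = fit(X, y), fit(X[keep], y[keep])
    loss = lambda w: (x_test @ w - y_test) ** 2
    return loss(w_loo) - loss(w_full)

influences = np.array([loo_influence(i) for i in range(len(X))])
```

Samples with the most positive influence are the ones most "responsible" for the test prediction; brute force is fine at this scale but is exactly what becomes intractable for deep models.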