Posts by Collection




Metric Perspective of Stochastic Optimizers


In this talk, I explain several major stochastic optimizers from the perspective of the metric, that is, the notion of distance that defines the geometry of the model's parameter space. The talk covers the following algorithms:

  • Quasi-Newton Method Type
    • Finite-Difference Method: SGD-QN, AdaDelta, VSGD
    • Extended Gauss-Newton: KSD, SMD, HF
    • L-BFGS: Stochastic L-BFGS, RES
  • Natural Gradient Type: Natural Gradient, TONGA
  • Root Mean Square Type: AdaGrad, RMSProp, Adam
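One way to read the metric framing is that each family above chooses a different matrix G in the update θ ← θ − η G⁻¹ ∇f(θ): the identity for plain SGD, a curvature estimate for the quasi-Newton type, the Fisher information matrix for the natural gradient type, and a running root mean square of past gradients for the RMS type. The sketch below is only an illustration of that reading, not material from the talk. It is restricted to diagonal metrics, and the function names and the toy quadratic objective are assumptions made for the example. Adam's momentum and bias-correction terms are left out.

```python
import math

def preconditioned_step(theta, grad, metric_diag, lr):
    """One update theta <- theta - lr * G^{-1} grad, for a diagonal metric G."""
    return [t - lr * g / m for t, g, m in zip(theta, grad, metric_diag)]

def sgd_metric(grad, state):
    # Euclidean metric: G is the identity, so the step is plain gradient descent.
    return [1.0 for _ in grad]

def adagrad_metric(grad, state, eps=1e-8):
    # AdaGrad-style metric: square root of the accumulated squared gradients.
    state["sum_sq"] = [s + g * g for s, g in zip(state["sum_sq"], grad)]
    return [math.sqrt(s) + eps for s in state["sum_sq"]]

def rmsprop_metric(grad, state, decay=0.9, eps=1e-8):
    # RMSProp-style metric: the sum is replaced by an exponential moving average.
    state["avg_sq"] = [decay * a + (1 - decay) * g * g
                       for a, g in zip(state["avg_sq"], grad)]
    return [math.sqrt(a) + eps for a in state["avg_sq"]]

def grad_quadratic(theta):
    # Gradient of the toy objective f(theta) = 0.5 * (100 * theta0^2 + theta1^2).
    return [100.0 * theta[0], theta[1]]

def run(metric_fn, lr, steps=50):
    # Apply the same update rule repeatedly; only the metric changes.
    theta = [1.0, 1.0]
    state = {"sum_sq": [0.0, 0.0], "avg_sq": [0.0, 0.0]}
    for _ in range(steps):
        g = grad_quadratic(theta)
        theta = preconditioned_step(theta, g, metric_fn(g, state), lr)
    return theta

if __name__ == "__main__":
    print("SGD    ", run(sgd_metric, lr=0.01))
    print("AdaGrad", run(adagrad_metric, lr=0.1))
    print("RMSProp", run(rmsprop_metric, lr=0.01))
```

The update rule is shared by all three runs. Swapping `metric_fn` is the only change, which is the point of viewing these optimizers through their metric.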

Relational Knowledge and Language Models


In this talk, I explain a few recent studies on probing pretrained language models for commonsense and relational knowledge. The following papers are covered:

  • Petroni, et al. “Language models as knowledge bases?” 2019
  • Jiang, et al. “How can we know what language models know?” 2019
  • Bouraoui, et al. “Inducing relational knowledge from BERT.” 2019
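
As a rough illustration of the cloze-style probing setup these papers build on, a relational fact can be turned into a fill-in-the-blank prompt and the model's top prediction compared with the known object. The sketch below is an assumption-laden toy, not code from any of the papers: the templates, the relation names, and `toy_predict` (a stand-in for a real masked language model) are all hypothetical.

```python
MASK = "[MASK]"

# Hypothetical hand-written templates, one per relation.
TEMPLATES = {
    "born_in": "{subject} was born in {object}.",
    "capital_of": "{subject} is the capital of {object}.",
}

def build_cloze(subject, relation):
    """Turn (subject, relation) into a prompt with the object masked out."""
    return TEMPLATES[relation].format(subject=subject, object=MASK)

def precision_at_1(triples, predict):
    """Fraction of triples whose gold object is the predictor's top answer."""
    hits = 0
    for subject, relation, gold in triples:
        if predict(build_cloze(subject, relation)) == gold:
            hits += 1
    return hits / len(triples)

def toy_predict(prompt):
    # Stand-in for a masked language model: a lookup table that knows one fact.
    lookup = {"Dante was born in [MASK].": "Florence"}
    return lookup.get(prompt, "unknown")

# Illustrative triples only; a real probe would draw them from a knowledge base.
triples = [
    ("Dante", "born_in", "Florence"),
    ("Paris", "capital_of", "France"),
]
score = precision_at_1(triples, toy_predict)
print(score)
```

Because the score depends on how each template is worded, the choice of prompt is itself a variable in this kind of probing.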