SDCA4CRF: Adaptive Stochastic Dual Coordinate Ascent for training Conditional Random Fields.

drawing

People

Abstract

This work investigates the training of conditional random fields (CRFs) via the stochastic dual coordinate ascent (SDCA) algorithm of Shalev-Shwartz and Zhang (2016). SDCA enjoys a linear convergence rate and a strong empirical performance for binary classification problems. However, it has never been used to train CRFs. Yet it benefits from an “exact” line search with a single marginalization oracle call, unlike previous approaches. In this paper, we adapt SDCA to train CRFs, and we enhance it with an adaptive non-uniform sampling strategy based on block duality gaps. We perform experiments on four standard sequence prediction tasks. SDCA demonstrates performances on par with the state of the art, and improves over it on three of the four datasets, which have in common the use of sparse features.

Paper

UAI 2018 (to be published).
Presented at the NIPS workshop OPT2017 (collaboration with Ahmed Touati) Article

BibTeX

@inproceedings{sdca4crf2018lepriol,
      author    = {Le Priol, R\'emi and Pich\'e, Alexandre and Lacoste-Julien, Simon},
      title     = {Adaptive Stochastic Dual Coordinate Ascent for training Conditional Random Fields},
      booktitle = {UAI},
      year      = {2018}
}

Reproducibility Code

We developped a Python-Numpy implementation of SDCA for training chain-structured conditional random field models with L2-regularization. SDCA fares against optimization a wide class of algorithms developped by Mark Schmidt in the SAG4CRF project.

We reuse the experimental setup from SAG4CRF, with the same datasets and features. We release pre-processed features (courtesy of Mark Schmidt) with the code so as to reproduce training. Original datasets can be found here:

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.