Hugging Face Transformers
Transformers is an opinionated library built for NLP researchers seeking to use, study, and extend large-scale transformer models.

The library was designed with two strong goals in mind:

  • Be as easy and fast to use as possible:

    • we strongly limit the number of user-facing abstractions to learn; in fact, there are almost no abstractions, just three standard classes required to use each model: configuration, model, and tokenizer,

    • all of these classes can be initialized in a simple and unified way from pretrained instances using a common from_pretrained() instantiation method, which takes care of downloading (if needed), caching, and loading the relevant class from a pretrained instance supplied with the library or from your own saved instance (see the sketch after this list),

    • as a consequence, this library is NOT a modular toolbox of building blocks for neural nets. If you want to extend or build upon the library, just use regular Python/PyTorch modules and inherit from the base classes of the library to reuse functionality such as model loading and saving.

  • Provide state-of-the-art models with performance as close as possible to the original models:

    • we provide at least one example for each architecture that reproduces a result provided by the official authors of said architecture,
    • the code is usually as close to the original code base as possible, which means some PyTorch code may not be as pytorchic as it could be as a result of being converted from TensorFlow code.
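As a minimal sketch of this unified instantiation pattern, assuming a recent version of the transformers package is installed and using the bert-base-uncased checkpoint purely as an example:

```python
from transformers import BertConfig, BertModel, BertTokenizer

# The same from_pretrained() method works for all three classes; it downloads
# the pretrained files if needed, caches them locally, and loads the instance.
config = BertConfig.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
```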

Three types of classes for each model

  • Model classes, e.g., BertModel, which are 20+ PyTorch models (torch.nn.Module subclasses) that work with the pretrained weights provided in the library. In TF 2.0, these are tf.keras.Model subclasses.

  • Configuration classes, which store all the parameters required to build a model, e.g., BertConfig. You don’t always need to instantiate these yourself. In particular, if you are using a pretrained model without any modification, creating the model will automatically take care of instantiating the configuration (which is part of the model).

  • Tokenizer classes, which store the vocabulary for each model and provide methods for encoding/decoding strings into lists of token embedding indices to be fed to a model, e.g., BertTokenizer.
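A short sketch of how these three classes fit together, assuming transformers and PyTorch are installed (the input string and checkpoint name are arbitrary examples):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")  # the config is created for you

# Encode a string into vocabulary indices and feed them to the model.
input_ids = torch.tensor([tokenizer.encode("Hello, world!", add_special_tokens=True)])
with torch.no_grad():
    outputs = model(input_ids)
hidden_states = outputs[0]  # last-layer hidden states, shape (1, seq_len, hidden_size)
```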

All these classes can be instantiated from pretrained instances and saved locally using two methods:

  • from_pretrained() lets you instantiate a model/configuration/tokenizer from a pretrained version either provided by the library itself (currently 27 models are provided, as listed here) or stored locally (or on a server) by the user,

  • save_pretrained() lets you save a model/configuration/tokenizer locally so that it can be reloaded using from_pretrained().
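A minimal save/reload round-trip sketch, assuming a recent transformers version; the ./my-bert directory is a hypothetical local path:

```python
import os
from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Save model weights (plus configuration) and tokenizer vocabulary to a local directory.
os.makedirs("./my-bert", exist_ok=True)
model.save_pretrained("./my-bert")
tokenizer.save_pretrained("./my-bert")

# Reload later from that same directory with the usual from_pretrained() call.
model = BertModel.from_pretrained("./my-bert")
tokenizer = BertTokenizer.from_pretrained("./my-bert")
```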