4. References

1. Montserrat Marimon, Aitor Gonzalez-Agirre, Ander Intxaurrondo, Heidy Rodriguez, Jose Lopez Martin, Marta Villegas, and Martin Krallinger. Automatic de-identification of medical texts in Spanish: the MEDDOCAN track, corpus, guidelines, methods and evaluation of results. In IberLEF@SEPLN, 2019.

2. Amber Stubbs and Özlem Uzuner. Annotating longitudinal clinical narratives for de-identification: the 2014 i2b2/UTHealth corpus. J. Biomed. Inform., 58 Suppl:S20–S29, December 2015.

3. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv, 2019.

4. Guillaume Lample and Alexis Conneau. Cross-lingual language model pretraining. In NeurIPS, 2019.

5. Stefan Schweter and A. Akbik. FLERT: document-level features for named entity recognition. ArXiv, 2020.

6. A. Akbik, Tanja Bergmann, Duncan A. J. Blythe, Kashif Rasul, Stefan Schweter, and Roland Vollgraf. FLAIR: an easy-to-use framework for state-of-the-art NLP. In NAACL, 2019.

7. Zhiheng Huang, Wei Xu, and Kai Yu. Bidirectional LSTM-CRF models for sequence tagging. ArXiv, 2015.

8. A. Akbik, Duncan A. J. Blythe, and Roland Vollgraf. Contextual string embeddings for sequence labeling. In COLING, 2018.

9. Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In ICLR, 2013.

10. Guillaume Wenzek, Marie-Anne Lachaux, Alexis Conneau, Vishrav Chaudhary, Francisco Guzmán, Armand Joulin, and Edouard Grave. CCNet: extracting high quality monolingual datasets from web crawl data. In LREC, 2020.

11. José Cañete, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, and Jorge Pérez. Spanish pre-trained BERT model and evaluation data. In PML4DC at ICLR, 2020.

12. Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.

13. Maria Antoniak and David Mimno. Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics, 6:107–119, 2018.

14. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. In NAACL, 2018.

15. Lukas Lange, Heike Adel, and Jannik Strötgen. NLNDE: the neither-language-nor-domain-experts' way of Spanish medical document de-identification. In IberLEF@SEPLN, 2019.

16. Erik Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In CoNLL, 2003.

17. Alec Radford, Rafal Józefowicz, and Ilya Sutskever. Learning to generate reviews and discovering sentiment. ArXiv, 2017.

18. Yukun Zhu, Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 19–27, 2015.

19. Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In ICLR, 2019.

20. Leslie N. Smith. A disciplined approach to neural network hyper-parameters: part 1 – learning rate, batch size, momentum, and weight decay. ArXiv, 2018.

21. Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. HuggingFace's Transformers: state-of-the-art natural language processing. ArXiv, 2019.

22. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: a robustly optimized BERT pretraining approach. ArXiv, 2019.

23. Taku Kudo and John Richardson. SentencePiece: a simple and language independent subword tokenizer and detokenizer for neural text processing. In EMNLP, 2018.