4. References

1. Montserrat Marimon, Aitor Gonzalez-Agirre, Ander Intxaurrondo, Heidy Rodriguez, Jose Lopez Martin, Marta Villegas, and Martin Krallinger. Automatic de-identification of medical texts in Spanish: the MEDDOCAN track, corpus, guidelines, methods and evaluation of results. In IberLEF@SEPLN, 2019.

2. Amber Stubbs and Özlem Uzuner. Annotating longitudinal clinical narratives for de-identification: the 2014 i2b2/UTHealth corpus. J. Biomed. Inform., 58 Suppl:S20–S29, December 2015.

3. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv, 2019.

4. Guillaume Lample and Alexis Conneau. Cross-lingual language model pretraining. In NeurIPS, 2019.

5. Stefan Schweter and A. Akbik. FLERT: document-level features for named entity recognition. ArXiv, 2020.

6. A. Akbik, Tanja Bergmann, Duncan A. J. Blythe, Kashif Rasul, Stefan Schweter, and Roland Vollgraf. FLAIR: an easy-to-use framework for state-of-the-art NLP. In NAACL, 2019.

7. Zhiheng Huang, Wei Xu, and Kai Yu. Bidirectional LSTM-CRF models for sequence tagging. ArXiv, 2015.

8. A. Akbik, Duncan A. J. Blythe, and Roland Vollgraf. Contextual string embeddings for sequence labeling. In COLING, 2018.

9. Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In ICLR, 2013.

10. Guillaume Wenzek, Marie-Anne Lachaux, Alexis Conneau, Vishrav Chaudhary, Francisco Guzmán, Armand Joulin, and Edouard Grave. CCNet: extracting high quality monolingual datasets from web crawl data. In LREC, 2020.

11. José Cañete, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, and Jorge Pérez. Spanish pre-trained BERT model and evaluation data. In PML4DC at ICLR, 2020.

12. Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.

13. Maria Antoniak and David Mimno. Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics, 6:107–119, 2018.

14. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. In NAACL, 2018.

15. Lukas Lange, Heike Adel, and Jannik Strötgen. NLNDE: the neither-language-nor-domain-experts' way of Spanish medical document de-identification. In IberLEF@SEPLN, 2019.

16. Erik Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In CoNLL, 2003.

17. Alec Radford, Rafal Józefowicz, and Ilya Sutskever. Learning to generate reviews and discovering sentiment. ArXiv, 2017.

18. Yukun Zhu, Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 19–27, 2015.

19. Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In ICLR, 2019.

20. Leslie N. Smith. A disciplined approach to neural network hyper-parameters: part 1 – learning rate, batch size, momentum, and weight decay. ArXiv, 2018.

21. Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. HuggingFace's Transformers: state-of-the-art natural language processing. ArXiv, 2019.

22. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: a robustly optimized BERT pretraining approach. ArXiv, 2019.

23. Taku Kudo and John Richardson. SentencePiece: a simple and language independent subword tokenizer and detokenizer for neural text processing. In EMNLP, 2018.