Several extensions and variants of our work can be pursued in the future. A first direction is the transferability of our domain adaptation approach to tasks other than event detection and to other forms of adaptation. The experiments could also be repeated under different settings (e.g., another transformer, other domains) to study how the method behaves beyond the configuration used here, and the proposed approach could be applied to further domain-shifted tasks. Adapting to shifts in data distributions is particularly valuable: although the vast majority of deep learning systems are trained and evaluated on a fixed data distribution, real-world models are frequently applied in out-of-domain scenarios, which degrades performance. The datasets used in this work model this scenario, but they still differ considerably from industrial data. Evaluating the proposed strategy on data from actual industrial settings would further confirm the value of unsupervised domain adaptation research.

The focus of this study is unsupervised domain adaptation in a single-source, single-target setting, where labeled data are available for the source domain and only unlabeled data for the target domain. In the future, we intend to extend our study to additional settings. A multi-domain setting, in which data are available from several sources, could be considered; this variant, known as “multi-source domain adaptation”, allows labeled data to be collected from multiple sources with different distributions. The lack of labeled target data is a typical issue in UDA; annotating and augmenting the data would provide more reliable data for testing the models. In addition, we plan to apply the model to other domains and to use other transformers that have shown promising performance, such as ELECTRA [54] and XLNet [23]. Another interesting direction is prompt engineering [55] for fine-tuning in a zero-shot or few-shot setting, which could be combined with generative models for UDA [56].
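As a rough illustration of how alternative backbones such as ELECTRA or XLNet could be dropped into the pipeline, the sketch below loads each model behind a common token-classification interface. This is a minimal sketch assuming the Hugging Face transformers library; the checkpoint names and the three-label event-trigger tag set are illustrative assumptions, not the configuration actually used in this thesis.

```python
# Hypothetical backbone swap for token-level event detection (sketch only).
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Candidate backbones mentioned as future work; any checkpoint with a
# token-classification head can be loaded the same way.
BACKBONES = ["google/electra-base-discriminator", "xlnet-base-cased"]

NUM_LABELS = 3  # assumed trigger tag set, e.g. O / B-EVENT / I-EVENT


def load_backbone(name: str):
    """Load a tokenizer/encoder pair so the rest of the pipeline stays unchanged."""
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForTokenClassification.from_pretrained(name, num_labels=NUM_LABELS)
    return tokenizer, model


if __name__ == "__main__":
    for name in BACKBONES:
        tokenizer, model = load_backbone(name)
        enc = tokenizer("The earthquake struck the coastal town.", return_tensors="pt")
        logits = model(**enc).logits  # shape: (1, seq_len, NUM_LABELS)
        print(name, tuple(logits.shape))
```

Keeping the backbone behind a single loading function would make such comparisons a configuration change rather than a code change.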

Bibliography

[1] A. Gliozzo and C. Strapparava, Semantic Domains in Computational Linguistics. Humanities, Social Science and Law, Springer Berlin Heidelberg, 2009.

[2] P. Hanks, “Do word meanings exist?,” Computers and the Humanities, vol. 34, no. 1/2, pp. 205–215, 2000.

[3] S. Sridharan and B. Murphy, “Modeling word meaning: Distributional semantics and the corpus quality-quantity trade-off,” in Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon, (Mumbai, India), pp. 53–68, The COLING 2012 Organizing Committee, Dec. 2012.

[4] Z. S. Harris, “Distributional structure,” Word, vol. 10, no. 2-3, pp. 146–162, 1954.

[5] B. Nerlich, Z. Todd, V. Herman, and D. D. Clarke, Polysemy: Flexible patterns of meaning in mind and language, vol. 142. Walter de Gruyter, 2011.

[6] H. Daumé, “Frustratingly easy domain adaptation,” 2009.

[7] J. Jiang and C. Zhai, “Instance weighting for domain adaptation in NLP,” in Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, (Prague, Czech Republic), pp. 264–271, Association for Computational Linguistics, June 2007.

[8] A. Axelrod, X. He, and J. Gao, “Domain adaptation via pseudo in-domain data selection,” pp. 355–362, 2011.

[9] Y. Tsvetkov, M. Faruqui, W. Ling, B. MacWhinney, and C. Dyer, “Learning the curriculum with Bayesian optimization for task-specific word representation learning,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), (Berlin, Germany), pp. 130–139, Association for Computational Linguistics, Aug. 2016.

[10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.

[11] J. Pennington, R. Socher, and C. Manning, “GloVe: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), (Doha, Qatar), pp. 1532–1543, Association for Computational Linguistics, Oct. 2014.

[12] P. Kameswara Sarma, “Learning word embeddings for data sparse and sentiment rich data sets,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, (New Orleans, Louisiana, USA), pp. 46–53, Association for Computational Linguistics, June 2018.

[13] H. Hotelling, “Relations between two sets of variates,” Biometrika, vol. 28, pp. 321–377, 1936.

[14] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, “Canonical correlation analysis: An overview with application to learning methods,” Neural Computation, vol. 16, no. 12, pp. 2639–2664, 2004.

[15] A. Conneau, D. Kiela, H. Schwenk, L. Barrault, and A. Bordes, “Supervised learning of universal sentence representations from natural language inference data,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, (Copenhagen, Denmark), pp. 670–680, Association for Computational Linguistics, Sept. 2017.

[16] D. Bollegala, T. Maehara, and K.-i. Kawarabayashi, “Unsupervised cross-domain word representation learning,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), (Beijing, China), pp. 730–740, Association for Computational Linguistics, July 2015.

[17] J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), (Melbourne, Australia), pp. 328–339, Association for Computational Linguistics, July 2018.

[18] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), (New Orleans, Louisiana), pp. 2227–2237, Association for Computational Linguistics, June 2018.

[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.

[20] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al., “Improving language understanding by generative pre-training,” 2018.

[21] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., “Language models are unsupervised multitask learners,” OpenAI blog, vol. 1, no. 8, p. 9, 2019.

[22] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, et al., “Exploring the limits of transfer learning with a unified text-to-text transformer,” J. Mach. Learn. Res., vol. 21, no. 140, pp. 1–67, 2020.

[23] Z. Yang, Z. Dai, Y. Yang, J. G. Carbonell, R. Salakhutdinov, and Q. V. Le, “Xlnet: Generalized autoregressive pretraining for language understanding,” CoRR, vol. abs/1906.08237, 2019.

[24] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), (Minneapolis, Minnesota), pp. 4171–4186, Association for Computational Linguistics, June 2019.

[25] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized bert pretraining approach,” arXiv preprint arXiv:1907.11692, 2019.

[26] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter,” arXiv preprint arXiv:1910.01108, 2019.

[27] A. Radford and K. Narasimhan, “Improving language understanding by generative pre-training,” 2018.

[28] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.

[29] Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, et al., “Google’s neural machine translation system: Bridging the gap between human and machine translation,” arXiv preprint arXiv:1609.08144, 2016.

[30] T.-H. Lin, T.-C. Chi, and A. Rumshisky, “On task-adaptive pretraining for dialogue response selection,” 2022.

[31] J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), (Melbourne, Australia), pp. 328–339, Association for Computational Linguistics, July 2018.

[32] S. Gururangan, A. Marasović, S. Swayamdipta, K. Lo, I. Beltagy, D. Downey, and N. A. Smith, “Don’t stop pretraining: adapt language models to domains and tasks,” arXiv preprint arXiv:2004.10964, 2020.

[33] J. Blitzer, R. McDonald, and F. Pereira, “Domain adaptation with structural correspondence learning,” in Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, (Sydney, Australia), pp. 120–128, Association for Computational Linguistics, July 2006.

[34] Y. Ziser and R. Reichart, “Pivot based language modeling for improved neural domain adaptation,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), (New Orleans, Louisiana), pp. 1241–1251, Association for Computational Linguistics, June 2018.

[35] “UDA for event detection using domain-specific adapters.”

[36] T. Chen, S. Huang, F. Wei, and J. Li, “Pseudo-label guided unsupervised domain adaptation of contextual embeddings,” in Proceedings of the Second Workshop on Domain Adaptation for NLP, (Kyiv, Ukraine), pp. 9–15, Association for Computational Linguistics, Apr. 2021.

[37] T. O’Gorman, K. Wright-Bettner, and M. Palmer, “Richer event description: Integrating event coreference with temporal, causal and bridging annotation,” in Proceedings of the 2nd Workshop on Computing News Storylines (CNS 2016), pp. 47–56, 2016.

[38] A. van Cranenburgh, “Annotation and prediction of movie sentiment arcs,” 2022. Computational Stylistics Workshop on Emotion and Sentiment Analysis in Literature; Conference date: 16-06-2022 through 17-06-2022.

[39] K. Krishna, S. Garg, J. P. Bigham, and Z. C. Lipton, “Downstream datasets make surprisingly good pretraining corpora,” arXiv preprint arXiv:2209.14389, 2022.

[40] J. Lin, “Divergence measures based on the shannon entropy,” IEEE Transactions on Information Theory, vol. 37, no. 1, pp. 145–151, 1991.

[41] E. A. Pechenick, C. M. Danforth, and P. S. Dodds, “Characterizing the Google Books corpus: Strong limits to inferences of socio-cultural and linguistic evolution,” PLoS ONE, vol. 10, no. 10, p. e0137041, 2015.

[42] R. J. Gallagher, A. J. Reagan, C. M. Danforth, and P. S. Dodds, “Divergent discourse between protests and counter-protests: #BlackLivesMatter and #AllLivesMatter,” PLoS ONE, vol. 13, no. 4, p. e0195644, 2018.

[43] J. Lu, M. Henchion, and B. Mac Namee, “Diverging divergences: Examining variants of Jensen Shannon divergence for corpus comparison tasks,” in Proceedings of the Twelfth Language Re-sources and Evaluation Conference, (Marseille, France), pp. 6740–6744, European Language Resources Association, May 2020.

[44] J. Lafferty, A. McCallum, and F. C. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” 2001.

[45] C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. MIT Press, 1999.

[46] A. Baevski, S. Edunov, Y. Liu, L. Zettlemoyer, and M. Auli, “Cloze-driven pretraining of self-attention networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), (Hong Kong, China), pp. 5360–5369, Association for Computational Linguistics, Nov. 2019.

[47] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, “Biobert: a pre-trained biomedical language representation model for biomedical text mining,” CoRR, vol. abs/1901.08746, 2019.

[48] S. Wang, M. Khabsa, and H. Ma, “To pretrain or not to pretrain: Examining the benefits of pretraining on resource rich tasks,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, (Online), pp. 2209–2213, Association for Computational Linguistics, July 2020.

[49] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems (H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett, eds.), vol. 32, Curran Associates, Inc., 2019.

[50] H. Xu, B. Liu, L. Shu, and P. Yu, “BERT post-training for review reading comprehension and aspect-based sentiment analysis,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), (Minneapolis, Minnesota), pp. 2324–2335, Association for Computational Linguistics, June 2019.

[51] C. Sun, X. Qiu, Y. Xu, and X. Huang, “How to fine-tune bert for text classification?,” in China national conference on Chinese computational linguistics, pp. 194–206, Springer, 2019.