We are pleased to announce that three of our papers have been accepted to the Sixth Arabic Natural Language Processing Workshop (WANLP 2021), co-located with EACL 2021. The papers were authored by our talented team members Tarek Naous, Wissam Antoun, Reem Mahmoud, and Fady Baly under the supervision of Prof. Hazem Hajj. They target Arabic empathetic conversational agents, generative language models, and language understanding models.
Empathetic BERT2BERT Conversational Model:
Learning Arabic Language Generation with Little Data
Our latest contribution to Arabic conversational AI leverages knowledge transfer from AraBERT in a BERT2BERT architecture. We address the low-resource challenges and achieve state-of-the-art results in open-domain empathetic response generation.
Paper: https://arxiv.org/abs/2103.04353
Abstract: Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase of 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.
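The perplexity figure quoted in the abstract can be read as the exponential of the average per-token negative log-likelihood. Below is a minimal sketch of that calculation, assuming the standard definition with natural logarithms. The function name and the sample probabilities are hypothetical, and the paper's own tokenization and evaluation setup are not reproduced here.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over a sequence of tokens.

    token_log_probs: natural-log probabilities the model assigned to each
    reference token (hypothetical input format, chosen for illustration).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical example: a model that gives every reference token a
# probability of 1/17 has a perplexity of about 17.
uniform = [math.log(1 / 17)] * 10
print(perplexity(uniform))
```

Intuitively, a perplexity of 17 means the model is on average about as uncertain as if it were choosing uniformly among 17 candidate tokens at each step, so lower values indicate a better fit to the reference responses.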
AraGPT2:
Pre-Trained Transformer for Arabic Language Generation
AraGPT2 is a 1.5B-parameter transformer model, the largest for Arabic, trained on 77GB of text for 9 days on a TPUv3-128. The model can generate news articles that are difficult to distinguish from human-written ones. AraGPT2 also shows impressive zero-shot performance on trivia QA.
Paper: https://arxiv.org/abs/2012.15520
GitHub: https://github.com/aub-mind/arabert/tree/master/aragpt2
Abstract: Recently, pre-trained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. The Mega model was evaluated and showed success on different tasks including synthetic news generation and zero-shot question answering. For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles. A study conducted with human evaluators showed the significant success of AraGPT2-mega in generating news articles that are difficult to distinguish from articles written by humans. We thus develop and release an automatic discriminator model with 98% accuracy in detecting model-generated text. The models are also publicly available, hoping to encourage new research directions and applications for Arabic NLP.
AraELECTRA:
Pre-Training Text Discriminators for Arabic Language Understanding
AraELECTRA is our latest advancement in Arabic language understanding. The model was trained on 77GB of Arabic text for 24 days. AraELECTRA achieves impressive performance, especially on question answering tasks.
Paper: https://arxiv.org/abs/2012.15516
GitHub: https://github.com/aub-mind/arabert/tree/master/araelectra
Abstract: Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. On the other hand, current Arabic language representation approaches rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition, and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.
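To make the replaced token detection objective more concrete, here is a toy sketch of how the discriminator's training targets could be built. It is an illustration under stated assumptions, not the AraELECTRA training code: a random draw from a small vocabulary stands in for the generator network, and the names `corrupt_tokens`, `vocab`, and `replace_prob` are hypothetical.

```python
import random


def corrupt_tokens(tokens, vocab, replace_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels) for a toy replaced-token-detection task.

    Each position is selected with probability replace_prob and re-sampled
    from vocab (a stand-in for the generator network). A label is 1 only if
    the token at that position actually differs from the original, so a
    sample that happens to match the original still counts as "original".
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        new_tok = rng.choice(vocab) if rng.random() < replace_prob else tok
        corrupted.append(new_tok)
        labels.append(int(new_tok != tok))
    return corrupted, labels


# Hypothetical placeholder tokens; real inputs would be Arabic subword tokens.
sentence = ["t1", "t2", "t3", "t4", "t5", "t6"]
vocab = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"]
corrupted, labels = corrupt_tokens(sentence, vocab, replace_prob=0.5, seed=1)
print(corrupted)
print(labels)
```

The discriminator receives a label for every position rather than only for the masked ones, which is the usual explanation for why this objective is described as more sample-efficient than masked language modeling.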
Acknowledgments:
This research was supported by the University Research Board (URB) at the American University of Beirut (AUB), and by the TFRC program, which we thank for the free access to cloud TPUs. We also thank As-Safir newspaper for the data access.