We are pleased to announce that three of our papers have been accepted to the Sixth Arabic Natural Language Processing Workshop (WANLP 2021), co-located with EACL 2021. The papers were authored by our talented team members Tarek Naous, Wissam Antoun, Reem Mahmoud, and Fady Baly, under the supervision of Prof. Hazem Hajj. They target Arabic empathetic conversational agents, generative language models, and language understanding models.
Empathetic BERT2BERT Conversational Model:
Learning Arabic Language Generation with Little Data
Our latest contribution to Arabic Conversational AI leverages knowledge transfer from AraBERT in a BERT2BERT architecture. We address the low-resource challenges and achieve state-of-the-art results in open-domain empathetic response generation.
Paper: https://arxiv.org/abs/2103.04353
Abstract: Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase of 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.
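For readers unfamiliar with the perplexity figure quoted in the abstract, here is a minimal sketch of how perplexity is commonly computed from per-token log-probabilities: the exponential of the average negative log-likelihood. The function name `perplexity` and the example values are illustrative assumptions, not the authors' evaluation code.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    `token_log_probs` holds natural-log probabilities that a model assigned
    to each reference token. This is a hypothetical helper for illustration.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up example: a model that gives every reference token probability 1/17
# has a perplexity of about 17, i.e. it is as uncertain as a uniform choice
# among 17 tokens at each step.
example_log_probs = [math.log(1 / 17.0)] * 4
print(round(perplexity(example_log_probs), 2))
```

Lower values mean the model assigns higher probability to the reference responses.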
AraGPT2:
Pre-Trained Transformer for Arabic Language Generation
AraGPT2 is a 1.5B-parameter transformer model, the largest for Arabic, trained on 77GB of text for 9 days on a TPUv3-128. The model can generate news articles that are difficult to distinguish from human-written articles. AraGPT2 also shows impressive zero-shot performance on trivia QA.
Paper: https://arxiv.org/abs/2012.15520
GitHub: https://github.com/aub-mind/arabert/tree/master/aragpt2
Abstract: Recently, pre-trained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. The Mega model was evaluated and showed success on different tasks including synthetic news generation, and zero-shot question answering. For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles. A study conducted with human evaluators showed the significant success of AraGPT2-mega in generating news articles that are difficult to distinguish from articles written by humans. We thus develop and release an automatic discriminator model with a 98% accuracy in detecting model-generated text. The models are also publicly available, hoping to encourage new research directions and applications for Arabic NLP.
AraELECTRA:
Pre-Training Text Discriminators for Arabic Language Understanding
AraELECTRA is our latest advancement in Arabic Language Understanding. The model was trained on 77GB of Arabic text for 24 days. AraELECTRA achieves impressive performance, especially on Question Answering tasks.
Paper: https://arxiv.org/abs/2012.15516
GitHub: https://github.com/aub-mind/arabert/tree/master/araelectra
Abstract: Advances in English language representation enabled a more sample-efficient pre-training task, Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. On the other hand, current Arabic language representation approaches rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition, and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.
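The replaced token detection objective mentioned in the abstract can be illustrated with a small sketch of how the discriminator's training labels are built: each position is labeled 1 if the generator changed the token and 0 if the token matches the original. The helper `rtd_labels` and the English example sentence are assumptions made for illustration. The generator network and the AraELECTRA training code are not shown.

```python
def rtd_labels(original_tokens, corrupted_tokens):
    """Build per-token targets for replaced token detection.

    Returns 1 where the corrupted sequence differs from the original and 0
    elsewhere. A position where the generator happens to reproduce the
    original token counts as "original" (label 0). Hypothetical helper.
    """
    if len(original_tokens) != len(corrupted_tokens):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original_tokens, corrupted_tokens)]


# Made-up example: the generator replaced "cooked" with "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]
print(rtd_labels(original, corrupted))  # [0, 0, 1, 0, 0]
```

Because every input position receives a label, the discriminator gets a learning signal from all tokens rather than only from a masked subset, which is the source of the sample efficiency the abstract refers to.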
Acknowledgments:
This research was supported by the University Research Board (URB) at the American University of Beirut (AUB), and by the TFRC program, which we thank for the free access to cloud TPUs. We also thank As-Safir newspaper for the data access.