We are pleased to announce that three of our papers have been accepted to the Sixth Arabic Natural Language Processing Workshop (WANLP 2021), co-located with EACL 2021. The papers were authored by our talented team members Tarek Naous, Wissam Antoun, Reem Mahmoud, and Fady Baly, under the supervision of Prof. Hazem Hajj. They target Arabic empathetic conversational agents, generative language models, and language understanding models.
Empathetic BERT2BERT Conversational Model:
Learning Arabic Language Generation with Little Data
Our latest contribution to Arabic Conversational AI leverages knowledge transfer from AraBERT in a BERT2BERT architecture. We address the low-resource challenges and achieve state-of-the-art results in open-domain empathetic response generation.
Paper: https://arxiv.org/abs/2103.04353
Abstract: Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase of 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.
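For readers less familiar with the perplexity metric quoted in the abstract: perplexity is the exponential of the average negative log-probability that a model assigns to each reference token, so lower is better. The snippet below is a minimal sketch of that definition only. The function name `perplexity` and the probabilities are made up for illustration and are not taken from the paper's evaluation code.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

# Hypothetical probabilities assigned to ten reference tokens.
uniform_probs = [1 / 17] * 10
print(round(perplexity(uniform_probs), 2))
```

Read this way, a perplexity of 17 means the model is, on average, about as uncertain as if it were choosing uniformly among 17 equally likely tokens at each step.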
AraGPT2:
Pre-Trained Transformer for Arabic Language Generation
AraGPT2 is a 1.5B-parameter transformer model, the largest for Arabic, trained on 77GB of text for 9 days on a TPUv3-128. The model can generate news articles that are difficult to distinguish from human-written articles. AraGPT2 also shows impressive zero-shot performance on trivia QA.
Paper: https://arxiv.org/abs/2012.15520
GitHub: https://github.com/aub-mind/arabert/tree/master/aragpt2
Abstract: Recently, pre-trained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. The Mega model was evaluated and showed success on different tasks including synthetic news generation, and zero-shot question answering. For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles. A study conducted with human evaluators showed the significant success of AraGPT2-mega in generating news articles that are difficult to distinguish from articles written by humans. We thus develop and release an automatic discriminator model with 98% accuracy in detecting model-generated text. The models are also publicly available, hoping to encourage new research directions and applications for Arabic NLP.
AraELECTRA:
Pre-Training Text Discriminators for Arabic Language Understanding
AraELECTRA is our latest advancement in Arabic Language Understanding. The model was trained on 77GB of Arabic text for 24 days. AraELECTRA achieves impressive performance, especially on Question Answering tasks.
Paper: https://arxiv.org/abs/2012.15516
GitHub: https://github.com/aub-mind/arabert/tree/master/araelectra
Abstract: Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. On the other hand, current Arabic language representation approaches rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition, and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.
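To illustrate the replaced token detection objective mentioned in the abstract, here is a toy sketch of how the discriminator's training targets can be derived: each position is labeled 1 if the generator's token differs from the original and 0 otherwise. The helper `rtd_labels` and the example tokens are hypothetical and are not taken from the AraELECTRA codebase.

```python
def rtd_labels(original, corrupted):
    """Return 1 where the token was replaced, 0 where it matches the original."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have equal length")
    return [int(o != c) for o, c in zip(original, corrupted)]

# Hypothetical example: the generator replaced "cooked" with "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]
print(rtd_labels(original, corrupted))  # [0, 0, 1, 0, 0]
```

Because every position receives a label, the discriminator gets a training signal from all input tokens rather than only the masked subset, which is the source of the sample efficiency the abstract refers to.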
Acknowledgments:
This research was supported by the University Research Board (URB) at the American University of Beirut (AUB), and by the TFRC program, which we thank for the free access to cloud TPUs. We also thank As-Safir newspaper for the data access.