Authors: Wissam Antoun, Fady Baly, Hazem Hajj
AraBERT is an Arabic pretrained language model based on Google's BERT architecture, using the same BERT-Base configuration. More details are available in the AraBERT paper and in the AraBERT Meetup.
There are two versions of the model, AraBERTv0.1 and AraBERTv1; the difference is that AraBERTv1 uses pre-segmented text, where prefixes and suffixes were split off using the Farasa Segmenter.
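To make the pre-segmentation concrete, here is a toy sketch of the output format that Farasa-style segmentation produces, where a split-off prefix is marked with a trailing `+`. This is only an illustration with a hand-picked prefix list, not the Farasa Segmenter itself:

```python
# Toy illustration of Farasa-style segmentation (NOT the real Farasa
# Segmenter): split a few common Arabic prefixes off a word and mark
# the boundary with '+', the format AraBERTv1's vocabulary expects.
COMMON_PREFIXES = ["وال", "بال", "ال", "و", "ف", "ب", "ل"]

def toy_segment(word: str) -> str:
    """Split one known prefix off `word`, e.g. 'الكتاب' -> 'ال+ كتاب'."""
    for prefix in COMMON_PREFIXES:
        stem = word[len(prefix):]
        if word.startswith(prefix) and len(stem) > 1:
            return f"{prefix}+ {stem}"
    return word  # no known prefix: leave the word unchanged

print(toy_segment("الكتاب"))  # 'the book' -> 'ال+ كتاب'
print(toy_segment("وقال"))    # 'and he said' -> 'و+ قال'
```

The real Farasa Segmenter uses a statistical model rather than a fixed prefix list, and also splits suffixes; this sketch only conveys the input format AraBERTv1 was trained on.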
The model was trained on ~70M sentences or ~23GB of Arabic text with ~3B words.
Source Code Repository: https://github.com/aub-mind/arabert
Paper: https://www.aclweb.org/anthology/2020.osact-1.2.pdf
Results (Accuracy)
We evaluated both AraBERT models on different downstream tasks and compared them to mBERT and other state-of-the-art models (to the extent of our knowledge). The tasks were Sentiment Analysis on 6 different datasets (HARD, ASTD-Balanced, ArsenTD-Lev, AJGT, LABR, ArSAS), Named Entity Recognition on ANERcorp, and Arabic Question Answering on Arabic-SQuAD and ARCD.
Task | prev. SOTA | mBERT | AraBERTv0.1 | AraBERTv1
---|---|---|---|---
HARD | 95.7 (ElJundi et al.) | 95.7 | 96.2 | 96.1
ASTD | 86.5 (ElJundi et al.) | 80.1 | 92.2 | 92.6
ArsenTD-Lev | 52.4 (ElJundi et al.) | 51.0 | 58.9 | 59.4
AJGT | 93.0 (Dahou et al.) | 83.6 | 93.1 | 93.8
LABR | 87.5 (Dahou et al.) | 83.0 | 85.9 | 86.7
ANERcorp | 81.7 (BiLSTM-CRF) | 78.4 | 84.2 | 81.9
ARCD | (mBERT) | EM: 34.2, F1: 61.3 | EM: 51.14, F1: 82.13 | EM: 54.84, F1: 82.15
Model Weights and Vocab Download
Models | AraBERTv0.1 | AraBERTv1
---|---|---
TensorFlow | Drive Link | Drive Link
PyTorch | Drive Link | Drive Link
You can find the PyTorch models in Hugging Face's Transformers library under the aubmindlab username.
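A minimal sketch of loading the model through the Transformers library follows. The model identifier `aubmindlab/bert-base-arabertv01` is an assumption based on the aubmindlab username mentioned above; check the Hub page for the exact names:

```python
# Minimal sketch of loading AraBERT via Hugging Face Transformers.
# The identifier "aubmindlab/bert-base-arabertv01" is an assumption
# based on the aubmindlab username; verify the exact name on the Hub.
from transformers import AutoModel, AutoTokenizer

model_name = "aubmindlab/bert-base-arabertv01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Tokenize an Arabic sentence and run it through the encoder.
inputs = tokenizer("مرحبا بالعالم", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```

Note that AraBERTv1 expects its input to be pre-segmented with the Farasa Segmenter before tokenization, while AraBERTv0.1 accepts raw text.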
If you use this model, please cite us as:
@inproceedings{antoun2020arabert,
title={AraBERT: Transformer-based Model for Arabic Language Understanding},
author={Antoun, Wissam and Baly, Fady and Hajj, Hazem},
booktitle={LREC 2020 Workshop Language Resources and Evaluation Conference 11--16 May 2020},
pages={9}
}
Acknowledgments
Thanks to TensorFlow Research Cloud (TFRC) for the free access to Cloud TPUs; we couldn't have done it without this program. Thanks also to the AUB MIND Lab members for their continuous support, and to Yakshof and Assafir for data and storage access. Another thanks to Habib Rahal (https://www.behance.net/rahalhabib) for putting a face to AraBERT.
Contacts
Wissam Antoun: Linkedin | Twitter | Github | wfa07@mail.aub.edu | wissam.antoun@gmail.com
Fady Baly: Linkedin | Twitter | Github | fgb06@mail.aub.edu | baly.fady@gmail.com
We are looking for sponsors to train BERT-Large and other Transformer models; the sponsor only needs to cover the data storage and compute costs of generating the pretraining data.