By Hussameddine Al Attar | Staff Writer
Much of what we know of the ancient world comes from surviving texts dating back thousands of years. They have provided us with invaluable information regarding the languages, politics, literatures, and philosophies of ancient civilizations–information which would have been largely unknown to us had the people of the past not documented it. However, not all texts withstood the test of time. While we have historians to thank for reconstructing and interpreting found texts, their job becomes increasingly difficult when the texts are damaged to the point of near illegibility and when modern dating methods, such as radiocarbon dating, fail altogether due to the inorganic nature of the material or its contamination.
To help with this, Google’s DeepMind has unveiled Ithaca, a deep neural network trained to restore damaged Greek inscriptions, identify their places of origin, and estimate when they were written. On its own, Ithaca restores damaged text with 62% accuracy, attributes inscriptions to their region of origin with 71% accuracy, and dates them to within 30 years of their true date ranges. On average, the dates predicted by the neural network are only 5 years off historians’ proposed dates.
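To make "within 30 years of a date range" concrete, here is a minimal sketch of one plausible reading: the error is zero when the predicted year falls inside the range historians accept, and otherwise it is the gap to the nearest end of that range. The function name, the signed-year convention (negative for BCE), and the example range are assumptions for illustration; this is not taken from Ithaca's code or from how DeepMind computes its figures.

```python
def years_from_range(predicted_year, earliest, latest):
    """Distance in years from a predicted year to a date range.

    Years are signed integers (negative = BCE). Returns 0 when the
    prediction falls inside [earliest, latest].
    """
    if predicted_year < earliest:
        return earliest - predicted_year
    if predicted_year > latest:
        return predicted_year - latest
    return 0


# Hypothetical inscription dated by historians to 430-420 BCE.
print(years_from_range(-425, -430, -420))  # prediction inside the range
print(years_from_range(-446, -430, -420))  # prediction earlier than the range
```

Under this reading, a prediction counts as "within 30 years" whenever the returned distance is 30 or less.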
Ithaca, named after the Greek island in Homer’s Odyssey, was trained on a dataset of just under 80,000 Greek inscriptions. The team behind Ithaca processed the Packard Humanities Institute (PHI) dataset, consisting of 178,551 inscriptions and 84 regions, from which they derived a standardised, machine-actionable dataset. This resulting dataset (I.PHI) is, according to the Ithaca team, the “largest multitask dataset of machine-actionable epigraphical text, containing 78,608 inscriptions”.
With such an advancement in artificial intelligence, historical debates may soon no longer be exclusive to historians. In fact, Ithaca has recently taken sides in a historical dispute over the correct date of several Athenian decrees, a dispute that shapes much of our understanding of Athenian politics of the time. The decrees had long been assumed to date back to 446/445 BCE, but new evidence suggests that they may instead have originated later, in the 420s BCE, calling the earlier dating into question. When Ithaca was presented with the texts, it dated them to 421 BCE, aligning itself with the newer theory. Predictions such as these may be used to sway, if not settle, historical debates and disputes fundamental to our study of Ancient Greece.
That said, Ithaca’s performance may vary noticeably depending on the texts presented to it. As is the case with most neural networks, Ithaca’s ability to provide accurate predictions depends almost entirely on its training data, and it will thus share that data’s shortcomings. The inscriptions in the I.PHI dataset are not uniformly distributed across the Greek regions: most of those used to train Ithaca come from Athens, and most of these are legal decrees. This bias in the dataset will likely be reflected in the neural network’s performance, as we can expect Ithaca to be more competent in restoring and analysing Athenian decrees than other types of inscriptions, such as poetic verses or Spartan texts. Not enough of those inscriptions have been uncovered, and so not enough are included in the training data to balance out the Athenian bias.
Ithaca was never made to be a replacement for historians, but rather an additional tool at their disposal. When historians worked with Ithaca, they achieved an astounding 72% accuracy in restoring texts, far surpassing what either achieved alone: 62% for Ithaca and 25% for historians. Moreover, Ithaca does not simply return one prediction, but rather a list of predictions ranked according to probability, presenting historians with different hypotheses they might explore, perhaps ones that nobody had considered before.
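The idea of a ranked list of predictions can be sketched in a few lines. The example below is only an illustration and does not use Ithaca's code: the function name is hypothetical, the raw scores are made up, and the softmax step is a standard way of turning a model's scores into probabilities rather than a confirmed detail of Ithaca.

```python
import math


def rank_hypotheses(scores, top_k=3):
    """Convert raw scores to probabilities (softmax) and return the
    top_k hypotheses, most probable first."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - peak) for name, s in scores.items()}
    total = sum(exps.values())
    probs = {name: e / total for name, e in exps.items()}
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Made-up scores for where a hypothetical inscription might come from.
region_scores = {"Attica": 2.1, "Delos": 1.4, "Delphi": 0.3, "Caria": -0.5}

ranked = rank_hypotheses(region_scores)
for region, probability in ranked:
    print(f"{region}: {probability:.2f}")
```

A historian reading such a list sees not only the top guess but also how close the runners-up are, which is what makes the alternatives worth exploring.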
Whether the Ithaca project will ever be expanded beyond the borders of Greece remains undetermined, but with the incredible rate at which deep learning is advancing, one can always expect fascinating news.
Google has made a free, interactive version of Ithaca available: https://ithaca.deepmind.com/
Google has made the source code available: github.com/deepmind/ithaca
Read the blog post: https://www.deepmind.com/blog/predicting-the-past-with-ithaca
Read the publication: https://www.nature.com/articles/s41586-022-04448-z