Saving endangered languages, one font at a time

The digital age could hasten the extinction of many languages. A Stanford-led consortium of academics and tech experts hopes to save them.

Flip open a phone or laptop, and you’re instantly connected to a world of knowledge, entertainment, and opportunities.

But what if you depended on one of the 97 percent of the world’s languages that are digitally disadvantaged?

For millions around the world whose languages and alphabets lack digital texts or fonts, much of modern life is off limits. And in a fast-changing, hyper-connected world, that exclusion threatens to hasten the extinction of many more languages, and with them their deep wealth of culture, art, and wisdom.

Given this perilous global picture, linguistic inclusivity is emerging as a critical element in closing the digital divide. But more needs to be done.

“Information technologies predominantly built in the United States take as their starting point the English language and the Latin alphabet,” said Tom Mullaney, a Stanford University history professor who specializes in East Asian languages and cultures. “To the extent that another writing system does not fit that model, those communities have been placed at a major disadvantage with regard to every modern communication and computing technology.”

Mullaney, who has authored or co-authored seven books, including The Chinese Typewriter, a history of Chinese-language computing, helps lead a new project called SILICON (Stanford Initiative on Language Inclusion and Conservation in Old and New Media). It aims to bring together all the expertise needed to tackle this problem, including by adding more languages to the Unicode standard for digital text and characters.

“The problem requires an interdisciplinary approach,” Mullaney continued. “For example, no hotshot developer, programmer, or human/computer interaction researcher can crack the code alone. They would need in-close consultations with linguists, archaeologists, writing specialists, anthropologists, user-interface designers, AI researchers, and members of the communities themselves. It needs to be the perfect merger.”

Leveling the playing field for digitally disadvantaged languages

SILICON is striving to create that perfect merger, as Dr. Kathryn Starkey, a professor of German and medieval studies and a SILICON co-founder, explained.

“Our goal is to help level the playing field for languages beyond English,” she said, “so people can communicate comfortably and more seamlessly in their daily life, in any language of their choice.”

With its vast academic scope and location in the heart of Silicon Valley, Stanford University is uniquely positioned to lead this effort.

“Stanford has a long history of world-class research on and engagement with the world’s languages, literatures, and cultures,” Starkey said, “both past and present. We have people working at the cutting edge of design, as well as deep ties to the Silicon Valley companies that are a crucial part of implementing broader linguistic inclusion across the products we use every day.”

Indeed, more tech industry leaders are noticing. Among them is Denise Lee, vice president for Cisco’s engineering sustainability office and a passionate advocate for language inclusion. She stressed how SILICON’s work aligns with Cisco’s core purpose of powering an inclusive future — including its commitment to closing the digital divide.

“If we think about the work that SILICON’s doing, it bumps right up against what Cisco’s trying to do,” she said. “That is, close the digital divide. And give millions more people in the world an opportunity to learn, grow, and join the global economy.”

However, building language inclusivity will require more support, from the tech community and beyond.

“The technology sector has a unique and important role to play,” Starkey said. “But there’s also roles for language communities to play in helping translate common user-interface text, and roles for keyboard and font designers to play in making it easier to input text, and making that text visually appealing. Our goal is to help bridge these various communities and offer the support they need to realize this vision.”

When languages are saved, we all benefit

Of course, closing any aspect of the digital divide benefits everyone. The cultures behind digitally disadvantaged languages have much to offer the world, and much of it could be lost if those languages go extinct.

“When you lose your language, your culture is beyond fractured,” Lee warned. “Historically, eradicating languages has been a way for colonizers and oppressors to keep people down.”

Mullaney fears the dangers will only increase as technology continues to progress at an ever-accelerating pace.

“The stakes are already high in the digital age for languages that are digitally disadvantaged,” he said. “But those stakes just got much higher because we’re witnessing a complete sea change in human/computer interaction itself. As we move towards conversational, chatbot-style interaction, then forget keyboards, forget conventional input devices. The divide will become even wider, because then those 100 languages for which there is robust data will pull away even further from the pack. So, there is a growing sense of urgency around this new era of computing.”

Starkey shared her own thoughts on the impact of artificial intelligence, including its potential to support linguistic inclusivity.

“For any kind of AI model to work, you need digital text, and quite a lot of it,” she explained. “That’s not possible for languages whose scripts aren’t a part of Unicode. And there are many languages that can be written digitally, but most of their text is locked up in printed books, without any way to reliably transform that text into something digitally usable. We need algorithms that can take scans of these books and turn them into digital text to open up the advantages AI can offer to more linguistic communities.”
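For readers who want to see what being “a part of Unicode” means in practice, here is a minimal sketch in Python. It is an illustration only, not a SILICON tool: the function names are hypothetical, and the result depends on the version of the Unicode database bundled with the Python build running it. The sketch uses the standard library’s unicodedata module to look up each character’s code point and name, and to check whether every character in a text has been assigned by the standard.

```python
import unicodedata


def describe_code_points(text):
    """Return (character, code point, Unicode name) for each character.

    The name is None when this Python build's Unicode database has no
    name for the code point (for example, because it is unassigned).
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, None))
        for ch in text
    ]


def is_fully_assigned(text):
    """True if no character falls in the 'Cn' (unassigned) category."""
    return all(unicodedata.category(ch) != "Cn" for ch in text)


if __name__ == "__main__":
    # Two Cherokee letters: a script that is already encoded in Unicode.
    sample = "\u13A0\u13B3"
    for row in describe_code_points(sample):
        print(row)
    print(is_fully_assigned(sample))

    # U+0378 is a reserved, currently unassigned code point.
    print(is_fully_assigned("\u0378"))
```

A script that has not yet been added to the standard has no code points at all, so its text cannot even be passed to a check like this one. That is the gap Starkey describes.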

All agreed there is still time. But fast, collective action, like that being driven by SILICON, is essential if languages are to be saved.

“There is so much to be done,” Starkey argued, “and linguistic inclusion work is vastly under-resourced. We’re starting with a set of pilot projects to get a better sense of what is involved in supporting keyboard or font design — or improving the support for a new language in Unicode’s localization database. These pilot projects will help shape our next steps.”

Those next steps will be critical if global culture is to retain the riches contained in so many threatened languages.

“These languages have deep historical, cultural, social relevance and value,” Mullaney concluded. “The wisdom and experience that would be lost if humanity doesn’t get this right is incalculable. But the flip side of that is that the richness of human experience that can be brought together, made visible, and rolled into the human collective is also incalculable. But only if we get it right.”