Bridging the digital divide through linguistic diversity


NIGHT OWL

Anna Mae Lamentillo

 

As an MSc Major Programme Management student at the Saïd Business School, University of Oxford, I will focus my research on one of the most pressing challenges of the digital age: bridging the digital divide by enhancing natural language processing (NLP) capabilities for low-resource languages with complex morphologies. For billions of people, access to digital tools in their native languages remains limited or nonexistent. This gap perpetuates social and economic disparities and limits access to essential services, especially in regions where languages are rich in complexity but low in digital representation, such as the Philippines and numerous countries across Africa. Through my work, I aim to explore how NLP tailored to these languages can serve as a bridge to digital inclusion and economic opportunity for marginalized communities.

 

Low-resource languages often have unique characteristics that make standard NLP models, built for high-resource languages like English and Spanish, ineffective. Languages such as Tagalog, Yoruba, and Twi possess complex morphology, grammatical structures, and cultural nuances that typical models fail to capture. This underrepresentation is particularly stark in regions such as Africa and Southeast Asia, where linguistic diversity is vast. Without proper NLP models, speakers of these languages face additional barriers to digital literacy, which exclude them from the education, healthcare, and civic engagement services available through digital channels.

 

A major part of my research will involve studying how artificial intelligence (AI) can be trained to overcome linguistic and data-related challenges unique to low-resource languages. This focus aligns directly with the mission of my startup, NightOwlGPT, a platform specifically designed to support marginalized languages. NightOwlGPT’s approach, which began in the Philippines with languages like Tagalog and Cebuano and is now expanding to countries in Africa, prioritizes language preservation and accessibility. By engaging with underrepresented languages, NightOwlGPT demonstrates how AI can create meaningful connections between people and digital resources in their native languages, thereby facilitating social and economic growth in underserved communities.

 

The stakes are high. Speakers of low-resource languages, including millions across the Philippines, Ghana, Kenya, and Nigeria, often rely on oral traditions, which are at risk of being lost in the absence of digital preservation. My research will build on the work of platforms like NightOwlGPT by investigating techniques for collecting and structuring data that accurately represent these languages. For example, many African languages use tonal distinctions, in which a change in pitch changes a word's meaning, while Filipino languages may use affixes that add layers of meaning to root words. Training AI to understand these intricacies will not only provide more accurate NLP tools but also contribute to cultural preservation.

 

Addressing this issue is about more than technology – it’s about creating an inclusive digital world that serves all languages. By training NLP to understand low-resource languages, we not only bridge a technological gap but also foster digital equity, allowing diverse communities to engage fully in today’s digital landscape. At Saïd Business School, my research will strive to highlight and close this gap, advocating for a digital environment where linguistic diversity is both preserved and celebrated. AI has the potential to be transformative for all communities, and with enhanced NLP tailored to low-resource languages, we can ensure that everyone has a voice in the digital age.