Kaosarat Aina completed her master’s degree at the University of Ibadan this year with a distinction that placed her research at doctoral standard. A bold future-forward scholar, she speaks to Guardian about her journey from Ibadan’s linguistics classrooms, the prizes that marked her years’ work, and why she believes the future of African language technology depends on a generation of scholars willing to do the hardest, least glamorous work in the field.
You completed your master’s degree at the University of Ibadan this year. Take us back to the beginning: what drew you to linguistics?
I think it was curiosity, honestly. I grew up speaking Yoruba at home and English at school, the way most of us do in southwestern Nigeria, and I was always aware of the gap between those two worlds. Not just the different words, but the different ways of thinking, the different things you could say and not say, the different ways that meaning moved. When I got to secondary school and started studying English literature, I kept noticing that the analytical tools we were given to understand language were almost entirely designed around English. The rest of our linguistic world did not seem to exist in any of the textbooks. That bothered me.
When I came to the University of Ibadan to study Linguistics and African Languages, I found a department that took that question seriously. The University of Ibadan is where modern Nigerian linguistics was built; it is where Professor Ayo Bamgbose did the foundational work on Yoruba phonology and grammar that the entire field still depends on. Walking into that environment as an undergraduate, you feel the weight of the work that has been done and the weight of the work that still needs to be done. I knew from very early on that I wanted to be part of it.
You graduated with First Class Honours in the Class of 2016. That was not a common distinction, particularly in a department as competitive as linguistics at the University of Ibadan.
It was a difficult programme. The University of Ibadan does not make it easy, and I think rightly so. I pushed myself hard throughout the four years. My long essay examined the tonosyntax of the Ubang language: how tone interacts with syntactic structure in a low-resource African language in Cross River State. That was my first real encounter with the depth of the problem that would eventually become the centre of my research. Tone in African languages is not decorative. It is grammar. It is meaning. And the fact that most computational systems built to process African languages have been built without adequate engagement with that reality is a problem with consequences that extend far beyond academia.
The department honoured my graduating class in ways that I still think about. I received the Emeritus Professor Ayo Bamgbose Prize for the Best Graduating Student, a prize named for the man who quite literally invented the academic study of Yoruba phonology in Nigeria. That was humbling in a way that is difficult to describe. You receive a prize bearing the name of a foundational figure in your field and you think: this is both an honour and an obligation.
Looking back, I think those recognitions shaped my sense of responsibility to the field more than they shaped my confidence. When people have invested that level of institutional faith in you, you take it seriously.
You also had student leadership responsibilities during your undergraduate years. How did that shape you?
I was involved in student associations at the departmental and faculty levels throughout my time at Ibadan. That experience taught me things that academic training alone cannot teach: how to organise people around a shared purpose, how to communicate ideas across different levels of expertise, how to advocate for something you believe in when the structures around you are not immediately supportive.
I learned that scholarship does not exist in a vacuum. The best research is research that is connected to community, that comes out of genuine engagement with the people whose lives the research addresses. My involvement in student leadership gave me that grounding early, and I think it is part of why my research has always been oriented toward practical impact rather than purely theoretical contribution.
Your master’s thesis examined the psycho-phonology of H-factor variation in Yoruba-English bilingual speech. What led you to that topic, and what does H-factor have to do with the AI questions you are now working on?
H-factor is the systematic insertion and deletion of the phoneme /h/ in the speech of Yoruba-English bilinguals, a contact phenomenon that emerges when Yoruba phonological patterns shape the production of English. “Have” becomes “ave.” “House” becomes “ouse.” “I” becomes “high.” “Ate” becomes “hate.”
Linguistically, it is a deeply systematic pattern, and yet it is precisely the kind of pattern that speech recognition systems trained primarily on standard British or American English have never been taught to expect. When those systems encounter a Yoruba speaker producing English with /h/ inserted or deleted, they misrecognise the speech at rates that are significantly higher than for standard English speakers. The consequences are real: in automated transcription, in voice-enabled services, in any context where the system’s output matters.
But more than the academic recognition that this work has received, what mattered to me was that the work was pointing toward something genuinely useful: a more accurate account of how Nigerian speakers actually sound when they speak English, and by implication, a more accurate account of what speech recognition systems need to be trained on to serve those speakers fairly.
When you look at the current state of African language technology in Nigeria and across the continent, what do you see?
I see genuine energy. Masakhane has done something remarkable in the two years since it was founded: it has built a community of researchers across African countries, and I am excited to be one of those leading the charge. The Masakhane project has produced named entity recognition datasets for African languages including Yoruba, Hausa, and Igbo, the kind of resource that did not exist three years ago and now does. Google added Yoruba to voice search last October. These are real developments.
But I also see a structural problem that is deeper than any individual project addresses, and that is the problem of foundational linguistic data. The research that has been done is mostly text-based: named entities, sentiment analysis, and machine translation. Those are important and I do not diminish them. But spoken language is where the most consequential technological applications live: voice assistants, speech recognition, clinical documentation, educational pronunciation tools. It requires a different category of data. I am studying this area closely, and I expect it to be where the future of the field lies.
What is your aspiration for the field, looking ten years ahead?
I want to see a world in which devices handle African languages with the same accuracy and reliability that they offer for other major languages. That sounds simple. It is not. It requires annotated speech corpora. It requires computational models trained on those corpora. And it requires researchers from within the language communities, native speakers who understand the phonological architecture of these languages at a level that no amount of external analysis can replicate, to be the ones leading that work.
My aspiration is to contribute to that infrastructure in a way that outlasts me. That is what drives me, and it is what I intend to spend the next decade working toward.