Prof Vukosi Marivate

University of Pretoria

Prof Vukosi Marivate is a Professor of Computer Science and holds the ABSA UP Chair of Data Science at the University of Pretoria. He specialises in developing Machine Learning (ML) and Artificial Intelligence (AI) methods to extract insights from data, with a particular focus on the intersection of ML/AI and Natural Language Processing (NLP). His research is dedicated to improving the methods, tools and availability of data for local or low-resource languages. As the leader of the Data Science for Social Impact research group in the Computer Science department, Vukosi is interested in using data science to solve social challenges. He has worked on projects related to science, energy, public safety, and utilities, among others. Prof Marivate is a co-founder of Lelapa AI, an African startup focused on AI for Africans by Africans. Vukosi is a co-founder of the Masakhane Research Foundation, which aims to develop NLP technologies for African languages. Vukosi is also a co-founder of the Deep Learning Indaba, the leading grassroots Machine Learning and Artificial Intelligence conference on the African continent that aims to empower and support African researchers and practitioners in the field.

All Sessions by Prof Vukosi Marivate

Navigating AI in Open Science, November 12, 2025
16:15 - 16:35

Enabling Languages through Equitable Data and Scholarship

Since 2015, my work has centered on advancing Natural Language Processing (NLP) for African and other low-resource languages, where a critical barrier is not only the scarcity of datasets but also their accessibility and ownership. Too often, linguistic resources are locked away under restrictive terms, limiting their use for research, innovation, and community benefit. In this talk, I reflect on our efforts to open up the space of African language technologies by creating, curating, and sharing datasets under equitable licenses that prioritize community rights, scholarly reuse, and long-term sustainability. I will highlight how open data practices and responsible licensing frameworks enable new forms of collaboration across academia, grassroots communities, and industry—allowing us to grow resources in ways that respect linguistic heritage while fueling technological innovation. By embedding equity and openness into our data strategies, we move beyond static language repositories toward living, shared knowledge infrastructures. The future of NLP for African languages will be defined not only by model architectures but also by the choices we make around data governance, licensing, and scholarship. Through collective stewardship, we can ensure that language technologies remain inclusive, context-aware, and truly in service of the diverse communities they aim to represent.

17:30 - 18:00

Panel