All over the world, indigenous and non-indigenous languages face extinction. Oneida, an Iroquoian language with a number of speakers in New York; Yiddish, a Jewish language; Sicilian; and Belarusian are all in danger.
The province of British Columbia in Canada is unique in that it is home to 60% of the country’s indigenous languages. Unfortunately, only about 4% of the indigenous people in British Columbia are fluent in their native languages, and most of those who are fluent are elderly. All of the languages face threats to their vitality.
To prevent the extinction of these languages, the First Peoples’ Cultural Council (FPCC), together with the communities it advocates for, built a language revitalization platform called FirstVoices.
FirstVoices was developed to allow language revitalization champions to document their indigenous languages online and allow community members to access the information. It’s built on top of Nuxeo’s open source cloud-native content services platform and houses 38 dialects of 28 language groups on this single integrated platform.
All of the data is collected by community language champions, elders, and linguists, who are paid through grants raised by FPCC. The data is then housed in Canada on the Azure cloud.
“We’re spending our money in the communities, instead of giving thousands of dollars to developers,” says Tracey Herbert, CEO of the First Peoples’ Cultural Council. “I really think collaboration is key to language revitalization, sharing information, ideas.”
The web-based FirstVoices platform, which is responsive on mobile devices, stores around 400,000 JSON objects. Those objects include words, phrases, songs, stories, audio recordings, pictures, and videos.
“The audio recordings are all elders, people speaking their language the way it should be spoken,” says Daniel Yona, development manager for FPCC.
The suite of language revitalization services FPCC and First Nations people of B.C. developed extends beyond the language archive. There are dictionary and tutor apps (available offline) for iPhone and Android as well as a keyboard for First Nations languages.
As with many native languages around the world, some First Nations languages in B.C. don't use the Latin alphabet, and their letter ordering differs from that of the Latin alphabet. “We had to create a custom representation of their words in order to be able to index them in a way that would be intuitive for those searching for them,” says Lisa Marcus, VP of Government, Nuxeo. “All of the language-sensitive aspects of the Nuxeo Platform are customizable -- the labels in the user interface, the data model (including vocabularies), the search index tokenizer, email templates, etc.”
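To illustrate the kind of custom ordering Marcus describes, here is a minimal Python sketch. It is not the FirstVoices or Nuxeo implementation: the alphabet, its order, and the sample words are invented for illustration, and the helper names `tokenize` and `sort_key` are hypothetical. The idea is to split each word into the letters of the language's own alphabet (some of which may be written with more than one character) and then sort by each letter's position in that alphabet.

```python
from typing import List, Tuple

# Hypothetical alphabet in a made-up order (an assumption, not a real language).
# Note that "tl" and "kw" are single letters here, and "tl" sorts before "t".
ALPHABET: List[str] = ["a", "tl", "t", "k", "kw", "e", "l", "w"]


def tokenize(word: str, alphabet: List[str]) -> List[str]:
    """Split a word into alphabet letters, preferring the longest match."""
    letters_by_length = sorted(alphabet, key=len, reverse=True)
    tokens: List[str] = []
    i = 0
    while i < len(word):
        for letter in letters_by_length:
            if word.startswith(letter, i):
                tokens.append(letter)
                i += len(letter)
                break
        else:
            raise ValueError(f"no letter of the alphabet matches {word[i:]!r}")
    return tokens


def sort_key(word: str, alphabet: List[str] = ALPHABET) -> Tuple[int, ...]:
    """Map a word to the positions of its letters in the custom alphabet."""
    rank = {letter: position for position, letter in enumerate(alphabet)}
    return tuple(rank[letter] for letter in tokenize(word, alphabet))


words = ["ta", "tla", "ka", "kwa", "ala"]
sorted_words = sorted(words, key=sort_key)
print(sorted_words)
```

A plain alphabetical sort would place "ka" before "ta" and "ta" before "tla". With the custom key, the words come out in the order a speaker who knows this (invented) alphabet would expect, which is the property a dictionary index needs.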
Yona says that part of what drew FPCC to Nuxeo was its mature security model, which allows security to be implemented at the property or document level.
“Because of the cultures they serve they have many words that are sacred words -- I equate those to things like PII [personally identifiable information] for a business user. Those words are very cherished and need to be secure and only accessed by those who have the ability and right to see them,” says Marcus.
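The difference between document-level and property-level security can be sketched in a few lines of Python. This is an illustration only and does not use Nuxeo's API: the `Entry` class, its fields, the group names, and the `visible_view` function are all hypothetical. A document-level rule hides a whole entry from anyone outside the permitted groups, while a property-level rule hides a single field of an otherwise visible entry.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class Entry:
    """A hypothetical dictionary entry with access rules attached."""
    word: str
    translation: str
    audio: str
    # Document-level rule: groups allowed to read the entry (empty = public).
    read_groups: FrozenSet[str] = frozenset()
    # Property-level rules: field name -> groups allowed to read that field.
    restricted_fields: Dict[str, FrozenSet[str]] = field(default_factory=dict)


def visible_view(entry: Entry, user_groups: FrozenSet[str]) -> Optional[Dict[str, str]]:
    """Return the fields this user may see, or None if the entry is hidden."""
    # Document level: a restricted entry is invisible to everyone else.
    if entry.read_groups and not (entry.read_groups & user_groups):
        return None
    view: Dict[str, str] = {}
    for name in ("word", "translation", "audio"):
        allowed = entry.restricted_fields.get(name)
        # Property level: drop only the fields the user may not read.
        if allowed is None or (allowed & user_groups):
            view[name] = getattr(entry, name)
    return view


public = Entry("word-1", "water", "word-1.mp3")
sacred = Entry("word-2", "(restricted)", "word-2.mp3",
               read_groups=frozenset({"community-members"}))
partial = Entry("word-3", "river", "word-3.mp3",
                restricted_fields={"audio": frozenset({"community-members"})})

visitor = frozenset({"public"})
member = frozenset({"community-members"})
```

Under these assumptions, a visitor sees the public entry in full, sees nothing of the sacred entry, and sees the third entry without its audio recording, while a community member sees all three in full.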
Something that makes this language research and archival project unique is that the community members own the archived data.
“Historically, linguists have come into communities and documented the languages, but also taken the knowledge from the indigenous peoples and copyrighted that knowledge in dictionaries,” says Herbert.
Because the FPCC and the First Nations community members built FirstVoices and do all of the data collection, they also own all of the data.
“[FirstVoices] is about acknowledging indigenous peoples as experts in their languages and also acknowledging that we also have the capacity to use and develop technologies as Indigenous Peoples and then to curate our own data and have that data sovereignty,” says Herbert.
This need for data sovereignty was a major reason FPCC chose Nuxeo, an open source solution. The open source ethos of transparency closely aligned with the values of the FPCC and First Nations people.
Yona says he also chose an open source solution as a way to ensure that the project will continue beyond the founders and the technologists that are currently working on FirstVoices.
“It’s important to [be able to] say that if I’m not in this position in a few years, someone can come in and do the work with [the FirstVoices platform]. It’s not a closed source that one person or one company can build,” says Yona.
While FPCC’s main focus is ensuring that all of the province’s indigenous languages are captured, archived, and curated by indigenous experts, Herbert says the organization supports indigenous peoples elsewhere as well: it has connections in Sweden, Mexico, and Cuba, and has been invited to Guatemala.
“So, the word’s sort of getting out there about the tool, and we’re prepping ourselves for making it more accessible outside of B.C., but it’ll probably be a year before something like that is launched,” says Herbert.