Profile of the Month: Per Fallgren

Per Fallgren is a doctoral student at KTH Royal Institute of Technology’s Division of Speech, Music and Hearing. Among other things, he is developing methods for making use of the audio data stored in the archives of the Institute for Language and Folklore (ISOF).

How is your work linked to the National Language Bank of Sweden?
“I work on method development within the Language Bank for Speech. I am currently involved in the project Tilltal, which deals with developing automated methods for accessing the audio recordings stored in the Institute for Language and Folklore’s (ISOF’s) archives. ISOF has some 25,000 hours of recorded speech that is being digitised in parallel with our development of the tool. Among other things, I have developed a prototype tool we call Edyson.”

Tell us more about the tool!
“Edyson helps users explore audio data efficiently, for example when you have an audio file that you don’t know much about and wonder what kind of sound or language it contains. A speech recognition tool would be unable to tell us that. Edyson, on the other hand, cuts audio files into short segments and groups those segments into clusters based on how they sound. In this way we can, for example, distinguish speech from music and applause from laughter, find pauses or sort different speakers,” explains Per, who continues:

“This makes Edyson a cool way to gain insight into what kind of material you are dealing with. Rather than you, the researcher, listening through a 10-hour recording, you can use Edyson as a first step to see if anything stands out. For example, we tested an audio file from ISOF’s archive that we believed contained a conversation between a man and a woman, but Edyson revealed something else: part way into the conversation, the man picked up a fiddle and began to play.”
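
The segment-and-cluster idea Per describes can be illustrated with a small sketch. This is not Edyson's actual implementation: the synthetic two-tone "recording", the two features (RMS energy and zero-crossing rate) and the tiny k-means routine are all assumptions chosen to keep the example short and self-contained.

```python
import math

RATE = 8000  # samples per second (assumed for this toy example)

def tone(freq, amplitude, seconds):
    """Synthesise a sine tone as a list of samples (stand-in for real audio)."""
    return [amplitude * math.sin(2 * math.pi * freq * n / RATE)
            for n in range(int(RATE * seconds))]

def split_into_segments(samples, segment_len):
    """Cut the samples into consecutive fixed-length segments."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, segment_len)]

def features(segment):
    """Describe a segment by its RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(segment) - 1))

def kmeans(points, k, iterations=20):
    """Minimal k-means; starts from evenly spaced points so it is deterministic."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# One second of a quiet low tone followed by one second of a loud high tone.
samples = tone(220, 0.2, 1.0) + tone(1300, 0.8, 1.0)
segments = split_into_segments(samples, segment_len=800)  # 0.1 s per segment
points = [features(s) for s in segments]
labels = kmeans(points, k=2)
print(labels)
```

In this toy case the first ten segments land in one cluster and the last ten in the other, which mirrors on a very small scale how a change in the material, such as speech giving way to fiddle music, could show up as a separate group of segments.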

Who are Edyson’s intended users?
“First and foremost, we are focusing on researchers within the Tilltal project who are interested in speech and sound material, such as dialectal sounds. The idea is to give the tool to researchers without a technical background, so that they can study and do exciting things with the audio material from ISOF’s archives. Among others, we have a dialect researcher who will be using the tool to identify and study specific vowel sounds.”

What is the objective?
“The objective is to develop a functioning technical service that saves the researcher time and provides more accurate results than sitting and listening through a 10-hour recording. That said, it is the method behind the tool that is of most interest. Compared with traditional methods, it can save enormous amounts of time, thereby opening up new opportunities for research on older audio material.”

In the Profile of the Month feature, we present people whose work is related to the National Language Bank of Sweden.
