Serious games and applications for health and education
Health
We are currently working on applications for health, such as serious games and toolsets that explore HCI techniques and bio-feedback for speech therapy. Our goal is to combine speech and facial expression recognition with gaming in order to improve the effectiveness of speech therapy processes designed by healthcare specialists. This work is being developed in collaboration with researchers from Carnegie Mellon University, INESC-ID, Escola Superior de Saúde de Alcoitão, and VoiceInteraction. More details can be found on the BioVisualSpeech project page.
Education
We are also currently working on serious games and applications for the education of blind students, such as educational computer games, orientation and mobility computer games that use spatialized sound, and a molecular editor with spatialized sound. Some of the results appear in Ferreira and Cavaco (FIE 2014) and Simões and Cavaco (ACE 2014), which won the Bronze poster award.
Sound synthesis
We are developing statistical methods for modeling and synthesizing sounds with both sinusoidal and attack transient components. We use multivariate decomposition techniques to learn the intrinsic structures that characterize the sound samples. These structures are then used to synthesize new sounds drawn from the distribution of the original sound samples. Some of the results appear in Cavaco (SMC 2012).
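As a minimal sketch of this idea (with random stand-in data rather than real sound features, and plain PCA standing in for the decomposition techniques we use):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training data (the real input would be feature vectors, e.g.
# log-magnitude spectra, extracted from recorded sound samples):
# 50 samples, 128 features each, with some shared low-dimensional structure.
basis = rng.normal(size=(4, 128))
X = rng.normal(size=(50, 4)) @ basis + 0.1 * rng.normal(size=(50, 128))

# Multivariate decomposition: PCA via SVD on the centered data.
mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 4
components = Vt[:k]                  # learned intrinsic structures
coeffs = (X - mean) @ components.T   # samples expressed in that basis

# Model the coefficients' distribution and draw new points from it.
new_coeffs = rng.normal(coeffs.mean(axis=0), coeffs.std(axis=0), size=(5, k))

# Synthesize: new feature vectors drawn from the learned distribution.
new_sounds = new_coeffs @ components + mean
print(new_sounds.shape)  # (5, 128)
```

The same recipe works with any linear decomposition in place of PCA; only the learned basis changes.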
We have also been working on the synthesis of spatialized sound (2D or 3D) for applications for the blind.
Modeling harmonic and percussion instruments
Percussion instruments
Most musical instrument classifiers focus on distinguishing different harmonic instruments, such as the violin and the flute, whose sounds have very different characteristics. Much less attention has been given to percussion instruments, especially the discrimination of instruments of the same type, like the cymbals in a drum kit. We have been developing classifiers that can distinguish instruments of this latter type. In particular, we have been working with cymbal sounds, focusing on their modeling, classification, transcription, and synthesis. Some of the results appear in Cavaco and Almeida (IWSSIP 2012).
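The sketch below only illustrates the general shape of such a classifier, with toy synthetic "cymbal" sounds, two made-up features, and a nearest-neighbour rule; none of these are the actual signals, features, or models used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
sr = 8000

def cymbal(decay, brightness, n=4000):
    """Toy cymbal-like sound: filtered noise with an exponential decay.
    (Real cymbal sounds are far richer; this only illustrates the pipeline.)"""
    noise = rng.normal(size=n)
    bright = np.diff(noise, prepend=0.0)   # crude high-pass emphasis
    env = np.exp(-decay * np.arange(n) / sr)
    return ((1 - brightness) * noise + brightness * bright) * env

def features(sig):
    """Two simple discriminative features: spectral centroid and log energy."""
    spec = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), 1 / sr)
    return np.array([(freqs * spec).sum() / spec.sum(),
                     np.log((sig ** 2).sum())])

# Training examples for two hypothetical cymbals (e.g. a ride and a crash).
train = ([(features(cymbal(3.0, 0.2)), "ride") for _ in range(10)]
         + [(features(cymbal(8.0, 0.8)), "crash") for _ in range(10)])

def classify(sig):
    """Nearest-neighbour classification in the two-feature space."""
    f = features(sig)
    return min(train, key=lambda ex: np.linalg.norm(ex[0] - f))[1]

print(classify(cymbal(3.0, 0.2)))  # ride
```

Even this toy setup shows why same-type instruments are hard: the classes differ only in decay and brightness, not in gross spectral shape.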
Harmonic instruments
Apart from the work with percussion instruments, we have also been developing models that describe sounds from harmonic instruments, such as the flute, piano, and guitar. Some of the results appear in Malheiro and Cavaco (INForum 2011).
Intrinsic Structures of Impact Sounds
Models of sounds have proven useful in many fields, such as sound
synthesis, sound recognition and identification of events or
properties (like material or length) of the objects involved. However,
developing such models is hard due to all the complexities of real
sounds.
Natural sounds of the same type show rich variability in their acoustic structure. For example, different impacts on the same rod can generate very different acoustic waveforms. In natural environments, reverberation and background noise add variability, but even sounds recorded in anechoic conditions vary due to factors such as slight differences in impact force and location. (For instance, the figure on the left shows that, even though different impacts on the same rod have very similar spectra, the relative power and duration of the partials vary from one instance to another. These differences cannot be explained by a simple variation in amplitude.) In spite of these variations, listeners often perceive the sounds as almost identical, which suggests that they share common intrinsic structures.

We are developing data-driven methods for learning the intrinsic features that govern the acoustic structure of impact sounds. These methods require no a priori knowledge of the physics, dynamics, or acoustics involved, and are used to create models of impact sounds that represent a rich variety of structure and variability. For more details see Cavaco and Lewicki (JASA 2007).
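A toy illustration of the data-driven idea (with synthetic strikes rather than recorded rod sounds, and a plain SVD standing in for the methods in the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
sr, n = 8000, 2000
t = np.arange(n) / sr

# Toy "impacts on the same rod": fixed partial frequencies, but the power
# and decay of each partial vary slightly from strike to strike -- the kind
# of variability that remains even in anechoic recordings.
partial_freqs = np.array([400.0, 1100.0, 2100.0])

def strike():
    amps = np.array([1.0, 0.5, 0.3]) * rng.uniform(0.8, 1.2, 3)
    decays = np.array([6.0, 9.0, 14.0]) * rng.uniform(0.9, 1.1, 3)
    return sum(a * np.exp(-d * t) * np.sin(2 * np.pi * f * t)
               for a, d, f in zip(amps, decays, partial_freqs))

# Data matrix: one rectified waveform (crude amplitude envelope) per strike.
X = np.array([np.abs(strike()) for _ in range(30)])

# Data-driven decomposition: the singular values show that a handful of
# components capture most of the strike-to-strike variability.
s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
var = s ** 2 / (s ** 2).sum()
print(round(float(var[:6].sum()), 2))
```

The point of the sketch: although every strike has a different waveform, a few learned components explain almost all of the variation, which is exactly the "common intrinsic structure" described above.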
Sound recognition
Environmental sound recognition systems are intended to distinguish different categories of sounds, where sounds from different categories usually have very different spectral and temporal characteristics. Typical examples of such categories are door bells, waves, dog barking, whistles, footsteps, and keyboard typing. These sounds are produced not only by different types of objects but also by different types of events. We have been investigating the possibility of building sound recognizers (for environmental sounds and percussion instruments) that differ from the recognizers described above, in that they are intended to distinguish sounds produced by very similar objects and by the same type of event, such as impacts on metal rods (the image on the left shows that sounds from metal rods are separable) or sounds from the cymbals of a drum kit. Some of the results appear in Cavaco and Rodeia (ICISP 2010) and Cavaco and Almeida (IWSSIP 2012).
In the past, we have also worked on sound recognition for robots. More specifically, we have worked on the recognition of sounds from toys for Kismet (a robot from the MIT AI Lab).
Unveiling the world of color for the blind
We have developed a tool that converts color information from still images or video frames into sound. The tool maps the hue, saturation, and value parameters to sound parameters that influence the perception of pitch, timbre, and loudness. Our goal is to help visually impaired individuals perceive characteristics of the environment that are usually not easily acquired without vision. The tool has been tested by visually impaired individuals, who confirmed that it can give them information about the range of colors present in the images, the presence or absence of light sources, and the location and shape of objects.
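The mapping below is only a plausible sketch of the idea, not the published mapping; the base frequency, harmonic weights, and parameter ranges are made up for illustration:

```python
import numpy as np

def hsv_to_tone(h, s, v, sr=8000, dur=0.25):
    """Map one HSV color to a short tone (hedged illustration, not the
    published mapping): hue sets pitch, saturation sets timbre (energy in
    higher harmonics), and value sets loudness."""
    t = np.arange(int(sr * dur)) / sr
    freq = 220.0 * 2.0 ** (2.0 * h)       # hue in [0, 1) spans two octaves
    tone = (np.sin(2 * np.pi * freq * t)
            + 0.5 * s * np.sin(2 * np.pi * 2 * freq * t)
            + 0.25 * s * np.sin(2 * np.pi * 3 * freq * t))
    return v * tone / np.abs(tone).max()  # value scales the peak loudness

red_tone = hsv_to_tone(0.0, 1.0, 1.0)        # saturated bright red
dim_blue_tone = hsv_to_tone(0.66, 0.2, 0.3)  # desaturated dark blue
print(len(red_tone))  # 2000 samples (0.25 s at 8 kHz)
```

A saturated bright color thus maps to a loud, harmonically rich tone, while a dark desaturated one maps to a quiet, nearly pure tone.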
Some of the results appear in
Cavaco et al. (SeGAH 2013) (which won the SeGAH 2013 best paper award) and Cavaco et al. (HCist 2013).
This project is described in more detail here.
Video annotation with audiovisual information
Due to the lack of annotation of their large video
archives, multimedia content provider companies and
television channels do not use the data in their archives
to their full extent. To help address this problem, we have developed a tool that combines audio and visual information to annotate video.
In particular, this tool has been used by a video production
company that has given us positive feedback. The
main innovation of this tool is the use of environmental
sound recognition to annotate video.
Some of the results appear in
Cavaco et al. (ICALIP 2012) and Mateus et al. (ICMCS 2012).
Music genre classification
The explosion of digital music content makes indexing essential for the correct organization of huge music databases. While many supervised automatic music genre classifiers have been proposed, these always depend on a prior manual labeling of the data. An unsupervised approach, in contrast, has no such dependency and can determine the genre of music samples based only on their audio features. We have been developing unsupervised techniques for music genre classification. Some of the results appear in
Barreira,
Cavaco and Ferreira da Silva (EPIA 2011). (The figure on the left
shows a similarity matrix from 165 music titles and 11 different
genres.)
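A small sketch of this unsupervised setting, with random stand-in feature vectors and plain k-means rather than our actual features or algorithm (the published matrix covers 165 titles and 11 genres; the toy version below uses 30 and 3):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical feature vectors for 30 tracks from 3 genres: tracks of the
# same genre scatter around a genre centroid in a 12-dimensional space.
genre_centroids = rng.normal(size=(3, 12))
genres = np.repeat(np.arange(3), 10)
X = genre_centroids[genres] + 0.2 * rng.normal(size=(30, 12))

# Cosine-similarity matrix between all pairs of tracks (the kind of
# matrix shown in the figure).
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
S = Xn @ Xn.T

# Unsupervised grouping: plain k-means; no genre labels are used.
centers = X[rng.choice(30, 3, replace=False)]
for _ in range(10):
    assign = ((X[:, None] - centers) ** 2).sum(-1).argmin(axis=1)
    centers = np.array([X[assign == k].mean(axis=0) if (assign == k).any()
                        else centers[k] for k in range(3)])

# Same-genre pairs should look more similar than cross-genre pairs,
# which is what produces the block structure in the similarity matrix.
same = S[genres[:, None] == genres[None, :]].mean()
cross = S[genres[:, None] != genres[None, :]].mean()
print(same > cross)  # True
```

The block-diagonal structure of such a similarity matrix is what an unsupervised method exploits in place of manual labels.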
Sound localization
Past projects include:
• the localization of sound sources using auditory cues, such as interaural time differences, and
• the acoustic detection of the direction of motion, that is, detecting moving sound sources and their direction of motion using only information from the sounds they produce.
Some results appear in Cavaco and Hallam (CASA 99) and Cavaco and Hallam (IJNS 99).
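The interaural-time-difference cue can be sketched with a cross-correlation between two "ear" signals (synthetic white noise and a fixed integer delay here, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
sr = 44100

# A broadband source and its two-"ear" versions: the right ear receives
# the sound a few samples later, so the source is on the left side.
src = rng.normal(size=2048)
delay = 12  # true interaural delay in samples
left = src
right = np.concatenate([np.zeros(delay), src[:-delay]])

# Estimate the interaural time difference by cross-correlation:
# the peak lag tells which ear leads and by how much.
corr = np.correlate(left, right, mode="full")
lag = np.argmax(corr) - (len(right) - 1)  # lag of left relative to right
itd_seconds = lag / sr
side = "left" if lag < 0 else "right"
print(lag, side)  # -12 left
```

A negative lag means the left signal leads the right one, placing the source on the left; the lag in samples divided by the sample rate gives the ITD in seconds.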
Echolocation
Object detection with ultrasonic sensors:
we developed a navigation controller for a mobile robot using an adaptive neural network and information from ultrasonic sensors.
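A hedged sketch of such a controller, using a single-layer network and a Braitenberg-style teaching signal chosen purely for illustration (the original controller's architecture and learning rule are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(5)

# A single-layer network maps three ultrasonic range readings
# (left, front, right clearances in meters) to two wheel speeds.
W = rng.normal(scale=0.1, size=(2, 3))
b = np.zeros(2)

def control(ranges):
    return np.tanh(W @ np.asarray(ranges) + b)

def adapt(ranges, lr=0.05):
    """One delta-rule step toward a Braitenberg-style teaching signal:
    each wheel speeds up with front clearance and with proximity on its
    own side, so the trained robot steers away from near obstacles."""
    global W, b
    x = np.asarray(ranges)
    target = np.tanh([x[1] + (2.0 - x[0]), x[1] + (2.0 - x[2])])
    err = target - control(x)
    W += lr * np.outer(err, x)
    b += lr * err
    return np.abs(err).mean()

# Adapt on random sensor readings; the error should shrink over time.
errs = [adapt(rng.uniform(0.1, 2.0, 3)) for _ in range(2000)]
print(round(float(np.mean(errs[:50])), 3), round(float(np.mean(errs[-50:])), 3))
```

The delta rule here stands in for whatever adaptive scheme the original controller used; the point is only that the sensor-to-motor mapping is learned rather than hand-coded.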
Robotics
Some past projects involved robots, such as the navigation controller for a mobile robot mentioned above, the localization of sound sources for a robotic cat head, and the toys' sound identifier for Kismet.