The digital music archive has been enriched with descriptions of the musical parameters tempo and pitch. Music Information Retrieval (MIR) techniques have been applied: computer analysis methods that extract musical information from digital audio. The techniques used have been tested extensively and evaluated positively, but since all the data have been created by automated processes and, given the size of the archive, could not be checked individually, the MIR results are not infallible.
The pitch analysis has been performed with the Pitchdetection software, released by IPEM and ELIS as a result of the MAMI-project. For every hundredth of a second, the software suggests a number of (at most six) pitch candidates. Each pitch candidate is assigned an evidence value by the software, expressing the probability that it is correct. The annotations were made in Hertz, but for further research and for the graphical representations, cent values (logarithmic) were sometimes preferred over Hertz values (linear). It is important to know that the measured values are not quantised towards Western musical concepts: the familiar twelve tones of our 100-cent-based musical scale are not used to label the measurements. We decided to use the exact measurements the software delivered, out of respect for the exact interval distances. This approach has the advantage that the African (ethnic) music is annotated very precisely and with regard to its original scale characteristics, but the downside is that it becomes very hard to 'semantise' the measurements. This demands very systematic and well-founded research, which was not possible within the dekkmma-project, but which can build on the measurements offered.
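To illustrate the cent representation, here is a minimal sketch of the Hertz-to-cent conversion. The reference frequency (8.1758 Hz, MIDI note 0, C-1 in the American designation) is an assumption chosen for illustration; the reference actually used in the project is not specified here.

```python
import math

# Reference frequency for 0 cents. 8.1758 Hz corresponds to MIDI note 0
# (C-1 in the American designation); this choice is an assumption, not
# the project's documented reference.
F_REF = 8.1758

def hz_to_cents(freq_hz: float) -> float:
    """Convert a frequency in Hertz to cents above F_REF.

    Cents are logarithmic: 1200 cents span one octave (a doubling of
    frequency), so equal melodic intervals map to equal distances.
    """
    return 1200.0 * math.log2(freq_hz / F_REF)

print(hz_to_cents(440.0))  # A4 -> 6900.0 cents above MIDI note 0
```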
The annotations are visualised in four different graphs.
The first graph visualises in a histogram all pitches at their actual pitch position (absolute pitch), while the second graph reduces all pitches to one octave. For both representations the pitch values are recalculated into cents, avoiding the linear Hertz scale. The octaves are named according to the American system of designation. Both graphs show a clustering of all annotated pitches. The scale used in a musical piece is represented most clearly by a histogram, showing peaks at the pitches that occur most frequently.
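A minimal sketch, assuming the annotations have already been converted to cents, of how such a histogram and the octave reduction could be computed; the bin width and the example values are hypothetical, not the project's actual parameters:

```python
from collections import Counter

def octave_fold(cents: float) -> float:
    """Reduce a pitch in cents to a single octave (0-1200 cents)."""
    return cents % 1200.0

def pitch_histogram(cent_values, bin_width=10.0):
    """Count pitches in bins of `bin_width` cents.

    Peaks in the resulting histogram indicate the most frequent
    pitches, i.e. the scale used in the piece.
    """
    bins = Counter()
    for c in cent_values:
        bins[int(c // bin_width) * bin_width] += 1
    return bins

# Hypothetical cent values: absolute histogram (first graph)
# versus octave-reduced histogram (second graph).
pitches = [6900.0, 7095.0, 6905.0, 5702.0]
absolute = pitch_histogram(pitches)
reduced = pitch_histogram([octave_fold(c) for c in pitches])
```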
The third graph visualises over time all annotations that meet a certain threshold. This threshold is a numeric value assigned by the Pitchdetection software that expresses the certainty of correctness of each annotation. A minimal threshold was used, clearing away many uncertain annotations and delivering an interesting view of the musical piece as a whole. Annotations that belong together are represented in the same colour. The blue and red dots are the most important annotations (mostly voice); yellow and green are generally lower and linked to musical instruments. The cyan and purple dots mostly do not refer to actually sounding pitches, but rather to harmonics and subharmonics. Do not focus too much on these colours: they are automated annotations and are not conclusive! Nevertheless, the graph generates a good image of the whole musical piece. Some examples: the start of a new song, say halfway through the recording, is revealed by a totally different organisation of the pixels, and the entrance of a new musical instrument is also clearly visible. The graph likewise gives a clear view of the form of the musical piece, on the macro as well as the micro level; in the graph of a responsorial song, for instance, you can see when the soloist sings and when the choir answers. The graph also shows well whether the pitches stay stable throughout the piece, or whether they rise or descend. If an instrument sounds very dominant or loud in the recording, its pitches will be assigned a higher evidence value. This can be seen clearly in the graphs and gives visual feedback on the quality of the annotation.
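As an illustration of this filtering, the sketch below keeps only the candidates whose evidence meets a minimal threshold. The frame layout (a list of (frequency, evidence) pairs per hundredth of a second) and the threshold value are assumptions, not the project's actual format or parameter:

```python
# Each analysis frame (one per 1/100 s) holds up to six candidates,
# modelled here as (frequency_hz, evidence) pairs. The threshold value
# is hypothetical; the project's actual minimum is not given.
EVIDENCE_THRESHOLD = 0.1

def filter_frames(frames):
    """Keep only candidates whose evidence meets the threshold.

    frames: list of lists of (freq_hz, evidence) tuples.
    Returns (time_s, freq_hz, evidence) points suitable for
    plotting the annotations over time (the third graph).
    """
    points = []
    for i, candidates in enumerate(frames):
        t = i / 100.0  # one frame per hundredth of a second
        for freq, evidence in candidates:
            if evidence >= EVIDENCE_THRESHOLD:
                points.append((t, freq, evidence))
    return points
```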
The fourth graph zooms in on a fragment, always from the 30th to the 50th second, in which only the candidate with the highest evidence is kept. It often visualises the lyrics, but becomes more diffuse and chaotic when the annotation is less clear (polyphonic music, no singing present, multiple instruments, loud percussion).
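The selection behind the fourth graph can be sketched in the same assumed frame format: within the 30-50 second window, only the candidate with the highest evidence survives per frame.

```python
def best_candidates(frames, start_s=30.0, end_s=50.0):
    """For each frame in the 30th-50th second window, keep only the
    candidate with the highest evidence (the fourth graph).

    frames: list of lists of (freq_hz, evidence) tuples,
    one list per hundredth of a second.
    """
    points = []
    for i, candidates in enumerate(frames):
        t = i / 100.0
        if start_s <= t < end_s and candidates:
            freq, evidence = max(candidates, key=lambda c: c[1])
            points.append((t, freq))
    return points
```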
The four graphs visualise characteristics of the recording that are hard to express in semantic terminology. Sometimes Western musical concepts fall short; sometimes elements that vary over time are easier to visualise in a graph than to describe in words. The added value is evident. May it stimulate further research!