The digital music archive has been enriched with descriptions of the musical parameters tempo and pitch, using Music Information Retrieval (MIR) techniques: computer analysis methods that extract musical information from digital audio. The techniques used have been tested extensively and evaluated positively, but since all data were created by automated processes and, given the size of the archive, could not be checked individually, the MIR results are not infallible.
The pitch analysis was performed with the software Pitchdetection, released by IPEM and ELIS as a result of the MAMI project. For every 1/100 of a second the software suggests a number of pitch candidates (at most six). Each pitch candidate is assigned an evidence value by the software, expressing the probability that it is correct. Annotations
were made in Hertz, but for further research and graphical representation, Cent values (logarithmic) were sometimes preferred over Hertz values (linear). It is important to know that the measured values are not quantised to Western musical concepts: the familiar twelve tones of the 100-cent-based equal-tempered scale are not used to label the measurements. We decided to use the exact measurements the software delivered, respecting the exact interval distances. The advantage of this approach is that the African (ethnic) music is annotated very precisely and in accordance with its original scale characteristics; the downside is that it becomes very hard to 'semantise' the measurements. That demands systematic and well-founded research, which was not possible within the DEKKMMA project, but which can be carried out on the measurements offered here.
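The per-frame candidate output and the Hertz-to-cent conversion described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class and function names, and the reference frequency of 8.1758 Hz, are hypothetical choices for the example, not the actual data format of the Pitchdetection software.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class PitchCandidate:
    frequency_hz: float  # candidate pitch in Hertz (linear scale)
    evidence: float      # probability that the candidate is correct


@dataclass
class Frame:
    time_s: float                     # frame position, in steps of 0.01 s
    candidates: List[PitchCandidate]  # at most six candidates per frame


def hz_to_cents(freq_hz: float, ref_hz: float = 8.1758) -> float:
    """Convert Hertz (linear) to cents (logarithmic) above a reference
    frequency; 8.1758 Hz (C-1) is an arbitrary but common choice.
    1200 cents span one octave, 100 cents one equal-tempered semitone."""
    return 1200.0 * math.log2(freq_hz / ref_hz)
```

For example, `hz_to_cents(440.0, 220.0)` yields 1200.0, exactly one octave. Note that the exact cent values are kept as delivered; nothing is rounded to the Western 100-cent grid.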
The annotations are visualised in four different graphs:
The first graph shows a histogram of all pitches in their actual pitch position (absolute pitch), while the second graph reduces all pitches to a single octave. For both representations, pitch values are recalculated into Cents instead of the linear Hertz scale. Octaves are named according to the American system of designation. Both graphs show the clustering of all annotated pitches. The scale used in a musical piece is represented most clearly by such a histogram, which shows peaks at the pitches that occur most frequently.
The third graph visualises over time all annotations that meet a certain threshold. The threshold applies to the numeric evidence value assigned by the Pitchdetection software, which expresses the certainty that each annotation is correct. A minimal threshold was used, removing many uncertain annotations and offering an interesting view of the whole musical piece. Annotations that belong together are represented in the same colour: the blue and red dots are the most important annotations (mostly voice); yellow and green are generally lower and linked to musical instruments; the cyan and purple dots mostly do not refer to actually sounding pitches, but rather to harmonics and subharmonics. Do not focus too much on these colours: they are automated annotations and are not conclusive. Nevertheless, the graph gives a good image of the whole musical piece. For example, the start of a new song, say halfway through the recording, is revealed by a totally different organisation of the pixels, and the entrance of a new musical instrument is also clearly visible. The graph likewise gives a clear view of the form of the piece, on the macro as well as the micro level: in the graph of a responsorial song, for instance, one can see when the soloist sings and when the choir answers. It also shows clearly whether the pitches stay stable throughout the piece, or whether they rise or descend. If an instrument sounds very dominant or loud in the recording, its pitches will be assigned a higher evidence; this is clearly visible in the graphs and gives visual feedback on the quality of the annotation.

The fourth graph zooms in on a fragment, always the 30th to the 50th second, in which only the candidate with the highest evidence is kept. It often makes the sung lyrics visible, but becomes more diffuse and chaotic when the annotation is less clear (polyphonic music, no singing present, multiple instruments, loud percussion).
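The two selection steps behind the third and fourth graphs, thresholding by evidence and keeping only the best candidate per frame, can be sketched as follows. This is an illustrative Python sketch, assuming a simple in-memory representation of `(time, [(frequency, evidence)])` pairs; it is not the code of the actual visualisation.

```python
def filter_by_evidence(frames, threshold):
    """Third-graph style selection: keep every candidate whose evidence
    reaches the threshold. `frames` is a list of
    (time_s, [(freq_hz, evidence), ...]) pairs."""
    return [(t, f, e)
            for t, cands in frames
            for f, e in cands
            if e >= threshold]


def best_per_frame(frames):
    """Fourth-graph style selection: keep only the single candidate with
    the highest evidence in each (non-empty) frame."""
    return [(t, *max(cands, key=lambda fe: fe[1]))
            for t, cands in frames
            if cands]
```

A loud, dominant instrument yields candidates with higher evidence, so both selections favour it, which matches the visual feedback on annotation quality described above.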
These four graphs visualise characteristics of the recording that are hard to express in semantic terminology: sometimes Western musical concepts fall short, and sometimes elements that vary over time are easier to show in a graph than to describe in words. The added value is obvious. May it stimulate further research!