Pitch analysis

The digital music archive has been enriched with descriptions of the musical parameters tempo and pitch. Music Information Retrieval (MIR) techniques have been applied: computer analysis methods that extract musical information from digital audio. The techniques used have been tested extensively and evaluated positively, but since all the data were created by automated processes and could not be checked individually because of the size of the archive, the MIR results are not infallible.

The pitch analysis has been performed with the Pitchdetection software released by IPEM and ELIS as a result of the MAMI-project. For every 1/100 of a second, the software suggests up to six pitch candidates. Each pitch candidate is assigned an evidence value by the software, expressing the probability that it is correct. Annotations were made in hertz, but for further research and for the graphical representations, cent values (logarithmic) were sometimes preferred over hertz values (linear). It is important to know that the measured values are not quantised towards Western musical concepts: the familiar twelve tones of the Western scale, spaced 100 cents apart, are not used to label the measurements. We have decided to use the exact measurements delivered by the software, respecting the exact interval distances. This approach has the advantage that the African (ethnic) music is annotated very precisely and in accordance with its original scale characteristics, but the downside is that it becomes very hard to 'semantise' the measurements. That demands very systematic and well-founded research, which was not possible within the DEKKMMA-project, but can be carried out with the measurements offered here.
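To illustrate the hertz-to-cent conversion, here is a minimal sketch in Python. The reference frequency of 8.1758 Hz (C-1 in the American system of designation) is an assumption chosen so that 0 cents falls on a C; the reference actually used by the Pitchdetection software is not documented here.

```python
import math

# Assumed reference: 8.1758 Hz (C-1), so that 0 cents falls on a C.
# The reference used by the Pitchdetection software may differ.
REF_HZ = 8.1758

def hz_to_cents(freq_hz: float, ref_hz: float = REF_HZ) -> float:
    """Convert a frequency in hertz (linear) to cents (logarithmic).

    1200 cents span one octave, so equal musical intervals map to equal
    cent distances without quantising to the Western 12-tone scale.
    """
    return 1200.0 * math.log2(freq_hz / ref_hz)

print(round(hz_to_cents(440.0)))  # A4 = 440 Hz lies 6900 cents above C-1
```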
The annotations are visualised in four different graphs:

  1. tessitura, cluster of all measured pitches (tess.jpg)
  2. pitches reduced to one octave (scale.jpg) (0 = C, or Do)
  3. all pitches (above threshold) in time (allpitch.jpg)
  4. melodic fragment (pitch with highest evidence of all 6 candidates)(melo.jpg)

The first graph visualises in a histogram all pitches in their actual position (absolute pitch), while the second graph reduces all pitches to one octave. For both representations the pitch values are recalculated into cents, so that equal musical intervals appear as equal distances, which the linear hertz scale cannot offer. Octaves are named according to the American system of designation. Both graphs show a clustering of all annotated pitches. The scale used in a musical piece is represented most clearly by such a histogram, which shows peaks at the pitches that occur most frequently.
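As a sketch of how the second graph could be derived, the octave reduction folds every cent value into the range 0-1199, after which a histogram reveals the most frequent scale degrees. The bin width of 25 cents is a hypothetical choice, not a documented parameter of the graphs.

```python
from collections import Counter

def reduce_to_octave(cents: float) -> float:
    """Fold an absolute cent value into one octave (0-1199 cents);
    with a C-based reference, 0 corresponds to C (Do), as in scale.jpg."""
    return cents % 1200.0

def scale_histogram(pitches_cents, bin_width=25):
    """Count octave-reduced pitches per bin; peaks indicate the pitches
    that occur most frequently, i.e. the scale used in the piece."""
    counts = Counter(int(reduce_to_octave(c) // bin_width) for c in pitches_cents)
    return {b * bin_width: n for b, n in sorted(counts.items())}
```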
The third graph visualises over time all annotations that pass a certain threshold on the evidence value assigned by the Pitchdetection software, which expresses the certainty that each annotation is correct. A minimal threshold was used, clearing away many uncertain annotations and giving an informative overview of the whole musical piece. Annotations that belong together are represented in the same colour. The blue and red dots are the most important annotations (mostly the voice); the yellow and green dots are generally lower and linked to musical instruments. The cyan and purple dots mostly do not refer to actually sounding pitches, but rather to harmonics and subharmonics. Do not focus too much on these colours: they are automated annotations and are not conclusive! Nevertheless, the graph generates a good image of the whole musical piece. Some examples: the start of a new song, for example halfway through the recording, is revealed by a totally different organisation of the pixels, and the entrance of a new musical instrument is also clearly visible. The graph likewise gives a clear view of the form of the musical piece, on the macro as well as the micro level; in the graph of a responsorial song, for instance, you can see when the soloist sings and when the choir answers. It also shows clearly whether the pitches stay stable throughout the piece, or whether they rise or descend. If an instrument sounds very dominant or loud in the recording, its pitches will be assigned a higher evidence value; this is clearly visible in the graphs and gives visual feedback on the quality of the annotation.

The fourth graph zooms in on a fragment, always the 30th to the 50th second, in which only the candidate with the highest evidence is kept. It often visualises the sung melody, but becomes more diffuse and chaotic when the annotation is less reliable (polyphonic music, no singing present, multiple instruments, loud percussion).
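A minimal sketch of both selections, assuming each 1/100-second frame carries up to six (pitch, evidence) pairs; the data layout, the field order and the threshold value of 0.1 are assumptions for illustration, not the software's actual format.

```python
FRAME_RATE = 100  # analysis frames per second (one frame per 1/100 s)

def above_threshold(frames, threshold=0.1):
    """Keep (time, pitch, evidence) triples whose evidence passes the
    threshold, as plotted over time in the allpitch.jpg graph."""
    return [
        (i / FRAME_RATE, pitch, evidence)
        for i, candidates in enumerate(frames)
        for pitch, evidence in candidates
        if evidence > threshold
    ]

def melodic_fragment(frames, start_s=30, end_s=50):
    """For each frame between the 30th and 50th second, keep only the
    pitch of the candidate with the highest evidence (melo.jpg)."""
    window = frames[start_s * FRAME_RATE:end_s * FRAME_RATE]
    return [max(cands, key=lambda pe: pe[1])[0] for cands in window if cands]
```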

The four graphs visualise characteristics of the recording that are hard to express in semantic terminology: sometimes Western musical concepts fall short, and sometimes elements that vary over time are easier to show in a graph than in words. The added value is surely obvious!
May it stimulate further research!