sparkle tech thoughts: ChordATune and HMM (Interactive AI Systems)

Hidden Markov Model is an AI System and also a statistical probabilistic model in which the state of the process is described by a single discrete random variable. HMMs can model to determine hidden parameters from the observable data. In HMM if the current state only depends on the previous state it is known as a first

order Hidden Markov model .

ChordATune can be modeled as a first order Hidden Markov Model, the observable data can be modeled as melody notes (pitch classes) and hidden parameters which are also known as hidden states can be modeled as chords. HMM determine the hidden state sequence (Chords) when the observable sequence is given. In western music theory we know a chord progression solemnly depends on its previous chord therefore, this system can be modeled as a first order Hidden Markov Model.

Definition: A variant of a finite state machine having a set of states , Q, an observation (output), O, transition probabilities, A, Observation probabilities, B, and initial state probabilities, Π. The current state is not observable. Instead, each state produces an output with a certain probability (B). Usually the states, Q, and outputs, O, are understood, so an HMM is said to be a triple, (A, B, Π)

Elements of HMM

Q – States
O – Observation
A – the state-transition probability matrix

B – Observation probability distribution Matrix:

p - Initial state distribution probability

λ – the entire model

Following elements are further elaborated and mapped according to ChordATune system as described below.

Hidden States –Hidden states are the chords which can be applied for a given melody. ChordATune system is designed to use only for 24 types of chords due to the scalability and the complexity (12 Major chords and 12 Minor chords) therefore, this HMM has 24 possible hidden states.

Transition Probability Matrix (A) - This holds the probabilities of possible transition from one state to another. Therefore, probability of transferring from one chord to another chord is calculated by analyzing the training data.

Observations – Observations are the set of pitch classes which belongs to a given Chord. Since melody notes can have many combinations defining set of static observations can be a complex scenario and it cannot be defined statically. Therefore, a matrix called pitch class chroma vector is used to manipulate the observations. Pitch chroma vector matrix is 12X24 (12 pitch classes and 24 states-because of the 24 chord selection) is a pitch histogram for 24 chords and probability of occurring notes per a given chord. This is calculated by calculating the probabilistic occurrence of pitch notes along with its beat for a static duration. Then mapping it with the related chord so the notes which related to each chord and there beats can be calculated by the chroma vector matrix. The observations and the observation matrix are calculated by calculating the degree of relevance between pitch chroma vector and the input melody so that observations for each hidden states can be defined. The algorithms which are used to calculate this chroma vector and the calculation of observation probability matrix.

Initial state distribution is the initial probability of occurrence for a single state at a time. This can be modeled as the probability of occurring the initial chords (start chord). This can be calculated by analyzing the training data and calculating the initial probability of each chord.

Transition probability matrix holds the probabilities of transferring one chord to another. In order to incorporate the emotional factor to the system two different transition probability matrices are maintained: Major Transition Probability matrix for happy songs and Minor Transition Probability matrix for sad songs. According to the emotional factor of the given input melody, a combination of these transition matrixes is taken as shown in the following calculation.

A_ij = P (Chord_i/Chord_i-1)

A_ij = αAmaj_ij X (1-α)Amin_ij

α – Emotional Factor

A – Transition Probability Matrix

Observation probability Matrix is populated using the input melody and the pitch class probability matrix. The pitch class probability matrix is a 12X24 matrix which has the probabilities of occurrence of each pitch classes for a given chord.

B_ij= P (MelodyBar / Chord) D_ij

(D – Pitch Class Vector)

Initial state distribution probability is the initial probability of occurrence for a single state at a time. Data driven and heuristic driven approaches are followed in order to train the model. Initial state distribution probability matrix and major minor clustering for Transition probabilities are trained using the heuristic driven approach. Transition probability matrix and pitch class probability matrix are trained using the data driven approaches. Around 250 lead sheets are used to train the ChordATune system, each of these lead sheets have monophonic melodies associated with relevant chord progression. Once the melody is processed and HMM properties are initialized according to the given melody and the emotional factor, chord progression is generated using the Viterbi algorithm.

Wednesday, January 26, 2011

ChordATune and HMM (Interactive AI Systems)

No comments:

Post a Comment