Generating New Music with Deep Learning: An Introduction to Music Generation with RNNs in Python + Keras

Music generation is a fascinating application of deep learning: a model learns the patterns and structures of existing music and uses them to produce new pieces. Recurrent neural networks (RNNs) and generative adversarial networks (GANs) have both been used for this task.

In this tutorial, we will use Python and the Keras library to train an LSTM-based RNN on a MIDI file and then use it to generate new music.

Music Generation with RNNs in Python and Keras

Import Libraries

We will start by importing the necessary libraries, including Keras for building the model and music21 for working with music data.

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.utils import to_categorical  # used later to one-hot encode the targets
from music21 import converter, instrument, note, chord, stream

Load and Prepare Data

Next, we will load the music data and prepare it for use in the model.

# Load music data
midi = converter.parse('path/to/midi/file.mid')

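# Extract notes and chords as string tokens
notes = []
for element in midi.flat:  # in music21 v7+, midi.flatten() replaces the deprecated .flat
    if isinstance(element, note.Note):
        # Single notes are stored by pitch name, e.g. 'C4'
        notes.append(str(element.pitch))
    elif isinstance(element, chord.Chord):
        # Chords are stored as dot-joined pitch classes, e.g. '4.7.11'
        notes.append('.'.join(str(n) for n in element.normalOrder))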

# Define vocabulary
pitchnames = sorted(set(notes))
note_to_int = {name: number for number, name in enumerate(pitchnames)}
n_vocab = len(pitchnames)

# Build fixed-length input windows and the note that follows each one
sequence_length = 100
network_input = []
network_output = []
for i in range(len(notes) - sequence_length):
    sequence_in = notes[i:i + sequence_length]
    sequence_out = notes[i + sequence_length]
    network_input.append([note_to_int[name] for name in sequence_in])
    network_output.append(note_to_int[sequence_out])
n_patterns = len(network_input)

# Reshape input to (samples, time steps, features) and scale to [0, 1)
X = np.reshape(network_input, (n_patterns, sequence_length, 1))
X = X / float(n_vocab)

# One-hot encode output data; num_classes keeps y as wide as the softmax layer
y = to_categorical(network_output, num_classes=n_vocab)

In this example, we load the music data from a MIDI file and extract its notes and chords as string tokens. We then build a vocabulary of the unique tokens and map each one to an integer. Each training input is a window of 100 consecutive tokens, scaled by the vocabulary size, and the corresponding output is the token that follows the window, one-hot encoded.
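
To make the windowing step concrete, here is a minimal sketch in plain Python that applies the same idea to a short toy list of tokens. The function name make_training_pairs and the toy data are assumptions made for illustration; the scaling and one-hot steps mirror what the NumPy and to_categorical calls do, without requiring either library.

```python
def make_training_pairs(notes, sequence_length):
    """Slide a fixed-length window over `notes` and pair it with the next token."""
    pitchnames = sorted(set(notes))
    note_to_int = {name: number for number, name in enumerate(pitchnames)}
    n_vocab = len(pitchnames)

    inputs, targets = [], []
    for i in range(len(notes) - sequence_length):
        window = notes[i:i + sequence_length]
        next_note = notes[i + sequence_length]
        # Scale the integer codes by the vocabulary size, as done for X.
        inputs.append([note_to_int[name] / n_vocab for name in window])
        # Build a one-hot vector for the target token.
        one_hot = [0] * n_vocab
        one_hot[note_to_int[next_note]] = 1
        targets.append(one_hot)
    return inputs, targets, note_to_int


# Hypothetical toy data: six notes and one chord token.
toy_notes = ["C4", "E4", "G4", "C4", "E4", "G4", "4.7"]
inputs, targets, note_to_int = make_training_pairs(toy_notes, sequence_length=3)

print(note_to_int)   # vocabulary mapping
print(inputs[0])     # first scaled input window
print(targets[0])    # one-hot target for the token after that window
```

With seven tokens and a window of three, the sketch yields four training pairs. The real data set works the same way, only with windows of 100 tokens.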

Build Model

Next, we will build the RNN model for music generation.

# Define model
model = Sequential()
model.add(LSTM(512, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(512))
model.add(Dense(256))
model.add(Dropout(0.3))
model.add(Dense(n_vocab, activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam')

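In this example, we define the RNN model with two stacked LSTM layers, a 256-unit dense layer, and a softmax output layer with one unit per vocabulary entry. Two dropout layers with a rate of 0.3 provide regularization. The first LSTM sets return_sequences=True so that the second LSTM receives the output at every time step rather than only the last one.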
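
To get a feel for the size of this network, the sketch below counts its trainable parameters by hand. It assumes the standard LSTM layout with bias terms (four gates, each with input weights, recurrent weights, and a bias) and a single input feature per time step, as in the model above. The helper names and the vocabulary size of 300 are hypothetical and used only for illustration.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    # One weight per input-output pair plus one bias per output unit.
    return input_dim * units + units


def model_params(n_vocab):
    # Dropout layers have no trainable parameters, so they are not counted.
    return (
        lstm_params(1, 512)            # first LSTM, one feature per time step
        + lstm_params(512, 512)        # second LSTM, fed by the first
        + dense_params(512, 256)       # hidden dense layer
        + dense_params(256, n_vocab)   # softmax output layer
    )


example_vocab = 300  # hypothetical vocabulary size
print(lstm_params(1, 512))
print(lstm_params(512, 512))
print(model_params(example_vocab))
```

Most of the parameters sit in the second LSTM layer, which is one reason training a model of this size on a long MIDI file can be slow without a GPU.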

Train Model

Next, we will train the model on the prepared music data.

# Train model
model.fit(X, y, epochs=100, batch_size=64)

In this example, we train the model for 100 epochs with a batch size of 64 on the input and output sequences of the prepared music data.
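
The loss value printed during training is the categorical cross-entropy between the one-hot targets and the softmax predictions. As a rough sketch of what that number means for a single example, the plain-Python function below computes it directly. The function and the probability values are made up for illustration, and the small epsilon guard is an assumption added here to avoid taking the log of zero.

```python
import math


def categorical_crossentropy(one_hot_target, predicted_probs, eps=1e-7):
    # Only the term for the true class contributes, since the other targets are 0.
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_target, predicted_probs)
    )


target = [0, 1, 0, 0]                  # the true next token is class 1
confident = [0.1, 0.7, 0.1, 0.1]       # hypothetical prediction favoring class 1
uniform = [0.25, 0.25, 0.25, 0.25]     # hypothetical prediction with no preference

print(categorical_crossentropy(target, confident))
print(categorical_crossentropy(target, uniform))
```

The confident prediction gives a lower loss than the uniform one. A loss near the log of the vocabulary size suggests the model is doing little better than guessing.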

Generate New Music

Finally, we can use the trained model to generate new music.

# Generate new music
start = np.random.randint(0, len(network_input))
int_to_note = {number: name for number, name in enumerate(pitchnames)}
pattern = list(network_input[start])  # copy so the training data is not modified
prediction_output = []

# Generate notes
for note_index in range(500):
    prediction_input = np.reshape(pattern, (1, len(pattern), 1))
    prediction_input = prediction_input / float(n_vocab)
    prediction = model.predict(prediction_input, verbose=0)
    index = int(np.argmax(prediction))  # most probable next token
    result = int_to_note[index]
    prediction_output.append(result)
    # Slide the window: append the new token and drop the oldest one
    pattern.append(index)
    pattern = pattern[1:]

# Create MIDI file
offset = 0
output_notes = []
for token in prediction_output:
    if ('.' in token) or token.isdigit():
        # Chord token: dot-joined integers taken from normalOrder
        chord_notes = []
        for current_note in token.split('.'):
            new_note = note.Note(int(current_note))
            new_note.storedInstrument = instrument.Piano()
            chord_notes.append(new_note)
        new_chord = chord.Chord(chord_notes)
        new_chord.offset = offset
        output_notes.append(new_chord)
    else:
        # Note token: a pitch name such as 'C4', passed to Note as a string
        new_note = note.Note(token)
        new_note.offset = offset
        new_note.storedInstrument = instrument.Piano()
        output_notes.append(new_note)
    # Advance by 0.5 quarter lengths so the notes do not stack on one another
    offset += 0.5

midi_stream = stream.Stream(output_notes)
midi_stream.write('midi', fp='output.mid')

In this example, we generate new music by randomly selecting a starting sequence from the prepared music data, then repeatedly predicting the most probable next note with the trained RNN model and feeding that prediction back in as input. We then create a MIDI file from the 500 generated notes, placing them at a fixed interval.
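
The feedback loop is easier to follow in isolation. The sketch below reproduces it in plain Python with a stand-in predictor in place of the trained network. Both generate and fake_predict are hypothetical helpers written for this illustration; fake_predict simply puts all probability on the token after the last one in the window, so the result is predictable.

```python
def generate(seed, predict_fn, n_steps):
    """Greedy generation: pick the most probable token, then slide the window."""
    pattern = list(seed)  # copy so the seed sequence is left untouched
    generated = []
    for _ in range(n_steps):
        probs = predict_fn(pattern)
        # Index of the largest probability (the same choice np.argmax makes).
        index = max(range(len(probs)), key=probs.__getitem__)
        generated.append(index)
        # Drop the oldest token and append the new one.
        pattern = pattern[1:] + [index]
    return generated, pattern


def fake_predict(pattern, n_vocab=4):
    # Stand-in for model.predict: all probability on (last token + 1) mod n_vocab.
    probs = [0.0] * n_vocab
    probs[(pattern[-1] + 1) % n_vocab] = 1.0
    return probs


seed = [0, 1, 2]
generated, final_window = generate(seed, fake_predict, n_steps=5)
print(generated)
print(final_window)
```

Because the choice is always the single most probable token, the output is fully determined by the seed. With a real model this can lead to repeated phrases, which is worth keeping in mind when listening to the results.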

With the help of deep learning, we can now create new music based on patterns and structures in existing music.

LyronFoster

Lyron Foster is a Hawai’i-based African American Author, Musician, Actor, Blogger, Philanthropist and Multinational Serial Tech Entrepreneur.