Building a Medical Image Classifier with Deep Learning and Python

Medical image classification is a vital task in healthcare, enabling clinicians to diagnose, monitor, and treat patients with various medical conditions. Deep learning, with its ability to learn complex features from large datasets, has revolutionized the field of medical image analysis, making it possible to perform automated classification of medical images. In this tutorial, we will explore how to build a deep learning model for medical image classification using Python and the Keras library.

Dataset

We will use the Chest X-Ray Images (Pneumonia) dataset from Kaggle, which contains 5,856 chest X-ray images labeled as either Normal or Pneumonia. The dataset can be downloaded from its Kaggle page.

Environment Setup

Before we begin, we need to set up our environment. We will be using Python 3.7 and the following libraries:

Keras
TensorFlow
NumPy
Matplotlib
Pandas

You can install these libraries using the following command in your command prompt or terminal:

pip install keras tensorflow numpy matplotlib pandas
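
After installing, it can help to confirm that each package is actually visible to your Python interpreter. The following is a minimal sketch that uses only the standard library; the helper name check_installed is hypothetical and not part of any of the libraries listed above.

```python
import importlib.util

# Import names of the libraries used in this tutorial.
REQUIRED = ["keras", "tensorflow", "numpy", "matplotlib", "pandas"]


def check_installed(packages):
    """Return a dict mapping each package name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in check_installed(REQUIRED).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed again with pip before continuing.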

Loading the Dataset

We will start by loading the training labels with the Pandas library. The code assumes a file chest_xray/train.csv that lists each image's filename and label:

import pandas as pd

df = pd.read_csv('chest_xray/train.csv')

Next, we will create two NumPy arrays, one holding the image filenames and the other holding the corresponding labels:

filenames = df['Filename'].values
labels = df['Label'].values
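
The loading code above relies on a CSV file with Filename and Label columns. If your copy of the dataset is instead organised as one subfolder per class (the Kaggle download appears to use NORMAL and PNEUMONIA folders), a CSV of that shape can be generated first. The sketch below is one possible way to do it; the function names build_index and write_index, and the choice to use the folder name as the label, are assumptions made for illustration.

```python
import csv
from pathlib import Path

# File extensions treated as images (an assumption; adjust for your copy).
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def build_index(image_dir):
    """Return sorted (filename, label) rows, one per image under image_dir.

    Assumes one subfolder per class. The subfolder name is used as the label
    and filenames are stored relative to image_dir.
    """
    root = Path(image_dir)
    rows = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                relative = image_path.relative_to(root).as_posix()
                rows.append((relative, class_dir.name))
    return rows


def write_index(rows, csv_path):
    """Write the rows to a CSV with Filename and Label columns."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Filename", "Label"])
        writer.writerows(rows)
```

Calling write_index(build_index('chest_xray/train'), 'chest_xray/train.csv') would then produce a file that pd.read_csv can load as shown above.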

Preprocessing the Data

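We need to preprocess the data before feeding it to the deep learning model. We will use the Keras ImageDataGenerator to rescale pixel values to the range [0, 1] and to perform data augmentation (random shear, zoom, and horizontal flips). Augmentation produces varied versions of the existing training images on the fly, which can help the model generalize. We also hold out 20% of the training data for validation.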

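from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and apply random augmentation to training images.
datagen = ImageDataGenerator(rescale=1./255,
                             shear_range=0.2,
                             zoom_range=0.2,
                             horizontal_flip=True,
                             validation_split=0.2)
# Validation images are only rescaled, so validation metrics reflect unaugmented data.
# Using the same validation_split keeps the two subsets disjoint.
valid_datagen = ImageDataGenerator(rescale=1./255,
                                   validation_split=0.2)

# class_mode='binary' expects the Label column to contain strings.
train_generator = datagen.flow_from_dataframe(
    dataframe=df,
    directory='chest_xray/train/',
    x_col='Filename',
    y_col='Label',
    subset='training',
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode='binary',
    target_size=(150,150)
)
valid_generator = valid_datagen.flow_from_dataframe(
    dataframe=df,
    directory='chest_xray/train/',
    x_col='Filename',
    y_col='Label',
    subset='validation',
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode='binary',
    target_size=(150,150)
)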
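
To make two of the generator settings more concrete, here is a small NumPy sketch of what rescale=1./255 and horizontal_flip do to an image array. It imitates their effect on a tiny made-up array and is not the actual Keras implementation; the function names are hypothetical.

```python
import numpy as np


def rescale(image, factor=1.0 / 255):
    """Scale raw 8-bit pixel values into [0, 1], like rescale=1./255."""
    return image.astype("float32") * factor


def flip_horizontal(image):
    """Mirror an image left to right; axis 1 is width in (height, width, channels)."""
    return image[:, ::-1, :]


# A made-up 1x3 image with one channel: black, mid-grey, and white pixels.
image = np.array([[[0], [128], [255]]], dtype=np.uint8)

scaled = rescale(image)
flipped = flip_horizontal(scaled)

print(scaled[0, :, 0])   # values between 0.0 and 1.0
print(flipped[0, :, 0])  # the same values in reverse order
```

In the real generator the flip is applied at random to each image, so the model sees both orientations over the course of training.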

Building the Model

We will be using a Convolutional Neural Network (CNN) for medical image classification. CNNs are ideal for image classification tasks, as they can learn and extract important features from the input images.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
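
It is worth seeing how the feature maps shrink through this network and where its parameters sit. The following sketch reproduces the arithmetic by hand, assuming the Keras defaults used above (unpadded stride-1 convolutions and 2x2 pooling with a stride of 2); the helper names are hypothetical.

```python
def conv_output(size, kernel=3):
    """Spatial size after an unpadded ('valid'), stride-1 convolution."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


size = 150
size = conv_output(size)   # 148 after the first Conv2D
size = pool_output(size)   # 74 after the first MaxPooling2D
size = conv_output(size)   # 72 after the second Conv2D
size = pool_output(size)   # 36 after the second MaxPooling2D
flat = size * size * 64    # 82944 values after Flatten

total = (conv_params(3, 32) + conv_params(32, 64)
         + dense_params(flat, 512) + dense_params(512, 1))
print(flat, total)  # 82944 42487745
```

Almost all of the parameters belong to the Dense(512) layer that follows Flatten. Calling model.summary() should report the same shapes and counts.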

Training the Model

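We can now train the model. In TensorFlow 2.1 and later, the fit method accepts generators directly and the older fit_generator method is deprecated. The step counts must be integers, so we use the length of each generator, which is its number of batches per epoch:

history = model.fit(
    train_generator,
    steps_per_epoch=len(train_generator),    # batches per training epoch
    epochs=10,
    validation_data=valid_generator,
    validation_steps=len(valid_generator))   # batches per validation pass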
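
The number of steps in an epoch is the number of batches needed to see every sample once, which has to be a whole number. A minimal sketch of that arithmetic follows; the function name steps_for and the sample counts are hypothetical and chosen only to illustrate the rounding.

```python
import math


def steps_for(samples, batch_size):
    """Number of batches needed to cover all samples (the last may be smaller)."""
    return math.ceil(samples / batch_size)


print(steps_for(1000, 32))  # 32: 31 full batches plus one batch of 8
print(steps_for(960, 32))   # 30: divides evenly
```

Plain division would give 31.25 in the first case, which is not a valid step count.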

Evaluating the Model

Finally, we will evaluate the model on the test set and print the accuracy:

test_df = pd.read_csv('chest_xray/test.csv')
test_filenames = test_df['Filename'].values
test_labels = test_df['Label'].values

test_datagen = ImageDataGenerator(rescale=1./255)

test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory='chest_xray/test/',
    x_col='Filename',
    y_col='Label',
    batch_size=32,
    seed=42,
    shuffle=False,
    class_mode='binary',
    target_size=(150,150)
)

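# evaluate accepts generators directly; evaluate_generator is deprecated in TensorFlow 2.1+.
# steps must be an integer, so use the number of batches in the generator.
test_loss, test_acc = model.evaluate(test_generator, steps=len(test_generator))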
print('Test accuracy:', test_acc)
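
The accuracy figure comes from thresholding the sigmoid output at 0.5 and comparing the result with the true labels. The sketch below shows that calculation on made-up numbers, along with the four confusion counts that accuracy summarizes. The function names are hypothetical, and the sketch assumes class 1 means Pneumonia, which is worth confirming with test_generator.class_indices.

```python
def predicted_labels(probabilities, threshold=0.5):
    """Turn sigmoid outputs into 0/1 labels."""
    return [1 if p > threshold else 0 for p in probabilities]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating 1 as the positive class."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 0:
            counts["tn"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


# Made-up probabilities and labels, for illustration only.
probs = [0.1, 0.8, 0.6, 0.3]
truth = [0, 1, 0, 1]

preds = predicted_labels(probs)
print(accuracy(truth, preds))          # 0.5
print(confusion_counts(truth, preds))  # one of each: tp, tn, fp, fn
```

With the trained model, the probabilities would come from model.predict(test_generator) and the true labels from test_generator.classes; because the test generator uses shuffle=False, the two are in the same order.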

In this tutorial, we explored how to build a deep learning model for medical image classification using Python and the Keras library. We used a CNN to classify chest X-ray images as Normal or Pneumonia, and achieved an accuracy of over 90%. This demonstrates the power of deep learning in medical image analysis and its potential to improve healthcare outcomes.