Train/fine-tune a classifier

If you are new to computer vision/machine learning, see this tutorial for details on fine-tuning torchvision models. This will help you get to grips with the basic steps needed to train/fine-tune a model.

Load annotations and prepare data

Load and check annotations

First, load in your annotations using:

from mapreader import AnnotationsLoader

annotated_images = AnnotationsLoader()
annotated_images.load(annotations="./path/to/annotations.csv")

For example, if you have set up your directory as recommended in our Input Guidance, and then saved your patches and annotations using the default settings:

#EXAMPLE
annotated_images = AnnotationsLoader()
annotated_images.load("./annotations/railspace_#rosie#.csv")

To view the data loaded in from your annotations file as a dataframe, use:

annotated_images.annotations

You will note a label_index column has been added to your dataframe.

This column contains a numerical reference number for each label. This is needed when training your model so that labels can be treated as numerical values instead of strings.

To see how your labels map to their label indices, call the annotated_images.labels_map attribute:

annotated_images.labels_map

Note

This labels_map will be needed later.

By default, this labels_map is automatically generated when loading your annotations by finding unique labels in your annotations and assigning each a numerical index. The 0 index will be assigned to the label that appears first in the annotations, 1 to the second label and so on.
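The first-appearance rule described above can be illustrated with a minimal sketch in plain Python. The build_labels_map helper and the example labels are hypothetical and only mimic the described behaviour; they are not MapReader's own implementation.

```python
def build_labels_map(labels):
    """Assign 0, 1, 2, ... to labels in order of first appearance."""
    labels_map = {}
    seen = set()
    for label in labels:
        if label not in seen:
            seen.add(label)
            labels_map[len(labels_map)] = label
    return labels_map


# Hypothetical annotation labels, in the order they appear in the file
labels = ["no", "railspace", "no", "building", "railspace"]
labels_map = build_labels_map(labels)

# Invert the map to look up the numerical index for each label
index_of = {label: index for index, label in labels_map.items()}
label_index = [index_of[label] for label in labels]

print(labels_map)   # {0: 'no', 1: 'railspace', 2: 'building'}
print(label_index)  # [0, 1, 0, 2, 1]
```

Because the indices depend on the order of the rows, reordering the annotations changes the resulting map.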

Note

If you use the scramble_frame argument when loading your annotations from a file, the order of your labels will be shuffled and so the indices assigned to each label will be different each time you load your annotations.

If instead, you would like to explicitly define your labels map, you can do so by passing a dictionary to the labels_map argument when loading your annotations.

#EXAMPLE
labels_map = {0: "no", 1: "railspace", 2: "building", 3: "railspace and building"}
annotated_images.load(
    annotations="./path/to/annotations.csv",
    labels_map=labels_map
)

Now, calling the annotated_images.labels_map attribute should return the dictionary you passed in.

Note

Using the labels_map argument is important if you are doing a second round of annotations and want to ensure that the labels are consistent between the two rounds!
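As a rough illustration of why a fixed labels map keeps two annotation rounds consistent, the sketch below converts labels from each round to indices using the same dictionary. The to_label_indices helper and the two lists of labels are hypothetical, not part of MapReader.

```python
labels_map = {0: "no", 1: "railspace", 2: "building", 3: "railspace and building"}


def to_label_indices(labels, labels_map):
    """Convert label strings to indices, failing loudly on unknown labels."""
    index_of = {label: index for index, label in labels_map.items()}
    unknown = sorted(set(labels) - set(index_of))
    if unknown:
        raise ValueError(f"Labels missing from labels_map: {unknown}")
    return [index_of[label] for label in labels]


# Hypothetical labels from two rounds of annotation, in different orders
round_one = ["no", "railspace", "building"]
round_two = ["building", "no", "railspace and building"]

round_one_indices = to_label_indices(round_one, labels_map)
round_two_indices = to_label_indices(round_two, labels_map)

print(round_one_indices)  # [0, 1, 2]
print(round_two_indices)  # [2, 0, 3]
```

With an automatically generated map, "building" would have received index 0 in the second round because it appears first there; the explicit map keeps it at 2 in both rounds.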

To view a sample of your annotated images, use the show_sample method. The label_to_show argument specifies which label you would like to show.

For example, to show your “railspace” label:

#EXAMPLE
annotated_images.show_sample("railspace")

By default, this will show you a sample of 9 images, but this can be changed by specifying num_sample.

When viewing your annotations, you may notice that you have mislabelled one of your images. The review_labels method, which returns an interactive tool for adjusting your annotations, provides an easy way to fix this:

annotated_images.review_labels()

Note

To exit, type “exit”, “end”, or “stop” into the text box.

Prepare datasets and dataloaders

Before using your annotated images to train your model, you will first need to:

1. Split your annotated images into “train”, “val” and, optionally, “test” datasets.

By default, when creating your “train”, “val” and “test” datasets, MapReader will split your annotated images as follows:

  • 70% train

  • 15% validate

  • 15% test

This is done using a stratified method, such that each dataset contains approximately the same proportions of each target label.

2. Define some transforms which will be applied to your images to ensure they are in the right format.

Some default image transforms, generated using torchvision’s transforms module, are predefined in the PatchDataset class.

You can access these through the transform attribute of any dataset, or look them up in the PatchDataset API documentation.

3. Create dataloaders which can be used to load small batches of your dataset during training/inference and apply the transforms to each image in the batch.

In many cases, you will want to create batches which are approximately representative of your whole dataset. This requires a sampler with weights inversely proportional to the number of instances of each label within each dataset.

By default, MapReader creates a sampler with weights inversely proportional to the number of instances of each label within the “train” dataset.

Using a sampler to create representative batches is particularly important for imbalanced datasets (i.e. those which contain different numbers of each label).
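To make the idea of inverse-frequency weights concrete, here is a minimal sketch in plain Python. The label indices are made up, and the code only shows how such weights could be computed; it is not MapReader's own sampler code. Weights of this kind could, for example, be passed to PyTorch's WeightedRandomSampler.

```python
from collections import Counter

# Hypothetical label indices for an imbalanced "train" dataset
train_label_indices = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]

label_counts = Counter(train_label_indices)

# One weight per label, inversely proportional to how often it occurs
label_weights = {label: 1 / count for label, count in label_counts.items()}

# One weight per sample, which is the form a weighted sampler expects
sample_weights = [label_weights[label] for label in train_label_indices]

# Summed over its samples, every label now carries the same total weight,
# so rare labels are not swamped by common ones when batches are drawn
total_weight_per_label = {
    label: label_weights[label] * count for label, count in label_counts.items()
}
print(total_weight_per_label)
```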

To split your annotated images and create your dataloaders, use:

dataloaders = annotated_images.create_dataloaders()

By default, this will split your annotated images using the default train:val:test ratios and apply the default image transforms to each by calling the create_datasets method. It will then create a dataloader for each dataset, using a batch size of 16 and the default sampler.

To change the batch size used when creating your dataloaders, use the batch_size argument:

#EXAMPLE
dataloaders = annotated_images.create_dataloaders(batch_size=24)

If you would like to use custom settings when creating your datasets, you should call the create_datasets method directly instead of via the create_dataloaders method. You should then run the create_dataloaders method afterwards to create your dataloaders as before.

For example, to change the ratios used to split your annotations, you can specify frac_train, frac_val and frac_test:

#EXAMPLE
annotated_images.create_datasets(frac_train=0.6, frac_val=0.3, frac_test=0.1)
dataloaders = annotated_images.create_dataloaders()

This will result in a split of 60% (train), 30% (val) and 10% (test).
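The following is a minimal sketch of how a stratified split with these fractions can work, written in plain Python. The stratified_split function, its seed argument and the example labels are hypothetical; MapReader's own implementation may differ.

```python
import random
from collections import defaultdict


def stratified_split(label_indices, frac_train=0.7, frac_val=0.15, frac_test=0.15, seed=0):
    """Return (train, val, test) lists of sample positions, split label by label."""
    if abs(frac_train + frac_val + frac_test - 1.0) > 1e-9:
        raise ValueError("Fractions must sum to 1")

    rng = random.Random(seed)
    positions_by_label = defaultdict(list)
    for position, label in enumerate(label_indices):
        positions_by_label[label].append(position)

    train, val, test = [], [], []
    for positions in positions_by_label.values():
        # Shuffle within each label, then cut the same fractions from every label
        rng.shuffle(positions)
        n_train = round(frac_train * len(positions))
        n_val = round(frac_val * len(positions))
        train += positions[:n_train]
        val += positions[n_train:n_train + n_val]
        test += positions[n_train + n_val:]
    return train, val, test


# Hypothetical dataset: 80 patches labelled 0 and 20 patches labelled 1
label_indices = [0] * 80 + [1] * 20
train, val, test = stratified_split(
    label_indices, frac_train=0.6, frac_val=0.3, frac_test=0.1
)
print(len(train), len(val), len(test))  # 60 30 10
```

Each subset keeps the original 80:20 ratio between the two labels, which is what "stratified" refers to.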

Advanced usage

Other arguments you may want to specify when creating your datasets include:

  • train_transform, val_transform and test_transform - By default, these are set to “train”, “val” and “test” respectively and so the default image transforms for each of these sets are applied to the images. You can define your own transforms, using torchvision’s transforms module, and apply these to your datasets by specifying the train_transform, val_transform and test_transform arguments.

  • context_dataset - By default, this is set to False and so only the patches themselves are used as inputs to the model. Setting context_dataset=True will result in datasets which return both the patches and their context as inputs for the model.

Train

Initialize ClassifierContainer

To initialize your ClassifierContainer for training, you will need to define:

  • model - The model (classifier) you would like to train.

  • labels_map - A dictionary mapping your label indices to their labels (e.g. {0: "no_railspace", 1: "railspace"}). If you have loaded annotations using the method above, you can find your labels map at annotated_images.labels_map.

  • dataloaders - The dataloaders containing your train, test and val datasets.

  • device - The device you would like to use for training (e.g. "cuda", "mps" or "cpu").

There are a number of options for the model argument:

1. To load a model from torchvision.models, pass one of the model names as the model argument.

e.g. To load “resnet18”, pass "resnet18" as the model argument:

#EXAMPLE
import torch
from mapreader import ClassifierContainer

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

my_classifier = ClassifierContainer("resnet18", annotated_images.labels_map, dataloaders, device=device)

By default, this will load a pretrained form of the model and reshape the last layer to output the same number of nodes as labels in your dataset. You can load an untrained model by specifying pretrained=False.

2. To load a customized model, define a torch.nn.Module and pass this as the model argument.

e.g. To load a pretrained “resnet18” and reshape the last layer:

#EXAMPLE
import torch

from torchvision import models
from torch import nn

from mapreader import ClassifierContainer

my_model = models.resnet18(pretrained=True)

# reshape the final layer (FC layer) of the neural network to output the same number of nodes as labels in your dataset
num_input_features = my_model.fc.in_features
my_model.fc = nn.Linear(num_input_features, len(annotated_images.labels_map))

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

my_classifier = ClassifierContainer(my_model, annotated_images.labels_map, dataloaders, device=device)

This is equivalent to passing model="resnet18" (as above) but further customizations are, of course, possible. See here for more details of how to do this.

3. To load a locally-saved model, use torch.load to load your file and then pass this as the model argument.

If you have already trained a model using MapReader, your outputs, by default, should be saved in a directory called models. Within this directory will be checkpoint_X.pkl and model_checkpoint_X.pkl files. Your models are saved in the model_checkpoint_X.pkl files.

e.g. To load one of these files:

#EXAMPLE
import torch

from mapreader import ClassifierContainer

my_model = torch.load("./models/model_checkpoint_6.pkl")

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

my_classifier = ClassifierContainer(my_model, annotated_images.labels_map, dataloaders, device=device)

4. To load a Hugging Face model, pass the model’s repository ID as a string and set huggingface=True.

MapReader will automatically download the model and its corresponding image processor from the Hugging Face Hub using the transformers library.

e.g. This model is based on our gold standard dataset. It can be loaded directly like this:

#EXAMPLE
import torch
from mapreader import ClassifierContainer

my_model = "davanstrien/autotrain-mapreader-5000-40830105612"

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

my_classifier = ClassifierContainer(my_model, annotated_images.labels_map, dataloaders, device=device, huggingface=True)

Note

You will need to install the transformers library to do this (pip install transformers).

e.g. This model is an example of one which uses the timm library. It can be loaded as follows:

#EXAMPLE
import timm
import torch

from mapreader import ClassifierContainer

my_model = timm.create_model("hf_hub:timm/resnest101e.in1k", pretrained=True, num_classes=len(annotated_images.labels_map))

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

my_classifier = ClassifierContainer(my_model, annotated_images.labels_map, dataloaders, device=device)

Note

You will need to install the timm library to do this (pip install timm).

Define loss function, optimizer and scheduler

In order to train/fine-tune your model, you will need to define:

1. A loss function - This works out how well your model is performing (the “loss”).

To add a loss function, use add_loss_fn. Already implemented options are “cross-entropy”, “binary cross-entropy” and “mean squared error”. You can pass these as strings:

#EXAMPLE
my_classifier.add_loss_fn("cross-entropy")

In this example, we have used PyTorch’s cross-entropy loss function as our loss function. You should change this to suit your needs.

2. An optimizer - This works out how much to adjust your model parameters by after each training cycle (“epoch”).

The initialize_optimizer method is used to add an optimizer to your ClassifierContainer (my_classifier):

my_classifier.initialize_optimizer()

The optim_type argument can be used to select the optimization algorithm. By default, this is set to “adam”, one of the most commonly used algorithms. You should change this to suit your needs.

The params2optimize argument can be used to select which parameters to optimize during training. By default, this is set to "default", meaning that all trainable parameters will be optimized.

When training/fine-tuning your model, you can either use one learning rate for all layers in your neural network or define layerwise learning rates (i.e. different learning rates for each layer in your neural network). Normally, when fine-tuning pre-trained models, layerwise learning rates are favoured, with smaller learning rates assigned to the first layers and larger learning rates assigned to later layers.

To define a list of parameters to optimize within each layer, with learning rates defined for each parameter, use:

#EXAMPLE
params2optimize = my_classifier.generate_layerwise_lrs(min_lr=1e-4, max_lr=1e-3)

By default, a linear function is used to distribute the learning rates (using min_lr for the first layer and max_lr for the last layer). This can be changed to a logarithmic function by specifying spacing="geomspace":

#EXAMPLE
params2optimize = my_classifier.generate_layerwise_lrs(min_lr=1e-4, max_lr=1e-3, spacing="geomspace")

You should then pass your params2optimize list to the initialize_optimizer method:

my_classifier.initialize_optimizer(params2optimize=params2optimize)

3. A scheduler - This defines how to adjust your learning rates during training.

To add a scheduler, use the initialize_scheduler method:

my_classifier.initialize_scheduler()
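To illustrate the difference between the linear and geometric spacing of layerwise learning rates described in step 2, here is a minimal sketch in plain Python. The layerwise_lrs function is hypothetical, and the name "linspace" for the linear option is an assumption borrowed from NumPy; only "geomspace" is named in this guide.

```python
def layerwise_lrs(num_layers, min_lr, max_lr, spacing="linspace"):
    """Return one learning rate per layer, from min_lr (first) to max_lr (last)."""
    if num_layers == 1:
        return [max_lr]
    # Position of each layer between 0 (first layer) and 1 (last layer)
    steps = [i / (num_layers - 1) for i in range(num_layers)]
    if spacing == "linspace":
        # Equal additive steps between consecutive layers
        return [min_lr + step * (max_lr - min_lr) for step in steps]
    if spacing == "geomspace":
        # Equal multiplicative steps between consecutive layers
        return [min_lr * (max_lr / min_lr) ** step for step in steps]
    raise ValueError(f"Unknown spacing: {spacing}")


linear = layerwise_lrs(4, min_lr=1e-4, max_lr=1e-3)
geometric = layerwise_lrs(4, min_lr=1e-4, max_lr=1e-3, spacing="geomspace")

print([f"{lr:.2e}" for lr in linear])     # ['1.00e-04', '4.00e-04', '7.00e-04', '1.00e-03']
print([f"{lr:.2e}" for lr in geometric])  # ['1.00e-04', '2.15e-04', '4.64e-04', '1.00e-03']
```

With geometric spacing, the early layers stay closer to min_lr than they do with linear spacing.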

Train/fine-tune your model

To begin training/fine-tuning your model, use:

my_classifier.train()

By default, this will run through 25 training iterations. Each iteration will pass one epoch of training data (forwards step), adjust the model parameters (backwards step) and then calculate the loss using your validation dataset. The model with the least loss will then be saved in a newly created ./models directory.

The num_epochs argument can be specified to change the number of training iterations (i.e. passes through your training dataset).

e.g. to pass through 10 epochs of training data:

#EXAMPLE
my_classifier.train(num_epochs=10)
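The "keep the model with the least loss" behaviour can be sketched as follows. This is a simplified illustration in plain Python: the validation losses are made up, and a dictionary stands in for the real model parameters. It is not MapReader's training loop.

```python
import copy

# Hypothetical per-epoch validation losses standing in for real training
val_losses = [0.92, 0.71, 0.64, 0.66, 0.58, 0.61]

model_state = {"epoch": None}  # stand-in for real model parameters
best_loss = float("inf")
best_epoch = None
best_state = None

for epoch, val_loss in enumerate(val_losses, start=1):
    # ... the forward and backward passes over the training data would go here ...
    model_state["epoch"] = epoch

    # Keep a copy of the model whenever the validation loss improves
    if val_loss < best_loss:
        best_loss = val_loss
        best_epoch = epoch
        best_state = copy.deepcopy(model_state)

print(best_epoch, best_loss)  # 5 0.58
```

The final epoch is not necessarily the best one: here the loss rises again after epoch 5, so the epoch 5 copy is the one kept.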

Plot metrics

Metrics are stored in a nested dictionary accessible via the metrics attribute. To list the metrics available, use:

my_classifier.list_metrics()

To help visualize the progress of your training, metrics can be plotted using the plot_metric method.

e.g. to plot the loss for all phases:

#EXAMPLE
my_classifier.plot_metric(
    metrics="loss",
)

By default, plot_metric will plot metrics for all phases. If instead you’d like to plot metrics for just one phase, or for specific phases, you can pass the phases argument:

e.g. to plot the loss for the “train” phase only:

#EXAMPLE
my_classifier.plot_metric(
    metrics="loss",
    phases="train",
)

To plot multiple metrics at once, pass a list of metrics to the metrics argument:

e.g. to plot the precision, recall and f-scores for all phases:

#EXAMPLE
my_classifier.plot_metric(
    metrics=["precision_micro", "recall_micro", "fscore_micro"],
)

Testing

The “test” dataset can be used to test your model. This can be done using the inference method:

my_classifier.inference(set_name="test")

To see a sample of your predictions, use:

my_classifier.show_inference_sample_results(label="railspace")

Note

This will show you the transformed images which may look weird to the human eye.

By default, the show_inference_sample_results method will show you six samples of your “test” dataset. To change the number of samples shown, specify the num_samples argument.

It can be useful to see instances where your model is struggling to classify your images. This can be done using the min_conf and max_conf arguments.

e.g. To view samples where the model is less than 80% confident about its prediction:

#EXAMPLE
my_classifier.show_inference_sample_results("railspace", max_conf=80)

This can help you identify images that might need to be brought into your training data for further optimization of your model.
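As an illustration of filtering by confidence, the sketch below selects predictions for one label within a confidence range. The filter_by_confidence helper and the example predictions are hypothetical, and the sketch assumes confidences are stored between 0 and 1 while the thresholds are given as percentages, as in the example above.

```python
# Hypothetical predictions: (patch id, predicted label, confidence between 0 and 1)
predictions = [
    ("patch_a", "railspace", 0.97),
    ("patch_b", "railspace", 0.62),
    ("patch_c", "no", 0.55),
    ("patch_d", "railspace", 0.79),
]


def filter_by_confidence(predictions, label, min_conf=None, max_conf=None):
    """Return patch ids predicted as `label` with confidence (in %) in the given range."""
    kept = []
    for patch_id, predicted_label, conf in predictions:
        percent = conf * 100
        if predicted_label != label:
            continue
        if min_conf is not None and percent < min_conf:
            continue
        if max_conf is not None and percent > max_conf:
            continue
        kept.append(patch_id)
    return kept


uncertain = filter_by_confidence(predictions, "railspace", max_conf=80)
print(uncertain)  # ['patch_b', 'patch_d']
```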

By default, when using your model for inference, metrics will not be added to your ClassifierContainer's metrics attribute. Instead, they must be added using the calculate_add_metrics method.

e.g. to add metrics for the ‘test’ dataset:

#EXAMPLE
my_classifier.calculate_add_metrics(
    y_true=my_classifier.gt_label_indices,
    y_pred=my_classifier.pred_label_indices,
    y_score=my_classifier.pred_conf,
    phase="test",
)

You can then use the list_metrics method to see the metrics calculated for the “test” dataset.

my_classifier.list_metrics(phases="test")

Metrics from this inference can then be viewed using:

my_classifier.metrics["test"]["metric_to_view"]

e.g. to view the Area Under the Receiver Operating Characteristic Curve (ROC AUC) macro metric:

my_classifier.metrics["test"]["rocauc_macro"]

e.g. to view the f-score for each class in your labels map:

for label_id, label_name in annotated_images.labels_map.items():
    print(label_name, my_classifier.metrics["test"]["fscore_"+str(label_id)])
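For readers unfamiliar with per-class f-scores, here is a minimal sketch of how they can be computed from true and predicted label indices in plain Python. The per_class_fscores helper and the example data are hypothetical; the "fscore_" key prefix simply mirrors the naming used above.

```python
def per_class_fscores(y_true, y_pred, label_ids):
    """Compute the F1 score of each label, treating it as the positive class."""
    scores = {}
    for label_id in label_ids:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label_id and p == label_id for t, p in pairs)
        fp = sum(t != label_id and p == label_id for t, p in pairs)
        fn = sum(t == label_id and p != label_id for t, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the label never occurs
        scores[f"fscore_{label_id}"] = 2 * tp / denominator if denominator else 0.0
    return scores


# Hypothetical ground truth and predictions for a two-label problem
labels_map = {0: "no", 1: "railspace"}
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1]

fscores = per_class_fscores(y_true, y_pred, labels_map.keys())
print({name: round(score, 3) for name, score in fscores.items()})
# {'fscore_0': 0.857, 'fscore_1': 0.8}
```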

Saving your work

Each time you train your model, MapReader will save the best version of your model (that with the least loss) in the ./models/ directory.

If you would like to explicitly save your work, use:

my_classifier.save("file_name.pkl")

This will save both your ClassifierContainer and your model as pickle files.

e.g. :

#EXAMPLE
my_classifier.save("classifier.pkl")

This will save your ClassifierContainer as classifier.pkl and your model as model_classifier.pkl.

Infer (predict)

Once you are happy with your model’s predictions, you can then use it to predict labels on the rest of your (unannotated) patches.

To do this, you will need to create a new dataset containing your patches:

from mapreader import PatchDataset

infer = PatchDataset("./patch_df.csv", delimiter=",", transform="test")

Note

You should have created this CSV file using the convert_images(save=True) method on your MapImages object (follow the instructions in the Load user guidance). This could also be a GeoJSON file.

The transform argument is used to specify which image transforms to use on your patch images. See this section for more information on transforms.

You should then add this dataset to your ClassifierContainer (my_classifier):

my_classifier.load_dataset(infer, set_name="infer")

This command will create a DataLoader from your dataset and add it to your ClassifierContainer's dataloaders attribute.

By default, the load_dataset method will create a dataloader with batch size of 16 and will not use a sampler. You can change these by specifying the batch_size and sampler arguments respectively. See this section for more information on samplers.

After loading your dataset, you can then simply run the inference method to infer the labels on the patches in your dataset:

my_classifier.inference(set_name="infer")

As with the “test” dataset, to see a sample of your predictions, use:

my_classifier.show_inference_sample_results(label="railspace", set_name="infer")

Save predictions

To save your predictions, use the save_predictions method. e.g. to save your predictions on the “infer” dataset:

my_classifier.save_predictions(set_name="infer")

Add predictions to metadata and save

To add your predictions to your patch metadata (saved in patch_df.csv), you will need to load your predictions as metadata in the MapImages object.

To do this, you will need to create a new MapImages object and load in your patches and parent images:

from mapreader import load_patches

my_maps = load_patches(patch_paths="./path/to/patches/*png", parent_paths="./path/to/parents/*png")

You can then add your predictions to the metadata using the add_metadata method:

my_maps.add_metadata("path_to_predictions_patch_df.csv", tree_level='patch') # add dataframe as metadata

For example, to load the predictions for the “infer” dataset:

#EXAMPLE
my_maps.add_metadata("./infer_predictions_patch_df.csv", tree_level='patch')
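Conceptually, adding predictions as patch-level metadata amounts to joining two tables on the patch identifier. The sketch below shows this with plain dictionaries; the patch names and the "parent_id" and "predicted_label" fields are hypothetical, and only the "conf" column name is taken from this guide.

```python
# Hypothetical patch metadata and predictions, both keyed by patch file name
patch_metadata = {
    "patch-0-0-100-100.png": {"parent_id": "map_1.png"},
    "patch-100-0-200-100.png": {"parent_id": "map_1.png"},
}
predictions = {
    "patch-0-0-100-100.png": {"predicted_label": "railspace", "conf": 0.93},
}

# Attach the prediction fields to the matching patch;
# patches without a prediction are left unchanged
for patch_id, prediction in predictions.items():
    if patch_id in patch_metadata:
        patch_metadata[patch_id].update(prediction)

print(patch_metadata["patch-0-0-100-100.png"])
# {'parent_id': 'map_1.png', 'predicted_label': 'railspace', 'conf': 0.93}
```

Once the predictions sit alongside the other patch metadata, a column such as "conf" can be used for plotting.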

From here, you can use the show_patches method to visualize your predictions on the parent images as shown in the Load user guide:

my_maps.add_shape()

parent_list = my_maps.list_parents()
my_maps.show_patches(
    parent_list[0],
    column_to_plot="conf",
    vmin=0,
    vmax=1,
    alpha=0.5
)

Or, if your maps are georeferenced, you can use the explore_patches method instead:

my_maps.explore_patches(
    parent_list[0],
    column_to_plot="conf",
    xyz_url="https://geo.nls.uk/mapdata3/os/6inchfirst/{z}/{x}/{y}.png",
    vmin=0,
    vmax=1,
)

Refer to the Load user guidance for further details on how these methods work.