Curating the data curation tool

Jul 7, 2022 | Newsletter #3

by Advantis Medical Imaging


Curating datasets is not an easy task. It can raise hard-to-solve problems and lead to open-ended discussions before a satisfactory solution is found. In ProCAncer-I, the curation of medical images has been challenging both in terms of computational efficiency and in finding a solution that works across dissimilar images. In this article, we briefly describe some of these challenges, leaving the technical intricacies aside.


First of all, ProCAncer-I’s curation tool performs two functions: inter-volume motion correction of a 4D series and co-registration of a series with a T2w (axial) image. Both functions solve an optimization problem, which can be time-consuming and expensive in terms of computing power. The co-registration function in particular may be used to align 3D images with dissimilar characteristics, e.g. quite different signal intensities, such as a T2w and a DW image, and can therefore take a long time before the underlying image registration algorithm reaches a solution. A very common case is the co-registration of a high b-value DW image, which has very low signal intensity, with a T2w image. In such a case the co-registration could fail, which is also why the process cannot easily be fully automated and instead requires human intervention and inspection of the results. Additionally, tweaking the hyperparameters of the underlying registration algorithm in order to re-run either the motion-correction or the co-registration step may seem intimidating, because these advanced settings require users to understand concepts such as the number of iterations needed to solve an optimization problem, smoothing with a Gaussian kernel, and image downsampling.
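
To give a feel for what these three settings control, here is a minimal toy sketch in Python. It is not the tool's actual algorithm: it aligns two 1D signals by an integer shift, whereas the real tool works on 3D and 4D images. All names (`register_shift`, `shrink_factors`, `sigmas`, `max_iterations`) and all values are assumptions made for illustration. The signals are first smoothed and downsampled so that a rough alignment can be found cheaply, and the estimate is then refined at finer levels.

```python
import math

def gaussian_smooth(signal, sigma):
    """Smooth a 1D signal with a truncated Gaussian kernel (edges are clamped)."""
    if sigma <= 0:
        return list(signal)
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, weights):
            j = min(max(i + k, 0), n - 1)  # clamp indices at the borders
            acc += w * signal[j]
        smoothed.append(acc / total)
    return smoothed

def cost(fixed, moving, shift):
    """Mean squared difference between fixed[i] and moving[i + shift] on the overlap."""
    pairs = [(fixed[i], moving[i + shift])
             for i in range(len(fixed)) if 0 <= i + shift < len(moving)]
    if not pairs:
        return float("inf")
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def register_shift(fixed, moving, shrink_factors=(4, 2, 1),
                   sigmas=(2.0, 1.0, 0.0), max_iterations=20):
    """Coarse-to-fine search for the shift s such that moving[i + s] matches fixed[i].

    Hypothetical toy optimizer: at each level it tries the neighbouring shifts
    and stops when none is better, or after max_iterations steps.
    """
    shift = 0
    for factor, sigma in zip(shrink_factors, sigmas):
        f = gaussian_smooth(fixed, sigma)[::factor]   # smooth, then downsample
        m = gaussian_smooth(moving, sigma)[::factor]
        s = round(shift / factor)                     # carry the estimate to this level
        for _ in range(max_iterations):
            best = min((s - 1, s, s + 1), key=lambda c: cost(f, m, c))
            if best == s:
                break
            s = best
        shift = s * factor
    return shift

# Usage: the same bump, displaced by 13 samples in the moving signal.
fixed = [math.exp(-0.5 * ((i - 40) / 5.0) ** 2) for i in range(128)]
moving = [math.exp(-0.5 * ((i - 53) / 5.0) ** 2) for i in range(128)]
estimated_shift = register_shift(fixed, moving)
```

More iterations let the optimizer travel further at each level, stronger smoothing suppresses noise and fine detail that could trap it, and heavier downsampling makes each step cheaper at the price of a coarser estimate. These trade-offs are why changing such settings calls for some understanding of the underlying optimization.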


In general, the curation functions can be processing-intensive, their internals challenging to fully understand, and their settings intimidating to re-configure in order to improve results. However, various design decisions have been made to improve overall performance and user experience. They include a simple UI, which guides users through the entire curation process step by step; users do not need to make intermediate decisions beyond triggering each curation step and assessing the results. Each curation function uses sane defaults, which may be re-configured in an Advanced Settings menu.
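
The "sane defaults plus Advanced Settings" pattern can be sketched as follows. This is only an illustration of the design idea, not the tool's code: the class name `RegistrationSettings`, the helper `with_overrides`, the field names, and every default value are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class RegistrationSettings:
    """Hypothetical settings object; the values are illustrative, not the tool's defaults."""
    max_iterations: int = 100
    smoothing_sigmas: Tuple[float, ...] = (2.0, 1.0, 0.0)
    shrink_factors: Tuple[int, ...] = (4, 2, 1)

    def __post_init__(self):
        # Reject inconsistent settings early, before a long-running job starts.
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")
        if len(self.smoothing_sigmas) != len(self.shrink_factors):
            raise ValueError("one smoothing sigma is needed per shrink factor")

DEFAULTS = RegistrationSettings()

def with_overrides(**advanced):
    """Return the defaults with only the user-supplied advanced settings changed."""
    return replace(DEFAULTS, **advanced)

# Usage: most users run with DEFAULTS; an advanced user changes a single value.
patient_settings = with_overrides(max_iterations=300)
```

Keeping the defaults immutable and validating every override means that a user who never opens the advanced menu always gets a known-good configuration, while one who does cannot start a run with inconsistent values.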


Furthermore, numerous code optimizations have reduced processing times and improved results. In the future, intelligently re-using previously computed transformations could improve results further, e.g. by exploiting an image’s successful co-registration to also co-register a derived image of low signal intensity, for which co-registration would otherwise fail. In conclusion, the curation of medical images is not always a straightforward task. But our goal is to curate the curation process, too!