Transfer Learning for Medical Vision
In an interview with wired.com, Andrew Ng neatly summarized one of the key challenges of Deep Learning by drawing an analogy with launching rockets:
I think AI is akin to building a rocket ship. You need a huge engine and a lot of fuel. If you have a large engine and a tiny amount of fuel, you won’t make it to orbit. If you have a tiny engine and a ton of fuel, you can’t even lift off. To build a rocket you need a huge engine and a lot of fuel.
The analogy to deep learning is that the rocket engine is the deep learning model and the fuel is the huge amount of data we can feed to these algorithms.
Not Enough Data
While there has been a lot of progress recently in Medical Vision (Medical Imaging Analysis, Computer-aided Detection, etc.) and Personalized Medicine, owing to tools and techniques from Deep Learning, most of it remains unavailable to the general medical practitioner: neither the trained models nor the huge amounts of data needed to train them are openly shared, which slows their adoption from research labs to medical clinics.
This is unlike generic models such as AlexNet, VGGNet, or InceptionNet, which were openly shared and in turn fueled the race for better techniques and models.
Transfer Learning?
Here is a slide from one of Andrej Karpathy’s talks:
That is some sound advice!
He is suggesting Transfer Learning, in which we either
- use a model pre-trained on a large dataset as a fixed feature extractor (by removing the last fully-connected layer, or more), or
- fine-tune the weights of the pre-trained network by continuing the back-propagation (both options are sketched in the code below).
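To make the two options concrete, here is a minimal PyTorch sketch. The specifics are placeholders rather than prescriptions: the ResNet-18 backbone, the hypothetical 5-class medical task, and the learning rates are all assumptions for illustration.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load a network pre-trained on ImageNet (standing in for a shared
# "MedicalNet"; the 5-class head below is a hypothetical medical task).
model = models.resnet18(pretrained=True)

# Option 1: fixed feature extractor -- freeze all pre-trained weights,
# then replace the last fully-connected layer with a fresh trainable one.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 5)  # new head is trainable by default

# Only the new head's parameters receive gradient updates.
optimizer = optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)

# Option 2: fine-tuning -- skip the freezing loop and optimize everything,
# typically with a smaller learning rate so back-propagation nudges
# (rather than wipes out) the pre-trained weights:
# optimizer = optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```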
Big impact with little data
The need of the hour is widely shared, open MedicalNets: deep neural nets trained and tuned on a massive number of images for general tasks, which can serve as baselines for the many new nets that the community builds for specific tasks.
For example, images of chest x-rays often aid in the detection of a wide variety of ailments:
- Tuberculosis
- Pneumonia
- Heart failure
- Lung cancer
- Lung tissue scarring (Sarcoidosis)
Building a dedicated deep neural net from scratch for detecting each of these diseases would consume a great deal of training time and effort. On the other hand, once we train a large net (one with a lot of parameters) on a large dataset for one of these diseases, it can be repurposed for the others, at least fueling conversation and investigation around models for those diseases. A sketch of such repurposing follows below.
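As a sketch of that repurposing, under entirely hypothetical assumptions (the shared tuberculosis checkpoint, its file name, and the binary classification heads are all invented for illustration), retargeting a chest x-ray model for a second disease might look like this:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Hypothetical: a DenseNet-121 trained to detect tuberculosis on chest
# x-rays has been shared as a checkpoint ("tb_chest_xray.pth" is invented).
model = models.densenet121()
model.classifier = nn.Linear(model.classifier.in_features, 2)  # TB vs. normal
model.load_state_dict(torch.load("tb_chest_xray.pth"))

# Repurpose it for pneumonia detection: swap in a fresh head and fine-tune
# on a much smaller pneumonia dataset. The convolutional features already
# learned from chest x-rays should transfer across these tasks.
model.classifier = nn.Linear(model.classifier.in_features, 2)  # pneumonia vs. normal
optimizer = optim.Adam(model.parameters(), lr=1e-4)
```

The point of the sketch is how little new machinery the second disease needs: one new layer and a fine-tuning run, instead of a full training pipeline.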
Call to Action
Deep Learning has the potential to transform the way healthcare helps us today, but it needs tons of data and models to do that. Lack of access to data and models will hinder how these benefits flow from research labs to the common man, especially in developing countries.
Big Corporations and Medical Institutions, which have access to large medical datasets, should reimagine how they can benefit from sharing their datasets and models with the community (the Kaggle model seems to have been very successful). We also need a marquee dataset like ImageNet (with a permissive license, the one thing missing from Kaggle competitions) and a competition like ILSVRC in medical imaging, which could bolster innovation in this space.
This becomes even more important for a developing country like India, whose ethnic groups will otherwise be left out of the fantastic innovations in medical vision and personalized medicine currently taking place. Even if we are not doing cutting-edge research here, we can at least participate by contributing meaningful datasets and letting others build models that account for our diverse ethnic groups in use-cases where it matters.
Currently I am putting together a list of openly available large medical image datasets, which I'll share here once it is usefully large. If you know of any such datasets that could make a difference, please leave a comment or tweet me at @dataBiryani.
Please share this with all your Medium friends and hit that clap button below to spread it around even more. Also add any other tips or tricks that I might have missed in the comments below!