Why Deep Learning is Now Easy for Data Scientists
Deep learning has seeped into every nook and cranny of modern life. From virtual assistants like Siri and Cortana to chatbots, service bots, and personalized entertainment, it is already part of our daily routines.
Deep learning is powerful because it makes difficult things easy, so it is no wonder that it has become one of the important tools for data scientists. Until recently, however, using it effectively required a data science team to have all of the following:
- Access to large datasets
- Funds for large-scale model training
- A novel model architecture designed in-house
These conditions restricted further progress in deep learning, limiting it to the projects that could meet them. But things have changed over the last few years. Teams have recently started launching a new generation of products built specifically on deep learning, and unlike before, these products aren’t limited to a single kind of model architecture.
The major driver behind this development is transfer learning.
Transfer learning model: how it functions
Transfer learning is a machine learning technique in which a model developed for one task is reused as the starting point for a model on a second task. It lets developers take a neural network trained for one purpose and adapt it to a different domain. This differs from the traditional approach, in which each model is trained from scratch for a single task. Simply put, an existing model is used to jumpstart development on a new problem.
Take image recognition as an example. Let’s say you want to identify horses, but no publicly available model does an adequate job. With transfer learning, you can begin with an existing convolutional neural network that is already used to recognize other animals and then fine-tune it on images of horses. Transfer learning isn’t limited to image recognition, either. A recurrent neural network used in speech recognition can take advantage of it in the same way.
How does it function?
In transfer learning, knowledge learned from one or more source tasks is transferred and used to improve learning on a related target task. Traditional machine learning algorithms, by contrast, are designed to address a single task in isolation.
The first step is to train a base network on a base dataset and task. The learned features are then repurposed, or transferred, to a second target network that is trained on the target dataset and task. This works well when the features are general, that is, suitable for both the base and target tasks rather than specific to the base task.
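The process of training a base network and then transferring its features can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a production recipe: the 2-D points, the thresholds 0.0 and 0.3, and helper names such as make_data, features, and train are all hypothetical, and a tiny one-hidden-layer network stands in for a real pre-trained CNN. A base network is first trained on a larger base dataset. Its feature layer is then copied into a target network and frozen, and only a fresh output head is trained on a small target dataset.

```python
import math
import random

random.seed(0)  # make the toy run repeatable

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def features(W, x):
    # Hidden layer: one tanh unit per row of W; the last entry of a row is its bias.
    return [math.tanh(row[0] * x[0] + row[1] * x[1] + row[2]) for row in W]

def head_output(head, h):
    # Output head: weighted sum of the features plus a bias (last entry of head).
    return sigmoid(sum(v * hi for v, hi in zip(head, h)) + head[-1])

def predict(W, head, x):
    return head_output(head, features(W, x))

def train(W, head, data, lr, epochs, freeze_features):
    # Per-sample gradient descent on log loss; W is left untouched when frozen.
    for _ in range(epochs):
        for x, y in data:
            h = features(W, x)
            err = head_output(head, h) - y  # gradient w.r.t. the pre-sigmoid output
            if not freeze_features:
                for j, row in enumerate(W):
                    # Backpropagate through tanh: d tanh(a)/da = 1 - tanh(a)^2
                    g = err * head[j] * (1.0 - h[j] ** 2)
                    row[0] -= lr * g * x[0]
                    row[1] -= lr * g * x[1]
                    row[2] -= lr * g
            for j in range(len(h)):
                head[j] -= lr * err * h[j]
            head[-1] -= lr * err

def make_data(n, threshold):
    # Hypothetical toy task: label is 1 when x1 + x2 exceeds the threshold.
    data = []
    for _ in range(n):
        x = (random.uniform(-1, 1), random.uniform(-1, 1))
        data.append((x, 1.0 if x[0] + x[1] > threshold else 0.0))
    return data

def accuracy(W, head, data):
    hits = sum((predict(W, head, x) > 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)

HIDDEN = 3

# Step 1: train the base network (features and head) on the larger base dataset.
W_base = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(HIDDEN)]
head_base = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN + 1)]
train(W_base, head_base, make_data(400, 0.0), lr=0.1, epochs=30, freeze_features=False)

# Step 2: copy the learned feature layer into the target network.
W_target = [row[:] for row in W_base]
head_target = [0.0] * (HIDDEN + 1)  # fresh head for the new task

# Step 3: train only the new head on the small target dataset (features frozen).
train(W_target, head_target, make_data(20, 0.3), lr=0.1, epochs=200, freeze_features=True)

base_acc = accuracy(W_base, head_base, make_data(200, 0.0))
target_acc = accuracy(W_target, head_target, make_data(200, 0.3))
print(f"base task accuracy: {base_acc:.2f}")
print(f"target task accuracy: {target_acc:.2f}")
```

Because the transferred layer is frozen, the target task only has to fit the handful of head parameters, which is why a small dataset can be enough. How well this works depends on how general the base features are, so the printed accuracies will vary with the data and settings.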
How transfer learning is becoming the key to the next generation of machine learning-powered software
Machine learning and deep learning can be used effectively only under favorable conditions: access to a large, clean dataset, the ability to design an effective model, and the means to train it.
This meant that projects in domains lacking these resources were, by default, not feasible. With transfer learning, these roadblocks can be removed.
How models can now be trained in minutes and not days
Training a model on a huge amount of data doesn’t just require a large dataset; it also costs time and resources. For instance, when Google developed its image classification model Xception, it trained two versions: one on the ImageNet dataset, with 14 million images, and one on the JFT dataset, with 350 million images.
Even with 60 NVIDIA K80 GPUs and various optimizations, a single ImageNet experiment took three days to run, and the JFT experiments took more than a month. Now that the pre-trained Xception model has been released, however, teams can fine-tune their own versions much faster.
A team at the University of Illinois and Argonne National Laboratory recently trained a model to classify images of galaxies as spiral or elliptical. Although their dataset contained only 35,000 labeled images, they were able to fine-tune Xception in 8 minutes with the help of NVIDIA GPUs. The resulting model identifies galaxies with a 99.8 percent success rate at a superhuman rate of 20,000 galaxies per minute when served on GPUs.
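To put those figures side by side, here is a short back-of-the-envelope calculation that uses only the numbers quoted above. It compares wall-clock time only. The two runs used different hardware (60 K80 GPUs versus an unspecified number of NVIDIA GPUs) and different dataset sizes, so the ratio is a rough illustration rather than a like-for-like benchmark.

```python
# Figures quoted in the article.
imagenet_training_days = 3
finetune_minutes = 8
galaxies_per_minute = 20_000

# Convert the ImageNet run to minutes so both durations share a unit.
imagenet_training_minutes = imagenet_training_days * 24 * 60
speedup = imagenet_training_minutes / finetune_minutes
galaxies_per_second = galaxies_per_minute / 60

print(f"ImageNet training run: {imagenet_training_minutes} minutes")          # 4320 minutes
print(f"Wall-clock ratio vs. fine-tuning: {speedup:.0f}x")                    # 540x
print(f"Inference throughput: {galaxies_per_second:.0f} galaxies per second") # 333
```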
A small dataset will no longer be a deal-breaker
Typically, deep learning needs a large amount of labeled data, and in many domains such data simply does not exist. Transfer learning can help fix this. For instance, a team affiliated with Harvard Medical School recently deployed a model that predicts long-term mortality, including noncancer deaths, from chest radiographs. Despite not having the data necessary to train a CNN (convolutional neural network) from scratch, they managed by using a pre-trained Inception-v4 model with transfer learning and a few architectural modifications to adapt the model to their dataset.
Down the line, ML engineers will only need to worry about putting these machine learning models into production.
Originally published at https://www.lifeandexperiences.com on July 14, 2020.