How to get started with Machine Learning in the Cloud. All the models you need to know to help you get started.
Though Machine Learning is exciting, there’s one aspect that can be, to put it lightly, tedious. I speak, of course, of the time it takes to train your models. While a model trains on your computer, you’re not only blocked from moving on to “what comes next”, you’re also stuck with a computer that is now slow, heating up, and making hissing noises as every fan spins to the max. Training can take hours or even days, and the sad truth is that the result sometimes needs tweaking. Then your computer is off to the races again, making you long for the days when your computer would compile for minutes and that was considered “long”.
Fortunately, this problem exists today, at a time when one thing you can easily buy is computation as a service. That, in essence, is the main benefit of training, storing, and authorizing ML models in the cloud. Even if you’re reading this on a powerful machine with two graphics cards, it still makes sense to consider how you would scale training beyond a local machine. Training a model in the cloud comes with all the benefits of other cloud-hosted software, and you can keep your primary machine running smoothly. Let’s go through how easy it is to train your ML model in the cloud. For this example, we’ll train an image classification model.
Today we’ll be using NVIDIA DIGITS to train our model in the cloud. We’ll be using this service because it can train and export Caffe, Torch, and TensorFlow models, and it even offers popular pre-trained models, all wrapped in a beginner-friendly user experience. One key aspect of DIGITS that helps us here is that it can run locally, in a cloud that you manage, or even in the AWS cloud, so you have fine-grained control over how hands-off you’d like to be with the service. For complete beginners, it’s probably best to purchase the service from AWS Marketplace and be done. The instructions for purchasing it are beyond the scope of this article, but the process should be very easy.
Always remember, if you’re using AWS Marketplace, it’s important to keep an eye on your machine in the cloud so you don’t rack up a hefty monthly charge. Minimize your cloud footprint, and be sure to turn off your machine when you’re done training a model to keep your monthly bill as low as possible.
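If you’d rather script the “turn it off” step than click through the console, here is a minimal sketch. It assumes you use boto3 (the AWS SDK for Python) with credentials already configured. The helper name `stop_instance` and the instance ID in the comment are placeholders of mine, not anything DIGITS provides.

```python
def stop_instance(ec2_client, instance_id):
    """Ask EC2 to stop one instance and return the state it reports."""
    # stop_instances takes a list of IDs; we pass just the one machine.
    response = ec2_client.stop_instances(InstanceIds=[instance_id])
    # Each entry carries the previous and current state of an instance.
    return response["StoppingInstances"][0]["CurrentState"]["Name"]


# Typical use (requires boto3 and configured AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(stop_instance(ec2, "i-0123456789abcdef0"))  # placeholder ID
```

Keep in mind that a stopped instance no longer bills for compute, but any attached storage volumes still cost money until you delete them.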
Once you have your GPU-fueled DIGITS setup on the cloud of your choice, you can easily access the web interface. A very clean Bootstrap-style application is ready for training.
We’ll start by creating a Dataset of the images. If DIGITS is running in the cloud, you’ll need to SFTP into your machine and upload your images. If you’re unfamiliar with SFTP, you can generally get things set up quickly with a universal tool like Cyberduck. If you used AWS, there’s a .pem key file you can use to access your instance. Make sure your files are placed in folders that match their categories, because the folder names become the classification names for the dataset. If you want to separate your test set (generally 10% of your images) into its own folder, you can do so now. We will show you how to configure the test and validation sets if you are managing those manually.
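If you do want to carve out that test set by hand before uploading, a short script can do it. This is only a sketch under my own assumptions: one sub-folder per category under a training folder, a 10% hold-out, and a hypothetical helper called `hold_out_test_set`. DIGITS can also split the data for you, so treat this as optional.

```python
import random
import shutil
from pathlib import Path


def hold_out_test_set(train_dir, test_dir, fraction=0.10, seed=0):
    """Move a fraction of each category folder's files into a test folder."""
    train_dir, test_dir = Path(train_dir), Path(test_dir)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    moved = {}
    for class_dir in sorted(p for p in train_dir.iterdir() if p.is_dir()):
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        count = int(len(images) * fraction)  # rounds down
        # Mirror the category folder name so the labels stay the same.
        target = test_dir / class_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for image in rng.sample(images, count):
            shutil.move(str(image), str(target / image.name))
        moved[class_dir.name] = count
    return moved
```

Calling `hold_out_test_set("images/train", "images/test")` (both paths are examples) returns how many files were moved per category, which is a quick sanity check before you upload.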
Once your files are in place, you can create a new classification dataset.
Once you’ve selected a new image classification dataset, you can modify the form to fit the files you’ve uploaded. The checkboxes “Separate validation images folder” and “Separate test images folder” reveal additional parameters so you can assign your test and validation sets by hand (as mentioned earlier).
Within seconds of creating your dataset, you’ll get a report with stats and mean data.
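If “mean data” sounds mysterious, it is roughly the per-pixel average of your training images, which is commonly subtracted from each input during training. DIGITS computes it for you. The toy sketch below is my own illustration, using tiny grayscale images written as nested lists, and is not how DIGITS implements it.

```python
def mean_image(images):
    """Average equally sized grayscale images given as lists of pixel rows."""
    height, width = len(images[0]), len(images[0][0])
    totals = [[0.0] * width for _ in range(height)]
    for image in images:
        for y in range(height):
            for x in range(width):
                totals[y][x] += image[y][x]
    # Divide each accumulated pixel by the number of images.
    return [[value / len(images) for value in row] for row in totals]
```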
Your dataset is ready! You can now use it to create a model. Creating a model is much like creating a dataset: use the “New Model” button under Images and choose “Classification”. The resulting form gives us a bunch of parameters that control how our model is created. The top section covers how we will prepare our inputs and which solver type we will use.
Once your model is set up with the proper data and configuration, you can choose a network to build on. DIGITS provides three CNNs to choose from, along with their associated published papers. The two you are most likely to use are AlexNet and GoogLeNet.
AlexNet is an 8-layer network with 5 convolution layers and 3 fully connected layers. Many popular libraries include AlexNet. GoogLeNet is a 22-layer network originally codenamed “Inception”. These are great for general cases, but in some instances you might have a better starting point. You can even load your own custom Caffe model to train on top of. To do so, simply upload your model the same way you uploaded the training data, and then select Custom Network. Whatever your needs, the starting point is easy to configure.
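To make those 5 convolution layers a little more concrete, here is a small sketch that follows the size of the feature map through AlexNet’s convolution and pooling layers. The kernel, stride, and padding values are the commonly cited ones, and the 227-pixel input is the crop size used by the Caffe reference model. These are my assumptions, so check them against the network definition DIGITS actually shows you.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Output width of a convolution or pooling layer on a square input."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, pad) for each layer that changes the spatial size.
ALEXNET_SPATIAL_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def alexnet_feature_map_sizes(input_size=227):
    """Return the feature-map width after each layer listed above."""
    sizes = {}
    size = input_size
    for name, kernel, stride, pad in ALEXNET_SPATIAL_LAYERS:
        size = out_size(size, kernel, stride, pad)
        sizes[name] = size
    return sizes
```

With these values the map shrinks from 227 to 55, 27, 13, and finally 6 pixels wide. With 256 channels in the last convolution layer, that gives 6 × 6 × 256 = 9216 values feeding the first of the 3 fully connected layers.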
All that’s left is to name your model and hit the create button. It’s very impressive how well the DIGITS interface abstracts such a complicated process. It’s even more impressive how fast these models get trained. I was able to get 50 epochs with my data in less than 20 minutes.
When a model has completed training, a report and testing page is presented. This lets you review how the training went and test a single image or your whole dataset quickly and efficiently.
The report page also provides a link for you to download your model.
Congratulations! You’ve just trained your Machine Learning model in the cloud! The benefits abound! Not only are you offloading the intensive portions of your ML creation, but you’re also able to get reports, share your history, and even test your data lightning fast. There are plenty of other ways to do your Machine Learning in the cloud. You can use GPUs and even TPUs to train models quickly. Whichever service you choose is up to you, but one thing is for sure: as Machine Learning grows as a service, we can expect more and more options for effective training without making your laptop so hot it fries your fingers! Now you can take advantage of the cloud for all your ML models.