10-Minute Tutorials
Start with these free and simple tutorials to explore AWS machine learning services
In this tutorial, you will use the Amazon Polly for WordPress plugin to add text-to-speech capability to a WordPress installation. Amazon Polly is a service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice, enabling you to create applications that talk, and build entirely new categories of speech-enabled products.
Detect, Analyze, and Compare Faces with Amazon Rekognition
In this tutorial, you will learn how to use the face recognition features in Amazon Rekognition using the AWS Console. Amazon Rekognition is a deep learning-based image and video analysis service.
Analyze Video and Extract Rich Metadata with Amazon Rekognition
In this tutorial, you will learn how to use the video analysis features in Amazon Rekognition Video using the AWS Console. Amazon Rekognition Video is a deep learning powered video analysis service that detects activities and recognizes objects, celebrities, and inappropriate content.
Analyze Sentiment in Text with Amazon Comprehend
In this step-by-step tutorial, you will learn how to use Amazon Comprehend for sentiment analysis.
Amazon Comprehend uses machine learning to find insights and relationships in text. Amazon Comprehend provides keyphrase extraction, sentiment analysis, entity recognition, topic modeling, and language detection APIs so you can easily integrate natural language processing into your applications.
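As a minimal sketch of what the sentiment analysis API call looks like in code (assuming boto3 is installed and AWS credentials are configured; the helper function name is ours, not part of the tutorial):

```python
# Minimal sketch of calling Amazon Comprehend's DetectSentiment API via boto3.
# Assumes AWS credentials and region access are configured.
def get_sentiment(text, region="us-east-1"):
    import boto3  # imported inside the function so the module loads without boto3
    comprehend = boto3.client("comprehend", region_name=region)
    resp = comprehend.detect_sentiment(Text=text, LanguageCode="en")
    # resp["Sentiment"] is one of POSITIVE, NEGATIVE, NEUTRAL or MIXED;
    # resp["SentimentScore"] holds the confidence for each label.
    return resp["Sentiment"], resp["SentimentScore"]
```

In an AWS account with Comprehend enabled, `get_sentiment("I love this tutorial!")` would return a label and its score breakdown.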
There are no unlimited free services*, but some offer starting credit or free tiers on initial signup. Here are some suggestions to date:
AWS: If you specifically want deep learning on a large data set, then AWS is probably out - their free tier does not cover machines with enough processing power to tackle deep learning projects.
Google Cloud might do; the starting credit offer is enough to do a little deep learning (for maybe a couple of weeks), although they have signup and tax restrictions.
Azure has a free tier with limited processing and storage options.
Most free offerings appear to follow the "Freemium" model - they give you a limited service that you can learn to use and maybe come to like. However, it is not enough for heavy use (e.g. training an image recogniser or NLP model from scratch) unless you are willing to pay.
The best advice is to shop around for the best starting offer and the best price. A review of services is not suitable here, as it would get out of date quickly and is not a good use of Stack Exchange. But you can find similar questions on Quora and other sites - your best bet is to do a web search for "cloud compute services for deep learning" or similar and expect to spend some time comparing notes. A few specialist deep learning services have popped up recently, such as Nimbix or FloydHub, and there are also the big players such as Azure, AWS and Google Cloud.
You won't find anything completely free and unencumbered, and if you want to do this routinely and have the time to build and maintain hardware, then it is cheaper to buy your own equipment in the long run - at least at a personal level.
To decide whether to pay for cloud or build your own, consider a typical price for a cloud machine suitable for deep learning of around \$1 per hour (prices vary a lot, and it is worth shopping around, if only to find a spec that matches your problem). There may be additional fees for storage and data transfer. Compare that to pre-built deep learning machines costing from \$2000, or building your own for \$1000 - such machines might not be 100% comparable, but if you are working by yourself then the payback point is going to arrive after only a few months of use. Don't forget the electricity costs, though - a powerful machine can draw 0.5kW whilst being heavily used, so this adds up to more than you might expect.
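The payback arithmetic above can be sketched with the figures from the text (the \$0.15/kWh electricity price is an assumed value for illustration):

```python
# Rough payback estimate: renting cloud compute at ~$1/hour vs. building
# your own ~$1000 machine. The electricity price is an assumption.
cloud_rate = 1.00        # USD per hour for a deep-learning cloud instance
machine_cost = 1000.00   # USD to build your own machine
power_kw = 0.5           # power draw under heavy load, in kW
electricity = 0.15       # assumed USD per kWh

# Each hour of heavy use on your own machine avoids the rental fee
# but costs some electricity.
saving_per_hour = cloud_rate - power_kw * electricity
payback_hours = machine_cost / saving_per_hour

print(round(payback_hours))             # -> 1081 hours of heavy use to break even
print(round(payback_hours / (8 * 30)))  # -> 5 months, at 8 hours/day
```

At roughly 8 hours of use a day, the break-even point lands within a few months, consistent with the claim above.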
The advantages of cloud computing are that someone else does the maintenance work and takes on the risk of hardware failure. These are valuable services, and priced accordingly.
* But see Jay Speidall's answer about Google's Colab service, which appears to be free to use, but may have T&C limitations that affect you (for instance, I doubt they would be happy for you to run content production of Deep Dream or Style Transfer on it).
Batteries included. Comes with Jupyter notebook and pre-installed python packages
Introduction
There are various reasons why we might choose a cloud CPU over the CPU in our local machine.
- CPU workload: The first and most obvious advantage is that using a cloud CPU frees up CPU workload on your local machine. This is especially beneficial if you have an old CPU in your local machine.
- Code Sharing: Using Jupyter notebooks hosted in the cloud makes sharing code easier. Simply make the notebook public and share its URL.
- Storage: These platforms also offer storage spaces for your data which helps free up storage in your local machine.
In this article we will look at 3 platforms which offer a Jupyter notebook IDE with free and unlimited CPU usage time for your machine learning projects.
Databricks
Databricks is a data science, data engineering and machine learning platform used by data scientists, analysts and engineers to develop and deploy ETL pipelines, machine learning models and data analyses. Databricks offers a free Community Edition account where you get your own workspace with a cloud-hosted Jupyter notebook (aka Databricks notebook). To sign up for Databricks Community Edition:
1. Go to https://databricks.com/try-databricks
2. Fill in your details
3. Click on “Get Started with Community Edition”
4. Verify your email address
Create a Cluster
After logging in you will see the following home page. We need an active cluster to start working on datasets. Let’s create one.
1. Go to the left panel and click on “Compute”
2. Click on “Create Cluster”
3. Give the cluster a name, select a Databricks runtime version and click “Create Cluster”
Community Edition users are entitled to a driver node (2 cores) with 15GB RAM and no worker node. The Databricks ML runtime supports Scala and Spark (PySpark) and has commonly used data science Python packages such as pandas, numpy and scikit-learn pre-installed.
Upload Data
1. Let’s use the Bank Marketing dataset as an example. We can download it from the UCI repository.
2. Extract the zip file. The data is in CSV format.
3. On Databricks’ left panel, click on the “Data” tab
4. To upload the CSV file to Databricks, click on “Upload File”
5. Browse to select the file or simply drag and drop it into the grey box
6. Click “Create Table in Notebook”
7. In the notebook cells, change infer_schema and first_row_is_header to True and delimiter to ;
# File location and type
file_location = "/FileStore/tables/bank_full.csv"
file_type = "csv"

# CSV options
infer_schema = "True" # change to True
first_row_is_header = "True" # change to True
delimiter = ";" # change to ;

# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)
8. In the last cell you can name the table using the variable permanent_table_name and write the dataframe df to the table
permanent_table_name = "bank_marketing"
df.write.format("parquet").saveAsTable(permanent_table_name)
9. This will create a new table under the data tab in the left panel
10. Now we can use this table in a new notebook. Go to “Create” in the left panel and create a notebook. Assign the notebook a name and cluster.
11. Read the table into the new notebook
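As a minimal sketch of step 11 (in a Databricks notebook the SparkSession is already available as `spark`; the function wrapper here is only so the sketch is self-contained):

```python
# Read the saved table back into a Spark DataFrame.
# In a Databricks notebook `spark` is predefined, so in a cell you would
# simply run: df = spark.read.table("bank_marketing")
def load_table(spark, table_name="bank_marketing"):
    return spark.read.table(table_name)
```

From there, `display(df)` shows the table contents in the notebook.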
Share Notebooks
To share a notebook, click on the “Publish” button located at the top right of the notebook.
Google Colab
Google Colaboratory, powered by Google, is a Jupyter notebook IDE with access to unlimited free CPU. It also comes with limited free GPU and TPU time. All you need is a Google account to get started. Google Colab allows you to mount your Google Drive as a storage folder for your projects, and the free version of Google Drive comes with 15GB of storage.
How to use Google Colab?
1. Go to Google Colaboratory
2. Create a new notebook
3. To use data already stored in your Google Drive, click on the “Mount Drive” icon in the left panel
4. After the drive is mounted, you will see a drive folder in the directory. This is your Google Drive directory.
5. Read the CSV file
import pandas as pd

df = pd.read_csv('/content/drive/path/to/data.csv')
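Steps 3 and 4 can also be done in code instead of via the UI icon; a minimal sketch (the google.colab package exists only inside a Colab runtime, which is why the import sits inside the function):

```python
# Mount Google Drive programmatically from a Colab notebook cell.
def mount_drive(mount_point="/content/drive"):
    # google.colab is preinstalled in Colab runtimes only,
    # so the import is kept inside the function.
    from google.colab import drive
    drive.mount(mount_point)
```

Running `mount_drive()` in a Colab cell prompts for authorization and then exposes your Drive under /content/drive.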
Share Notebook
To share a notebook, click on the “Share” button located at the top right of the notebook.
Kaggle
Kaggle offers Kaggle notebooks with unlimited CPU time and limited GPU time. There is a rich repository of datasets on Kaggle which you can start using by adding them to your Kaggle notebook.
How to use Kaggle notebooks
1. Create a Kaggle account
2. Create a notebook
3. Select a dataset using the “Add data” button. We can upload our own dataset or use an existing Kaggle dataset.
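Once a dataset is added, Kaggle mounts it read-only under /kaggle/input. A small sketch for locating the attached files (the helper function name is ours):

```python
import os

# Walk a directory tree and collect every file path under it.
# In a Kaggle notebook, attached datasets live under /kaggle/input.
def list_input_files(root="/kaggle/input"):
    paths = []
    for dirname, _, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirname, name))
    return sorted(paths)
```

Calling `list_input_files()` in a Kaggle notebook prints the CSVs and other files of every dataset you added, ready to pass to `pd.read_csv`.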
Share Notebook
To share a notebook, click on the “Share” button at the top right and make the notebook public.
Conclusion
We explored 3 different options for cloud-hosted Jupyter notebooks with free and unlimited CPU runtime. These platforms offer 12GB to 15GB of RAM, making them suitable for training classical machine learning models on small to medium sized datasets. If you are training deep learning models, it is recommended to use GPUs instead of CPUs. Check out my other article on free cloud GPUs for training your deep learning model.