I've been running a Jupyter notebook instance on AWS to get access to more powerful resources for the neural network models in my master's thesis. I used Chris Albon's fantastic guide to do the following.
High Level Summary
In broad strokes, here is what you need to do to get started.
- Start an EC2 Instance
- Set up security key for EC2 Instance
- SSH into your EC2 Instance
- Install Python
- Set up Jupyter
- Access notebook
1. Start an EC2 Instance
Go to Amazon AWS and sign up or log in.
Navigate to the EC2 Dashboard and Launch Instance.
Next, you can choose any Linux based OS (Amazon Linux, Ubuntu, etc.). If you're brand new, you can choose Ubuntu.
On the next page, "Step 2: Choose an Instance Type," you can choose any type of instance. In this tutorial, I'll choose the micro tier (which is free tier eligible). You will probably need more firepower to train complex models, but you can change the instance type later.
Follow through with the review and finally click "Launch."
2. Set up security key for EC2 Instance
After clicking "Launch" you will be prompted with a few security key options.
If you're familiar with this, you can skip down to the next step.
For those who don't know what this is, a PEM file is a key that AWS will check when you try to access your EC2 instance via SSH. Since you don't have one yet, let's create a new pair.
Save your PEM file in a secure place. Anyone with access to that file can theoretically get into your EC2 instance.
Congrats! You now have a working EC2 instance (well, it usually takes a minute to get up and running).
3. SSH into your EC2 Instance
Once you're back on your dashboard, take note of the Public DNS (IPv4) address for your instance. It should read something like ec2-123-45-678-999.compute-1.amazonaws.com. This is the public hostname of the instance, which resolves to the public IP address or Elastic IP address of the instance.
By far the easiest and most convenient way for most beginners will be to connect directly through AWS in their browser.
Click Connect, select the "A Java SSH Client directly ..." option, and click Launch. Make sure to include the correct path to your PEM file.
Congrats! You're in! Now skip the "For Windows" and "For Mac" sections and head straight to "4. Install Python and set up Jupyter."
For Windows
If you're on Windows, you might not be familiar with the terminal (like bash or zsh) and might not have access to one natively. If you are familiar and have access to one, refer to the Mac section. If you aren't, it's time to download PuTTY.
Once installed, you need to open PuTTYgen, which is accessible through your Windows key (press the Windows key, search for PuTTYgen, and it should pop up).
Click Load and load up your PEM file. After the prompt, click Save private key and save the PPK file. All PuTTYgen did was convert your key into a format that PuTTY can use.
Now load up PuTTY.
We first have to load up your brand new PPK file. To do that, navigate in the side panel to Connection > SSH > Auth. Click Browse and open your PPK file from where you saved it.
Navigate back to Session and paste in the Public DNS for your EC2 instance. Click Open to connect.
Great! Now skip the "For Mac" section and head straight to "4. Install Python and set up Jupyter."
For Mac
Open your terminal and type in the following:
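$ ssh -i "/PATH/TO/keypair.pem" ec2-user@ec2-xxxxxxxxx-xx.compute-1.amazonaws.com
Replace /PATH/TO/keypair.pem with the path and name of the key pair that you downloaded earlier, and replace ec2-xxxxxxxxx-xx.compute-1.amazonaws.com with your own Public DNS. The username before the @ depends on the image you chose: ec2-user on Amazon Linux, ubuntu on Ubuntu.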
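To make the substitutions concrete, here is a minimal Python sketch that assembles the same command from its parts. The function name build_ssh_command is made up for illustration, the key path and hostname are the placeholders from above, and the ec2-user default is an assumption that depends on your image.

```python
import shlex

def build_ssh_command(key_path, public_dns, user="ec2-user"):
    """Return the ssh argument list for connecting to an EC2 instance."""
    # -i points ssh at the private key (the PEM file) to authenticate with.
    return ["ssh", "-i", key_path, f"{user}@{public_dns}"]

cmd = build_ssh_command("/PATH/TO/keypair.pem",
                        "ec2-xxxxxxxxx-xx.compute-1.amazonaws.com")
# shlex.join quotes each argument so the result is safe to paste into a shell.
print(shlex.join(cmd))
```

On an Ubuntu image you would pass user="ubuntu" instead.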
Hint: type "pwd" into your terminal if you’re unsure what your present working directory is
If you get an error saying that your PEM file must not be publicly viewable, you may need to execute this command:
$ chmod 400 /PATH/TO/keypair.pem
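For intuition about what that command changes: ssh refuses a private key that anyone other than its owner can read. The sketch below checks and tightens the permission bits from Python. The helper names key_is_private and make_key_private are hypothetical, and the sketch assumes a POSIX system such as macOS or Linux.

```python
import os
import stat

def key_is_private(path):
    """True if no group or "other" permission bit is set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or "other" bit means the key is visible beyond its owner.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

def make_key_private(path):
    """Equivalent of `chmod 400`: the owner may read, nothing else is allowed."""
    os.chmod(path, stat.S_IRUSR)
```

Calling make_key_private("/PATH/TO/keypair.pem") has the same effect as the chmod command above.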
4. Install Python
Now that you're in, you should be looking at a shell prompt on your instance.
Now we need to install Anaconda. We can download it by using the following command:
$ wget https://repo.anaconda.com/archive/Anaconda3-2018.12-Linux-x86_64.sh
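Before running a downloaded installer, it can be worth comparing its SHA-256 hash against the value the publisher lists for that file. This is an optional extra, not part of the original guide. A minimal sketch, where sha256_of is an illustrative name and the expected hash is something you would copy from Anaconda's site rather than from here:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large installer never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (EXPECTED_SHA256 is a placeholder you fill in yourself):
# assert sha256_of("Anaconda3-2018.12-Linux-x86_64.sh") == EXPECTED_SHA256
```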
Once it finishes downloading, install Anaconda using the following command:
$ bash Anaconda3-2018.12-Linux-x86_64.sh
Next, you'll be prompted through a lot of questions and eventually install Anaconda3. At the end, you'll be asked whether to add Anaconda3 to your PATH in .bashrc. Make sure to type yes.
(If you accidentally pressed enter before typing yes, take a look at the bottom of this post at "Extra" to fix it!)
Now, let's set up Anaconda3 as your default Python environment. Depending on which image you started off with, your instance might be configured to use the system's Python 2.7. To check which Python is active and switch to the one we just installed, type out the following two commands (the line without a $ is example output):
$ which python
/usr/bin/python
$ source .bashrc
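You can also confirm the switch from inside Python. The sketch below reports the interpreter that is running the code and the first python found on your PATH. The function name describe_python is made up for illustration. If the Anaconda install is active, both values would be expected to point inside the Anaconda directory.

```python
import shutil
import sys

def describe_python():
    """Report the running interpreter and the first `python` on PATH."""
    return {
        "running": sys.executable,          # interpreter executing this code
        "on_path": shutil.which("python"),  # what the shell would pick, or None
    }

print(describe_python())
```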
5. Set up Jupyter
In this step, we are looking to do two things.
First, we set up a password for your Jupyter Notebook so you can access it privately via your browser. Then, we take the SHA version of that password and create a certificate so you can access your notebooks from your local computer through your browser.
We first open the IPython console with the following:
$ ipython
This will now show you the prompt In [1]:. We want to create a password, so we import a password module:
In [1]: from IPython.lib import passwd
In [2]: passwd()
Type in your password and verify it. Remember this password. After that you'll be given the SHA version (the hash). Store this hash for later. Now, type exit to leave IPython.
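For intuition about what the "SHA version" is: the stored string has the shape sha1:<salt>:<digest>, where the digest is a SHA-1 hash of the password followed by a random salt. The sketch below imitates that shape with the standard library only. It illustrates salted hashing and is not the IPython implementation, and the function names make_hash and check_password are hypothetical.

```python
import hashlib
import secrets

def make_hash(password, salt=None):
    """Return 'sha1:<salt>:<digest>' for the given password."""
    if salt is None:
        salt = secrets.token_hex(6)  # 12 hex characters of random salt
    digest = hashlib.sha1((password + salt).encode("utf-8")).hexdigest()
    return f"sha1:{salt}:{digest}"

def check_password(stored, candidate):
    """Recompute the digest with the stored salt and compare in constant time."""
    algorithm, salt, digest = stored.split(":")
    expected = hashlib.new(algorithm, (candidate + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(expected, digest)
```

The point of the salt is that the same password produces a different stored string each time, so the hash can sit in a config file without revealing the password directly.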
Next, we are going to generate an SSL certificate so our browser can connect to the Jupyter server over HTTPS. Start by generating the default Jupyter config file:
$ jupyter notebook --generate-config
Then the following two commands to make and access your new directory:
$ mkdir certs
$ cd certs
Then we create a new PEM file (different from the PEM file stored locally to access AWS):
$ sudo openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
This certificate is good for 365 days. You'll be asked for a bunch of personal information, but it's okay to just press enter through it and not provide anything.
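If the notebook server later fails to start over HTTPS, one quick check is whether Python can load the certificate file at all. Because the command above runs under sudo, the resulting file may be owned by root and unreadable by your normal user. The sketch below is an optional check with a hypothetical helper name, cert_loads. It relies on the fact that mycert.pem holds both the certificate and the private key, so no separate key file is passed.

```python
import ssl

def cert_loads(certfile):
    """True if the file can be read and parsed as a certificate plus key."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # With no keyfile argument, the private key is read from certfile too.
        context.load_cert_chain(certfile)
        return True
    except OSError:
        # Covers a missing file, a permission problem, and ssl.SSLError.
        return False

print(cert_loads("mycert.pem"))
```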
Now, we have to adjust the Jupyter configuration file to use the new certificate we created. To get started, let's get back out to the home directory:
$ cd ~
Now, we'll open Vim again and edit the config file created earlier:
$ vim .jupyter/jupyter_notebook_config.py
Again, I'll leave you and Google to figure out how to type/paste stuff into the file using Vim. But before you do, you'll notice the entire file is commented out, so feel free to put the following lines anywhere (making sure you replace the password with the hash you stored earlier):
c = get_config()

# Kernel config
c.IPKernelApp.pylab = 'inline'  # if you want plotting support always in your notebook

# Notebook config
c.NotebookApp.certfile = u'/home/ec2-user/certs/mycert.pem'  # location of your certificate file
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.password = u'sha1:262xxxxxxxxxxxxxxx65f'
c.NotebookApp.port = 8888
Finally, from your home directory, create a folder for your new notebooks and start the Jupyter notebook from inside:
$ mkdir Notebooks
$ cd Notebooks
$ jupyter notebook
6. Accessing Jupyter Notebooks
Now we have to set up the proper security rules for your EC2 instance. Go back to your EC2 Dashboard and look at the instances you have running. If you scroll to the far right, you can see the Security Groups column. Click on the security group associated with your instance.
Now add another rule in the Inbound tab. Make sure its port range includes 8888 (as we set up earlier).
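If you would rather script this step than click through the console, the same inbound rule can be added with boto3, the AWS SDK for Python. This is a sketch under assumptions, not part of the original guide: the helper names are made up, the security group ID and CIDR range are values you supply yourself, and it needs boto3 installed and AWS credentials configured.

```python
def inbound_rule(port, cidr):
    """Build one IpPermissions entry allowing TCP traffic to `port` from `cidr`."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": "Jupyter notebook"}],
    }

def open_jupyter_port(group_id, cidr, port=8888):
    """Add the rule to a security group; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so inbound_rule works without boto3 installed
    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=[inbound_rule(port, cidr)]
    )
```

Restricting the CIDR to your own IP address (a /32 range) keeps the notebook port closed to everyone else.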
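Save it. Now open up your browser and enter the Public DNS of your EC2 instance, with https:// at the beginning and :8888 at the end. For instance, https://ec2-xxxxxxxxx-xx.compute-1.amazonaws.com:8888. Because the certificate is self-signed, your browser will likely show a security warning that you need to accept before the login page appears.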
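If the page does not load, it helps to separate a wrong address from a blocked port. The sketch below builds the address from its parts and tests whether the port accepts TCP connections. The function names notebook_url and port_is_open are hypothetical, and the hostname is the placeholder used earlier.

```python
import socket

def notebook_url(public_dns, port=8888):
    """Build the HTTPS address for a notebook server on the given host."""
    return f"https://{public_dns}:{port}"

def port_is_open(host, port=8888, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(notebook_url("ec2-xxxxxxxxx-xx.compute-1.amazonaws.com"))
```

A False result from port_is_open usually points to the security group rule or to the notebook server not running, rather than to the browser.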
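Extra
If you accidentally hit enter before typing yes, it defaults to no. So now, you'll have to manually add Anaconda to the PATH in your .bashrc file. To do this, type:
$ vim .bashrc
Vim is a text editor (just like Notepad on Windows or Notes on Mac). But, if this is your first time, it might seem confusing. I'll let you do some googling and figure out how to do the following.
Once you open up .bashrc, you'll have to add a line at the bottom of the file that puts Anaconda's bin directory at the front of your PATH. Assuming you kept the installer's default location in your home directory, it would look something like export PATH="$HOME/anaconda3/bin:$PATH".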
Save and exit!