How to forward a port through a jump box
When you have set up Apache Spark and use Jupyter to run analyses on it, you’ll need to connect to the Jupyter notebooks by forwarding the port the notebooks run on to your local machine.
Depending on how the server that runs Spark is secured, you might need to do that through a “jump box”, a server that is hardened to prevent unauthorized access and lets you access a network that’s otherwise not directly accessible from the Internet.
If you’re as untrained in using ssh as I am, it can be a bit frustrating to set that up yourself because it’s not entirely obvious when googling around. In the tradition of writing things up so that I don’t have to google them over and over again, here’s how to do it.
The first thing to know: there’s a file called ~/.ssh/config where you can “store” ssh connections instead of typing them in manually all the time. That’s what makes it possible to type ssh my-server and access your server instead of ssh my-username@my-host-address -i /path/to/ssh-key-file. Blew my mind when I learned this.
So, open ~/.ssh/config in your editor of choice, then add the following:
Host jump-box
    HostName jumpbox.yourdomain.com
    User your-user-name-on-jump-box
    IdentityFile /Users/local-user-name/.ssh/ssh_key_file_for_jump-box
    ForwardAgent yes

Host jupyter-box
    User your-username-on-jupyter-server
    ForwardAgent yes
    ProxyCommand ssh -q jump-box nc address-of-jupyter-box 22
    IdentityFile /Users/local-user-name/.ssh/ssh_key_for_jupyter-box
When you’ve added this to your ~/.ssh/config file, all you need to do to connect to the protected server and forward the port to access your Jupyter notebooks is:
$ ssh -L 8889:localhost:8889 jupyter-box
In this case, we’re forwarding port 8889, which is the port that my Jupyter notebooks are running on.
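To make the pieces of that command a bit more explicit, here is a small sketch that assembles it from variables. The variable names are my own, not anything ssh requires, and the values assume the jupyter-box alias and port 8889 from above. I’ve also added -N, a standard ssh option that skips opening a remote shell, which is handy if you only want the tunnel.

```shell
# Hypothetical variable names; the values match the setup described above.
LOCAL_PORT=8889         # port opened on your own machine
REMOTE_PORT=8889        # port Jupyter listens on, on the Jupyter server
HOST_ALIAS=jupyter-box  # alias defined in ~/.ssh/config

# -L local_port:host:remote_port
#    "host" is resolved on the remote side, so "localhost" here
#    means the Jupyter server itself, not your machine.
# -N do not run a remote command, only forward the port.
TUNNEL_CMD="ssh -N -L ${LOCAL_PORT}:localhost:${REMOTE_PORT} ${HOST_ALIAS}"

# Print the command so you can check it before running it.
echo "$TUNNEL_CMD"

# Uncomment to actually open the tunnel (stop it with Ctrl-C):
# $TUNNEL_CMD
```

With the tunnel open, the notebooks should be reachable at localhost:8889 in your browser. If that port is already taken on your machine, changing only LOCAL_PORT should be enough.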
Tada, done!
Special thanks to this Unix Stackexchange discussion and users Naftuli Kay and Celada for writing up their solutions 💪