A note on encryption for those interested

We use the ssh program to connect to the AWS cluster. Using ssh with keys is much more secure than using passwords – and, in fact, passwords are disabled altogether for the cluster. You can only connect to the cluster with ssh. The ssh client-server system fills two roles for us. First, it provides a secure mechanism to authenticate – to establish your identity to the system so that you can log in and access the privileges associated with your account. Second, ssh encrypts the transmission between your client and the server you connect to.

Ssh is built on public-private key cryptography. Public-private key cryptography is asymmetric – meaning one key is used to encrypt data and another is used to decrypt it. One of the keys is denoted the “public” key and one the “private” key. The keys are strings of bits, related to each other in such a way that the public key can be used to encrypt a message and the private key can be used to decrypt it. The public key can be widely disseminated and anyone can use it to encrypt a message. But the holder of the private key does not share it – and only that key can be used to decrypt the message.
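To make the encrypt-with-one-key, decrypt-with-the-other idea concrete, here is a minimal sketch using textbook RSA with tiny primes. The specific numbers and the helper names (`encrypt`, `decrypt`) are assumptions chosen for illustration; real ssh keys are far larger and are not used in this raw form.

```python
# Toy RSA key pair. The primes are tiny so the arithmetic is easy to follow;
# this is an illustration only and offers no security.
p, q = 61, 53                 # secret primes (illustrative values)
n = p * q                     # modulus, shared by both keys
phi = (p - 1) * (q - 1)       # used only while generating the keys
e = 17                        # public exponent, chosen coprime to phi
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)

public_key = (e, n)           # safe to hand out to anyone
private_key = (d, n)          # never shared


def encrypt(message, key):
    """Encrypt an integer message (0 <= message < n) with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Recover the integer message with the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
assert recovered == message
```

Anyone holding `public_key` can produce `ciphertext`, but only the holder of `private_key` can turn it back into `message`.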

Since the bit sequences in the private and public key used by ssh are related to each other in a very deep way (one encrypts, one decrypts), one might ask whether the private key can somehow be derived from the public key. One could imagine, for instance, encrypting many known messages with the public key and comparing the results to the original messages in order to reverse-engineer the private key. Fortunately, as far as we know, this is not feasible (at least not with classical computers). The cost of the best known attacks grows faster than any polynomial in the number of bits in the keys. With a sufficient number of bits, the task becomes one that could not be completed until many lifetimes of the current universe have come and gone. 1
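As a rough illustration of why key size matters, the sketch below recovers the private exponent of a toy RSA public key by brute-force factoring. The function name `recover_private_exponent` and the key values are assumptions for illustration. Trial division needs on the order of the square root of the modulus in steps, so every two extra bits in the modulus roughly double the work. Better attacks than trial division exist, but none is known to run in polynomial time on a classical computer.

```python
def recover_private_exponent(e, n):
    """Brute-force attack on a toy RSA public key (e, n).

    Finds the smallest factor of n by trial division, then rebuilds the
    private exponent from the two prime factors. Feasible only because
    n is tiny.
    """
    p = 2
    while n % p != 0:         # up to about sqrt(n) iterations
        p += 1
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)    # modular inverse (needs Python 3.8+)


# A 12-bit modulus falls in a few dozen steps.
e, n = 17, 3233               # toy public key (illustrative values)
d = recover_private_exponent(e, n)

# Check that the recovered exponent really undoes encryption.
message = 42
assert pow(pow(message, e, n), d, n) == message
```

The same loop applied to a 2048-bit modulus would need roughly 2^1024 iterations, which is the sense in which the attack is out of reach.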

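Since public-private key encryption is relatively expensive compared to symmetric schemes, ssh does not use your keys to encrypt the channel itself. Instead, the client and server first run a key exchange (a Diffie-Hellman variant in current versions of the protocol) to agree on a symmetric session key, and that symmetric key is then used for encryption and decryption. Your key pair is used to authenticate: the client proves that it holds your private key by producing a signature that the server checks against your public key.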
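One common way for two parties to agree on a symmetric key over an open channel is a Diffie-Hellman exchange. The sketch below is a toy version under stated assumptions: the group parameters `P` and `G`, the helper names, and the XOR-based `xor_stream` cipher are all illustrative stand-ins, not what ssh actually uses (real implementations use much larger groups or elliptic curves, and ciphers such as AES or ChaCha20).

```python
import hashlib
import secrets

# Public parameters, known to everyone (toy-sized for illustration).
P = 2**127 - 1    # a prime modulus
G = 3             # generator


def keypair():
    """Return an ephemeral (private, public) pair for one session."""
    private = secrets.randbelow(P - 3) + 2      # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_key(own_private, peer_public):
    """Derive a 32-byte symmetric key from the Diffie-Hellman shared secret."""
    secret = pow(peer_public, own_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def xor_stream(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.

    The same call encrypts and decrypts. For illustration only.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Each side sends only its public value across the network.
client_private, client_public = keypair()
server_private, server_public = keypair()

client_key = shared_key(client_private, server_public)
server_key = shared_key(server_private, client_public)
assert client_key == server_key

ciphertext = xor_stream(client_key, b"ls -l")
plaintext = xor_stream(server_key, ciphertext)
assert plaintext == b"ls -l"
```

Both sides end up with the same key even though the key itself never crosses the wire, and the cheap symmetric cipher handles the bulk of the traffic.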

The ssh program itself works by keeping your private key securely on your client computer and then putting your public key onto any computer you would like to log into. The exact mechanisms vary depending on the ssh client and server as well as on the OS being used. On the Linux server that you will be connecting to, your public key is stored in a file named authorized_keys that is kept in the .ssh subdirectory in your home directory. Once you obtain your private key from me and import it into the ssh client of your choice, you will be able to connect directly to the cluster (I will install the public key into your account).
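For the curious, installing a public key on the server amounts to appending one line to that authorized_keys file. The sketch below shows the idea in a temporary directory that stands in for a home directory; the function name `install_public_key` and the key string are hypothetical placeholders, and the restrictive permissions reflect what OpenSSH servers typically expect by default.

```python
import tempfile
from pathlib import Path


def install_public_key(home, public_key_line):
    """Add one public key line to <home>/.ssh/authorized_keys if it is missing."""
    line = public_key_line.strip()
    ssh_dir = Path(home) / ".ssh"
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    ssh_dir.chmod(0o700)                  # only the owner may enter .ssh

    auth_file = ssh_dir / "authorized_keys"
    text = auth_file.read_text() if auth_file.exists() else ""
    if line not in text.splitlines():
        if text and not text.endswith("\n"):
            text += "\n"                  # keep one key per line
        auth_file.write_text(text + line + "\n")
    auth_file.chmod(0o600)                # only the owner may read or write
    return auth_file


# Placeholder key text; a real line holds the key type, the base64 key, and a comment.
key = "ssh-ed25519 PLACEHOLDER_PUBLIC_KEY student@laptop"

with tempfile.TemporaryDirectory() as fake_home:
    path = install_public_key(fake_home, key)
    install_public_key(fake_home, key)    # second call changes nothing
    stored_lines = path.read_text().splitlines()

assert stored_lines == [key]
```

The private key never appears in this step; only the public half is copied to the server.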

Transmitting a Private Key

So now there is a slight problem. I have generated your public-private key pair, and the public key does not have to be kept secret. But your private key does. Private keys always need to be distributed in some kind of secure fashion. Email won’t cut it (email is the worst possible way to send anything sensitive). One can leverage an existing secure communication mechanism, such as ssh – but that mechanism itself has to be set up securely so that the information is only available to the intended recipient, which just pushes the problem back a step.

How private keys are distributed will vary from situation to situation. If you generate key pairs on the AWS website, Amazon will keep your public key, but your private key will be downloaded to your computer over a secure https connection. In settings where one agent creates keys for another, getting the keys to the recipient may be done with encrypted USB drives – but, again, there is some security/encryption involved to make that transfer. And that security also requires establishing the identity of the recipient.

Fortunately, there are some existing mechanisms to leverage at UW, in particular your UW netid. Your UW netid and associated password are used in a number of services at UW – email, web services, network drives, shared folders, and so forth. When a service is requested for a particular netid and is presented with the correct password, access is granted. Doing this in a consistent and coherent way across an entire enterprise can be a challenge. Most large (or even small) organizations use some kind of central authentication service (CAS). Rather than having each service on campus implement its own authentication mechanism, they all – securely – check credentials with a central server. This is one of those things that “just works” – you can use the same netid and password at websites in different departments, for your Husky card, and even for some affiliated external services such as Microsoft Office (including OneDrive). This is rather remarkable – you never created an account on any of the distinct and separate services that you use around campus, yet you can use them all with the same netid and password. 2
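The central-server idea can be sketched in a few lines. Everything here is a hypothetical illustration: the class names `CentralAuth` and `Service`, the netid, and the password are made up, and this is not how UW's systems are implemented. In particular, real central authentication systems typically arrange things so that the individual services never handle the password at all.

```python
import hashlib
import hmac
import os


class CentralAuth:
    """Single place where credentials are stored and checked."""

    def __init__(self):
        self._records = {}    # netid -> (salt, password_hash)

    @staticmethod
    def _hash(password, salt):
        # Salted, deliberately slow hash so stored records resist guessing.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, netid, password):
        salt = os.urandom(16)
        self._records[netid] = (salt, self._hash(password, salt))

    def verify(self, netid, password):
        record = self._records.get(netid)
        if record is None:
            return False
        salt, expected = record
        return hmac.compare_digest(expected, self._hash(password, salt))


class Service:
    """A campus service that keeps no accounts of its own."""

    def __init__(self, name, auth):
        self.name = name
        self._auth = auth

    def login(self, netid, password):
        # Delegate the credential check to the central server.
        return self._auth.verify(netid, password)


central = CentralAuth()
central.register("jdoe", "correct horse battery staple")

email = Service("email", central)
drive = Service("network drive", central)

# One account works everywhere; a wrong password fails everywhere.
assert email.login("jdoe", "correct horse battery staple")
assert drive.login("jdoe", "correct horse battery staple")
assert not drive.login("jdoe", "wrong password")
```

Neither `email` nor `drive` ever created an account for `jdoe`, which mirrors the experience described above.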

Accordingly, I will distribute your keys through OneDrive, which requires you to authenticate with your UW netid before you can access them.

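1: This presupposes \(P\neq NP\), which has not been proven or disproven. That assumption is necessary but not known to be sufficient: these schemes also rely on specific problems, such as factoring large integers or computing discrete logarithms, being hard.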
2: There is an interesting existential question here. Are you authenticating when you enter your netid and your password? Is that transaction proving to the system that you are who you say you are? Or is your identity illusory? Is the system just responding to presentation of credentials without regard to the “identity” of the presenter?