WPCloud site SSH and SFTP

Sites hosted on the Atomic Platform accept two classes of SFTP and SSH connections: client connections and end user connections. The two classes are treated, configured, and accessed differently, so they are covered separately below, with some information repeated between the two.

# Client Connections

Client connections are made to {atomic-site-id}@client-ssh.atomicsites.net. These connections are the most important type of connections to understand because client level access gives access to every single site owned by that client on the Atomic platform.

Authentication

Client connections support only one type of authentication: public key authentication. Password authentication is completely disabled for the client SSH service, as it is not considered secure enough for a credential that allows access to all of a client’s sites. Public keys can be supplied in two ways.

Aliasable Public Keys: The preferred method for clients to manage their public keys is by creating aliasable public keys. With this endpoint you can upload a public key to the Atomic platform as pub://clientname/category?keyname. So you might have pub://clientname/employees?tina and/or pub://clientname/users?12345 and/or pub://clientname/automation?workerservers. Then, any place where we ask for a public key, you can use those identifiers instead of the actual public key. This has several advantages. They’re list-able. You can automate adding, updating, and removing them. When you update the public key behind an alias, the change takes effect simultaneously everywhere that alias is referenced. And if you remove an aliasable key, it instantly stops working in all places referencing the now missing alias.

Actual Public Keys: Alternatively you can send an authorized_keys formatted line in any place we ask for a public key. The disadvantage of this is that you cannot affect all instances of the use of this key across the entire system with one call – it would be up to you to track and update this yourself.
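The alias format described above can be composed mechanically. The following is a minimal shell sketch; the helper name make_key_alias and the character check it applies are assumptions made for illustration, not part of the Atomic API.

```shell
#!/bin/sh
# Hypothetical helper: builds an alias of the form pub://clientname/category?keyname.
# The allowed-character check is an assumption, not a documented platform rule.
make_key_alias() {
    client=$1 category=$2 keyname=$3
    for part in "$client" "$category" "$keyname"; do
        case $part in
            ''|*[!A-Za-z0-9_-]*)
                echo "invalid alias component: '$part'" >&2
                return 1 ;;
        esac
    done
    printf 'pub://%s/%s?%s\n' "$client" "$category" "$keyname"
}

make_key_alias clientname employees tina
make_key_alias clientname automation workerservers
```

Keeping alias construction in one place makes it easier to reference the same identifier consistently wherever a public key is requested.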

Additional Security

Because of the broad level of access provided by client connections, Atomic requires that this class of connection come from known, allowed IP addresses. These addresses need to be communicated to the Atomic team before connections to this service will be allowed.

Valid IP Examples: Proxy servers through which your employees will make their connections to the Atomic service. Automation servers which will be doing things on behalf of your service or your clients. Bastion hosts from which your employees will be connecting. An office’s static IP address from which your employees do their work.

Invalid IP Examples: Employees’ home internet service. Internet service with a dynamic IP address. Commercial VPN services. A favorite coffee shop’s Wi-Fi.

Managing Client SSH Access

To configure access you will use the client-authorized-keys API endpoint. Essentially you will be adding and removing lines from a virtual authorized_keys file. It is highly recommended that you use aliasable public keys for these values.

Once the above is complete you should be able to have full SSH access to all of the sites owned by your client id. The access is full access regardless of restrictions set at the platform, site, or client level on general access type. That is: if your client is configured to only allow SFTP access to sites, a client-ssh.atomicsites.net connection will allow full SSH access. That is not to say that only shell access is permitted. You can still connect to these sites via SFTP via the client SSH service as you would expect using a properly configured credential.
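As a rough illustration of the connection forms mentioned above, the sketch below builds the user@host target for the client SSH service and prints the commands one might run. The site id is a placeholder, and the sketch assumes that your key is already authorized and that you are connecting from an allowed IP address.

```shell
#!/bin/sh
# Placeholder value; substitute a real atomic-site-id.
SITE_ID=123456789
CLIENT_HOST=client-ssh.atomicsites.net

# Builds the user@host target used by both ssh and sftp.
client_target() {
    printf '%s@%s\n' "$1" "$CLIENT_HOST"
}

target=$(client_target "$SITE_ID")

# Printed rather than executed, so the sketch can be read without network access.
echo "ssh $target        # interactive shell"
echo "ssh $target date   # run a single command"
echo "sftp $target       # file transfer through the same service"
```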

Authentication data is NOT shared between the client-ssh.atomicsites.net and ssh.atomicsites.net. So client credentials cannot be used to log into the end user SSH service and end user credentials cannot be used to log into the client SSH service.

# End User Connections

Authentication

End user connections support two methods of authentication: public keys (supplied either as an alias or as an actual key) and passwords. These connections give access to only a single site, are generally used by less technically savvy people and services, and need to be more generally accessible and friendly than client connections. As such there are fewer hoops to jump through, and the security restrictions on authentication methods are more lax than for the client-wide counterpart.

Aliasable Public Keys: The preferred method for clients to manage their public keys is by creating aliasable public keys. With this endpoint you can upload a public key to the Atomic platform as pub://clientname/category?keyname. So you might have pub://clientname/employees?tim and/or pub://clientname/users?12345 and/or pub://clientname/automation?workerservers. Then, any place where we ask for a public key, you can use those identifiers instead of the actual public key. This has several advantages. They’re list-able. You can automate adding, updating, and removing them. When you update the public key behind an alias, the change takes effect simultaneously everywhere that alias is referenced. And if you remove an aliasable key, it instantly stops working in all places referencing the now missing alias. With aliases you can define a credential for a user and have that same credential be effective for all of that user’s sites.

Actual Public Keys: As an alternative to aliases, you can send an authorized_keys formatted line in any place we ask for a public key. The disadvantage of this is that you cannot affect all instances of the use of this key across the entire system with one call; it would be up to you to track and update this yourself.

Password Authentication: This is the most common and most self-explanatory method of authentication.

Configuring Allowed Access Methods

As a service Atomic, by default, currently restricts sites to SFTP only access. That means when an end user tries to connect via SSH for a shell, or a remote command, they will receive an error stating that only SFTP access is allowed.

As a client you may request a different default which supersedes the service default. That is: you may ask us to allow full SSH by default for all of the sites on your service. These requests must currently be made to the Atomic team directly. Importantly, this is a dynamic default: if and when it is changed, the change affects every site that does not have an access type explicitly set at the site level.

You may also configure an access method for each individual site via the site-set-access-type API endpoint which supersedes both the platform default and also the client configured default if applicable.
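The precedence described above (site setting, then client default, then platform default) can be sketched as a small shell function. The value names "sftp" and "ssh" are assumptions for illustration; the identifiers accepted by the site-set-access-type endpoint may differ.

```shell
#!/bin/sh
# Assumed value names: "sftp" and "ssh". The real API may use different identifiers.
PLATFORM_DEFAULT=sftp

# Resolves the effective access type: the site setting if present, else the
# client default, else the platform default. An empty string means "not set".
effective_access_type() {
    site_setting=$1
    client_default=$2
    printf '%s\n' "${site_setting:-${client_default:-$PLATFORM_DEFAULT}}"
}

effective_access_type ""     ""      # nothing set: platform default applies
effective_access_type ""     "ssh"   # client default applies
effective_access_type "sftp" "ssh"   # site-level setting wins
```

Because the client default is resolved at lookup time rather than copied onto each site, changing it immediately changes the result for every site without a site-level setting.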

Managing End User Credentials

End user connections are made to {ssh-user-username}@ssh.atomicsites.net or to a CNAME alias to ssh.atomicsites.net.

End user access to a site is managed via the ssh-user endpoint and its arguments add, update, remove, and list. Even though the endpoint is named ssh-user this does not affect the type of access allowed for the site. So, if the site is set to SFTP only, an ssh-user will only be able to SFTP into the site.

An SSH user with a password is, obviously, allowed access to the specific site with that username and password.

An SSH user without a password will not be asked for a password when connecting to the service and is therefore useless without a configured public key. It is highly recommended that you use aliasable public keys, since this makes it considerably easier for you to manage users rotating their public keys. For security purposes we generally recommend, but do not enforce, that an account have a public key (preferred) or a password, but not both.

# How the service works

Short Answer: It’s a proxy.

Longer Answer: It’s a complicated proxy.

Why is this so complicated?

Both the Client and End User SSH services work in, generally, the same way. Sites are spread out amongst an arbitrarily large number of pools, each with its own set of primary and secondary servers. The onus of finding the right server to connect to, connecting to it, etc., is an entirely unreasonable burden to place on you as a client, and doubly so for end users wishing to connect to their site.

There would be additional headaches for you in managing keys inside an authorized_keys file. It would also cause complications when moving sites to other pool servers (which happens on occasion for things like hardware upgrades).

Why not use DNS to direct connections?

There are several problems with using DNS to direct connections to the correct server.

Caching: Many DNS servers cache responses for an arbitrarily long period of time. They do this regardless of the settings on the domain. So when this happens we, you, and the end user have no recourse but to wait out the cache expiry.

Moving Parts: Even if we were able to set a low TTL and have it honored by DNS resolvers the world over – which we can’t – we would need to set the TTL so low as to effectively disable all caching entirely for the service. This is because of cases of a site being migrated between servers, or a server fail-over event.

Discoverability: If we did this then it would be possible for an attacker to discover all of the servers handling Atomic platform requests. And discover all sites on a given pool server. And use this information to launch targeted attacks on the pool server (directly, or via other sites on the same host) targeting a given site.

An SSH Proxy Service

To this end we have implemented a custom SSH proxy service for you and your end users to connect through.

On connection the username (which is an atomic-site-id for client access, and an ssh-user’s username for end user access) is translated into an atomic-site-id if necessary and the pool servers on which that site is located are identified.

Authentication is performed and a connection is established with the pool’s SSHD server. A lot of complicated magic takes place here, regardless of how simple this sounds at first.

Containers: finally a good use case!

Once the connection has been made to the SSHD service and you open a channel (SFTP, Shell, or a command run directly) a docker container is spun up. Inside this docker container your requested channel is run.

The reason for this is to protect all of the sites hosted on the Atomic platform from bad actors, or mistakes made by people, on other sites on the server. With this setup we can strictly control what a user is or isn’t allowed to see, do, find, etc.

# Implementation Details

SFTP Home: When connecting via SFTP the session will always begin in the user’s htdocs directory (/srv/htdocs) to maintain backwards compatibility with how access worked before SSH access was reconsidered as an end user facing feature.

SSH/Exec Home: When logging in via a full shell or when directly executing a command the session will always start in /home/{atomic-site-id} with that directory set as the $HOME environment variable.

Why different Homes: Originally the Atomic platform never intended to allow end users full ssh access to their sites. And as such no HOME was needed (thus htdocs being the initial starting point for SFTP sessions.) Once that was reconsidered we faced a dilemma: many tools require or make files in the $HOME directory (often “dotfiles” or “dotdirectories” like .vimrc, but sometimes not) and we definitely did not want to present end users with an environment where they might accidentally leak information by creating these files in a web accessible location. So now shell sessions do have a specific and unique home.

$HOME and HTTP: That home is not only not web accessible but actually not present in the context of a web request (meaning a site compromised via an HTTP request cannot access any data in the site’s $HOME). Therefore, if you intend to create data in the context of an SSH connection and it is meant to be used by code executed in a web request, that data must be stored in tmp or in htdocs.

Quoted Arguments in Directly Executed Commands: It is possible to directly execute a command via ssh like so: ssh {user}@{host} date which should connect, run the date command echoing the output to your terminal, and disconnect. However complex arguments are trickier in this context because of how these commands are treated and translated by SSH. This means that you cannot ssh {user}@{host} wp eval 'echo "hello world\n";' because the single quotes (in this case) are lost by SSH in the particular configuration that allows all of this to happen. To accomplish this you end up needing to place outer quotes around the entire command to ensure that the inner quotes are not lost, so: ssh {user}@{host} "wp eval 'echo \"hello world\n\";'" works as intended.
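The effect can be reproduced locally without any network access. The sketch below assumes the usual ssh behavior of joining the command arguments with spaces into a single string that a shell on the far side parses again. Both simulate_remote and show_args are hypothetical stand-ins (for ssh and for wp respectively), and the printed arguments show what the remote command would actually receive.

```shell
#!/bin/sh
# Stand-in for ssh: joins its arguments with spaces and hands the resulting
# single string to a fresh shell for re-parsing. No connection is made.
simulate_remote() {
    # show_args stands in for the remote command and prints one argument per line.
    helper='show_args() { for arg in "$@"; do printf "<%s>\n" "$arg"; done; }'
    sh -c "$helper; $*"
}

echo "Without outer quotes (inner single quotes are lost):"
simulate_remote show_args eval 'echo "hello world";'

echo "With outer quotes (the PHP snippet arrives as one argument):"
simulate_remote "show_args eval 'echo \"hello world\";'"
```

In the first call the local shell consumes the single quotes before the arguments are joined, so the second shell splits the snippet into separate words. In the second call the single quotes travel inside the string and are still there when it is parsed again.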

# Limitations

Because we want to protect the functioning of your site from other users abusing the service (by accident or on purpose), and also to protect other sites from accidental abuse by you, we have several limitations baked into how the service works.

Wall Clock Time: No session is allowed to execute for more than 8 hours. At 8 hours whatever is happening will be automatically terminated.

Memory Usage: No session is allowed to use more than 1GB of RAM, total, for the processes running inside of that session.

Process Limits: No session is allowed to spawn more than 25 processes. This one can, sometimes, surprise people. Take, for instance, a bash script which does something like wp command | grep something | cut -f1 | sort | uniq. This is actually 7 processes: one for your login shell, one for the bash script, one for wp (assuming it does not spawn more processes), one for grep, another for cut, another for sort, and finally one for uniq. The purpose of this limit is to prevent both resource denial and inadvertently recursive code.
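To get a feel for how quickly a pipeline approaches the limit, the following sketch estimates the process count as one process per pipeline stage, plus the login shell and the script itself. The function name estimate_processes is hypothetical, and the estimate is deliberately rough: it simply counts "|" characters and cannot see children forked by the commands themselves.

```shell
#!/bin/sh
# Rough estimate only: a "|" inside quotes would be miscounted, and commands
# that fork their own children are not accounted for.
estimate_processes() {
    pipeline=$1
    overhead=${2:-2}   # login shell + the script itself
    pipes=$(printf '%s' "$pipeline" | tr -cd '|' | wc -c | tr -d ' ')
    echo $(( pipes + 1 + overhead ))
}

estimate_processes 'wp command | grep something | cut -f1 | sort | uniq'
```

Running several such pipelines in the background at once multiplies the stage count, which is usually how scripts reach the 25 process limit unexpectedly.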

Disconnected Processes: While you can background processes started from a shell session, you cannot leave processes running, disconnect from the service, and expect those processes to stay running. Once a session is disconnected, all of its processes are killed. This is true regardless of the method used, such as nohup.

Concurrency: End user connections to a given site are limited to 10. The username, in this case, is irrelevant. Ten connections from a single username is just the same as one connection from ten usernames. Client connections to a site are not, currently, limited in this way and also do not count towards end user connection concurrency limits. So if a site already has 10 user connections clients may still connect, and if a site has 10 client connections then 10 end user connections may still be made.
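The concurrency rules above can be modeled with a small function. This is only a sketch of the stated rules, not the platform's implementation; the class names "client" and "end-user" and the function name connection_allowed are chosen here for illustration.

```shell
#!/bin/sh
END_USER_LIMIT=10

# Succeeds if a new connection of the given class would be accepted, given the
# number of end user connections already open to the site. Usernames do not
# matter, and client connections are neither limited nor counted.
connection_allowed() {
    class=$1 open_end_user=$2
    case $class in
        client)   return 0 ;;
        end-user) [ "$open_end_user" -lt "$END_USER_LIMIT" ] ;;
        *)        return 2 ;;   # unknown connection class
    esac
}

connection_allowed end-user 9  && echo "10th end user connection: allowed"
connection_allowed end-user 10 || echo "11th end user connection: refused"
connection_allowed client 10   && echo "client connection: allowed"
```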
