Again, a nice post by Eric Hammond. Hope it’s useful for some of you out there…
Amazon Web Services is such a huge, complex service with so many products and features that sometimes very simple but powerful features fall through the cracks when you’re reading the extensive documentation.
One of these features, which has been around for a very long time, is the ability to use AWS to seed (serve) downloadable files using the BitTorrent™ protocol. You don’t need to run EC2 instances and set up software. In fact, you don’t need to do anything except upload your files to S3 and make them publicly available.
Any file available for normal HTTP download in S3 is also available for download through a torrent. All you need to do is append the string ?torrent to the end of the URL and Amazon S3 takes care of the rest.
Let’s walk through uploading a file to S3 and accessing it with a torrent client using Ubuntu as our local system. This approach uses s3cmd to upload the file to S3, but any other S3 software can get the job done, too.
- Install the useful s3cmd tool and set up a configuration file for it. This is a one-time step:
sudo apt-get install s3cmd
s3cmd --configure
The configure phase will prompt for your AWS access key id and AWS secret access key. These are stored in $HOME/.s3cfg, which you should protect. You can press [Enter] for the encryption password and GPG program. I prefer “Yes” for using the HTTPS protocol, especially if I am using s3cmd from outside of EC2.
- Create an S3 bucket and upload the file with public access:
bucket=YOURBUCKETNAME
filename=FILETOUPLOAD
basename=$(basename $filename)
s3cmd mb s3://$bucket
s3cmd put --acl-public $filename s3://$bucket/$basename
- Display the URLs which can be used to access the file through normal web download and through a torrent:
cat <<EOM
web: http://$bucket.s3.amazonaws.com/$basename
torrent: http://$bucket.s3.amazonaws.com/$basename?torrent
EOM
- The above process makes your file publicly available to anybody in the world. Don’t use this for anything you wish to keep private.
- You will pay standard S3 network charges for all downloads from S3 including the initial torrent seeding. You do not pay for network transfers between torrent peers once folks are serving the file chunks to each other.
- You cannot throttle the rate or frequency of downloads from S3. You can turn off access to prevent further downloads, but monitoring accesses and usage is not entirely real time.
- If your file is not popular enough for other torrent peers to be actively serving it, then every person who downloads it will transfer the entire content from S3’s torrent servers.
- If people know what they are doing, they can easily remove “?torrent” and download the entire file directly from S3, perhaps resulting in a higher cost to you. As a work-around, download the ?torrent URL, save the torrent file, and upload it back to S3 as a .torrent file. Share the torrent file itself, not the ?torrent URL. Since nobody will know the URL of the original file, they can only download it via the torrent. You don’t even need to share the .torrent file using S3.
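The work-around in the last note can be sketched in a few lines of shell. This is only a rough sketch under some assumptions: the function names torrent_url and publish_torrent_file are made up for illustration, and it assumes wget is installed and s3cmd is already configured as described above.

```shell
#!/bin/sh
# Build the torrent URL for a public S3 object: torrent_url BUCKET KEY
torrent_url() {
    printf 'http://%s.s3.amazonaws.com/%s?torrent\n' "$1" "$2"
}

# Hypothetical helper: fetch the .torrent file that S3 generates for a
# public object, then upload it back to the same bucket so it can be
# shared on its own.
publish_torrent_file() {
    bucket=$1
    key=$2
    wget -q -O "$key.torrent" "$(torrent_url "$bucket" "$key")" &&
        s3cmd put --acl-public "$key.torrent" "s3://$bucket/$key.torrent"
}

# Usage (not run here):
#   publish_torrent_file YOURBUCKETNAME FILETOUPLOAD
```

Uploading the torrent next to the original under a matching name makes the original URL easy to guess, so hosting the .torrent file somewhere else or under a different name may suit the work-around better.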
I have not tested this personally, but it seems to be correctly put by Eric Hammond. If you try it, do let me know if you find any catches. 🙂
The ssh protocol uses two different keys to keep you secure:
- The user ssh key is the one we normally think of. This authenticates us to the remote host, proving that we are who we say we are and allowing us to log in.
- The ssh host key gets less attention, but is also important. This authenticates the remote host to our local computer and proves that the ssh session is encrypted so that nobody can be listening in.
Every time you see a prompt like the following, ssh is checking the host key and asking you to make sure that your session is going to be encrypted securely.
The authenticity of host 'ec2-...' can't be established.
ECDSA key fingerprint is ca:79:72:ea:23:94:5e:f5:f0:b8:c0:5a:17:8c:6f:a8.
Are you sure you want to continue connecting (yes/no)?
If you answer “yes” without verifying that the remote ssh host key fingerprint is the same, then you are basically saying:
I don’t need this ssh session encrypted. It’s fine for any man-in-the-middle to intercept the communication.
Ouch! (But a lot of people do this.)
Note: If you have a line like the following in your ssh config file, then you are automatically answering “yes” to this prompt for every ssh connection.
# DON'T DO THIS!
StrictHostKeyChecking false
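If you are not sure whether a setting like this has crept into your own configuration, a quick search can help. The following is a minimal sketch: the function name find_disabled_host_key_checks is made up for illustration, and it assumes the setting is written on a line of its own with a value of no, false, or off.

```shell
#!/bin/sh
# Print (with line numbers) any lines in an ssh config file that turn
# strict host key checking off. Exits non-zero when no such line exists.
find_disabled_host_key_checks() {
    grep -niE '^[[:space:]]*StrictHostKeyChecking[[:space:]=]+(no|false|off)[[:space:]]*$' "$1"
}

# Usage (not run here):
#   find_disabled_host_key_checks "$HOME/.ssh/config"
```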
Care about security
Since you do care about security and privacy, you want to verify that you are talking to the right server using encryption and that no man-in-the-middle can intercept your session.
There are a couple approaches you can take to check the fingerprint for a new Amazon EC2 instance. The first is to wait for the console output to be available from the instance, retrieve it, and verify that the ssh host key fingerprint in the console output is the same as the one which is being presented to you in the prompt.
Scott Moser has written a blog post describing how to verify ssh keys on EC2 instances. It’s worth reading so that you understand the principles and the official way to do this.
The rest of this article is going to present a different approach that lets you in to your new instance quickly and securely.
Passing ssh host key to new EC2 instance
Instead of letting the new EC2 instance generate its own ssh host key and waiting for it to communicate the fingerprint through the EC2 console output, we can generate the new ssh host key on our local system and pass it to the new instance.
Using this approach, we already know the public side of the ssh key so we don’t have to wait for it to become available through the console (which can take minutes).
Generate a new ssh host key for the new EC2 instance.
tmpdir=$(mktemp -d /tmp/ssh-host-key.XXXXXX)
keyfile=$tmpdir/ssh_host_ecdsa_key
ssh-keygen -q -t ecdsa -N "" -C "" -f $keyfile
Create the user-data script that will set the ssh host key.
userdatafile=$tmpdir/set-ssh-host-key.user-data
cat <<EOF >$userdatafile
#!/bin/bash -xeu
cat <<EOKEY >/etc/ssh/ssh_host_ecdsa_key
$(cat $keyfile)
EOKEY
cat <<EOKEY >/etc/ssh/ssh_host_ecdsa_key.pub
$(cat $keyfile.pub)
EOKEY
EOF
Run an EC2 instance, say Ubuntu 11.10 Oneiric, passing in the user-data script. Make a note of the new instance id.
ec2-run-instances --key $USER --user-data-file $userdatafile ami-4dad7424
instanceid=i-...
Wait for the instance to get a public DNS name and make a note of it.
ec2-describe-instances $instanceid
host=ec2-...compute-1.amazonaws.com
Add new public ssh host key to our local ssh known_hosts after removing any leftover key (e.g., from previous EC2 instance at same IP address).
knownhosts=$HOME/.ssh/known_hosts
ssh-keygen -R $host -f $knownhosts
ssh-keygen -R $(dig +short $host) -f $knownhosts
(
  echo -n "$host "; cat $keyfile.pub
  echo -n "$(dig +short $host) "; cat $keyfile.pub
) >> $knownhosts
When the instance starts running and the user-data script has executed, you can ssh in to the server without being prompted to verify the fingerprint.
Don’t forget to clean up and to terminate your test instance.
rm -rf $tmpdir
ec2-terminate-instances $instanceid
There is one big drawback in the above sample implementation of this approach. We have placed secret information (the private ssh host key) into the EC2 user-data, which I generally recommend against.
Any user who can log in to the instance or who can cause the instance to request a URL and get the output, can retrieve the user-data. You might think this is unlikely to happen, but I’d rather avoid or minimize unnecessary risk.
In a production implementation of this approach, I would take steps like the following:
- Upload the new ssh host key to S3 in a private object.
- Generate an authenticated URL to the S3 object and have that URL expire in, say, 10 minutes.
- In the user-data script, download the ssh host key with the authenticated, expiring S3 URL.
Now, there is a short window of exposure and you don’t have to worry about protecting the user-data after the URL has expired.
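The three steps above might look something like the following. This is a rough sketch, not a tested production setup: make_user_data is a hypothetical helper name, the object name in the bucket is arbitrary, and it assumes an s3cmd version that provides the signurl command and a bucket of your own named in $bucket. Deriving the public key on the instance with ssh-keygen -y is my own substitution so that only one object needs to be uploaded.

```shell
#!/bin/sh
# Hypothetical helper: print a user-data script that downloads the private
# ssh host key from the given signed, expiring URL and derives the public
# key from it on the instance.
make_user_data() {
    cat <<EOF
#!/bin/bash -xeu
umask 077
wget -q -O /etc/ssh/ssh_host_ecdsa_key '$1'
ssh-keygen -y -f /etc/ssh/ssh_host_ecdsa_key >/etc/ssh/ssh_host_ecdsa_key.pub
chmod 644 /etc/ssh/ssh_host_ecdsa_key.pub
EOF
}

# Usage (not run here; $keyfile and $userdatafile as in the steps above):
#   s3cmd put "$keyfile" "s3://$bucket/ssh_host_ecdsa_key"       # no --acl-public
#   url=$(s3cmd signurl "s3://$bucket/ssh_host_ecdsa_key" +600)  # valid for 10 minutes
#   make_user_data "$url" >"$userdatafile"
```

The user-data now holds only a URL that stops working after ten minutes, rather than the private key itself.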
FUSE-based file system backed by Amazon S3.
S3fs is a FUSE filesystem that allows you to mount an Amazon S3 bucket as a local filesystem. It doesn’t store anything on the EC2 instance itself, but users can access the data on S3 from the instance as if it were a network drive attached to it.
The s3fs-fuse project is written in C++ and backed by Amazon’s Simple Storage Service (S3). Amazon offers an open API for building applications on top of this service, which several companies have done using a variety of interfaces (web, rsync, FUSE, etc.).
These steps are specific to an Ubuntu Server.
1. Launch an Ubuntu Server on AWS EC2. (Recommended AMI: ami-4205e72b, username: ubuntu)
2. Log in to the server using WinSCP / PuTTY.
3. Type the command below to update the existing libraries on the server.
sudo apt-get update
4. Type the command below to upgrade the libraries. If you are prompted, answer ‘y’ or ‘OK’ as applicable.
sudo apt-get upgrade
Once the upgrade is complete, install the libraries needed for FUSE with the following command:
sudo aptitude install build-essential libcurl4-openssl-dev libxml2-dev libfuse-dev comerr-dev libfuse2 libidn11-dev libkadm55 libkrb5-dev libldap2-dev libselinux1-dev libsepol1-dev pkg-config fuse-utils sshfs
If you are prompted, answer ‘y’ or ‘OK’ as applicable.
5. Once all the packages are installed, download the s3fs source (Revision 177 as of this writing) from the Google Code project:
6. Untar and install the s3fs binary (run each command individually):
tar xzvf s3fs-r177-source.tar.gz
sudo make install
7. In order to use the allow_other option (see below) you will need to modify the fuse configuration:
sudo vi /etc/fuse.conf
And uncomment the following line in the conf file (to uncomment a line, remove the ‘#’ symbol):
user_allow_other
Save the file and exit vi by pressing Esc and then typing ‘:wq’.
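If you would rather not edit the file by hand, the same change can be scripted. This is a minimal sketch, assuming the line that the allow_other option needs is user_allow_other and that it appears commented out on a line of its own; the function name enable_user_allow_other is made up for illustration.

```shell
#!/bin/sh
# Uncomment the user_allow_other line in the given fuse configuration
# file, leaving every other line untouched.
enable_user_allow_other() {
    tmp=$(mktemp) || return 1
    sed 's/^[[:space:]]*#[[:space:]]*user_allow_other[[:space:]]*$/user_allow_other/' "$1" >"$tmp" &&
        cat "$tmp" >"$1"   # write back in place so ownership and mode are kept
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (not run here; must be run as root for /etc/fuse.conf):
#   enable_user_allow_other /etc/fuse.conf
```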
8. Now you can mount an S3 bucket. Create a directory using the command:
sudo mkdir -p /mnt/s3
Mount the bucket on the directory you just created:
sudo s3fs bucketname -o accessKeyId=XXX -o secretAccessKey=YYY -o use_cache=/tmp -o allow_other /mnt/s3
Replace the XXX above with your real Amazon Access Key and YYY with your real Secret Key.
The command also tells s3fs to cache the bucket’s files locally (in /tmp) and to allow other users to manipulate files in the mount.
Now any files written to /mnt/s3 will be replicated to your Amazon S3 bucket.
WinScp – Verify mount directory
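The mount can also be checked from the shell rather than through WinSCP. The following is a small sketch, assuming a Linux system where the mounts table is readable at /proc/mounts; is_mounted is a made-up helper name.

```shell
#!/bin/sh
# Succeed if the given directory is listed as a mount point in the mounts
# table (optional second argument; defaults to /proc/mounts).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage (not run here):
#   is_mounted /mnt/s3 && echo "bucket is mounted on /mnt/s3"
```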
Check the wiki documentation for more options available to s3fs, including how to save your Access Key and Secret Key in /etc/passwd-s3fs.
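As a rough illustration of the password file, the sketch below writes the two keys in the accessKeyId:secretAccessKey form and restricts the file to its owner. The helper name write_s3fs_passwd is made up, and the exact format and permissions expected may differ between s3fs versions, so treat this as an assumption to check against the wiki.

```shell
#!/bin/sh
# Hypothetical helper: write an s3fs credentials file containing
# "accessKeyId:secretAccessKey" and make it readable by its owner only.
# Arguments: FILE ACCESS_KEY SECRET_KEY
write_s3fs_passwd() {
    ( umask 077 && printf '%s:%s\n' "$2" "$3" >"$1" ) && chmod 600 "$1"
}

# Usage (not run here; run as root, with XXX and YYY replaced by your keys):
#   write_s3fs_passwd /etc/passwd-s3fs XXX YYY
```

With the keys stored this way they no longer need to appear on the s3fs command line, where other users on the machine could see them in the process list.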