Benjamin Esham

Building and hosting a static site with Jekyll and Amazon S3 on Mac OS X

Welcome to my new site! In this inaugural article, I’m going to describe how I set up the Jekyll static site generator on Mac OS X and how I configured Amazon S3 to host the website. Along the way I’ll cover installing Ruby with RVM, installing Jekyll, configuring an S3 bucket, uploading the site, and setting up DNS.

There were a number of quirks and unexpected behaviors I ran into when I went through this process, so my hope in writing this article is that you’ll be able to avoid those annoyances.

Background

A couple of months ago I wanted a place to host my PGP key signing policy, so I set up a very basic GitHub Pages–powered site that consisted of nothing but that and a list of my GitHub projects. GitHub Pages has built-in support for Jekyll, a static site generator written in Ruby, so I adapted the site to use Jekyll — more out of academic interest than any real need, with only two pages on my site!

Eventually I capitulated and decided to create a full-fledged website for myself, mostly to be a repository for physics things I’d like to publish and as a “focal point for my online identity” (read: list of links to social-networking sites). Ironically, as I used Jekyll more and more heavily I decided that Amazon S3 would be a better choice as a hosting service.

First things first: Git

Each website hosted on GitHub Pages is (naturally) stored in a Git repository. Although this is obviously not necessary if you’re not using GitHub Pages, I’d strongly recommend keeping any website you develop in a Git (or other SCM) repository. It’ll allow you to view previous versions of your site, split tentative new features or designs into independent branches, and generally keep better track of what you have and what you’ve been doing with it.

Installing and using Git is outside the scope of this article, but briefly: on Mac OS X you can install it easily with Fink or Homebrew, and you can find good tutorials at Git Immersion, Git Magic, and GitHub itself.

RVM and tcsh

I didn’t (and don’t) have much familiarity with Ruby, but it seems like the Ruby community has a really well-polished and universally used package manager called RubyGems. It also seems like Apple has made this tool kind of worthless by botching the default Ruby configuration in Mac OS X. The Apple-supplied version of RubyGems is too old to install Jekyll, and after I updated RubyGems and installed Jekyll the resulting setup was weird: running Jekyll (or any other Gem, like SASS) required sudo, and consequently the generated website was owned by root. These were not huge issues, but eventually I found a way to avoid them altogether.

RVM, the Ruby Version Manager, was originally intended to help developers test their Ruby code against different versions of Ruby and different combinations of Gems.[1] It seems to have since become the preferred way to install Ruby no matter what you’re doing, and it’s the way I managed to get Ruby installed on my machine.

Following the instructions on the RVM site, run the following command in your shell:

\curl -sSL https://get.rvm.io | bash -s stable

(The backslash ensures that curl will be called in a vanilla configuration, even if “curl” is normally aliased to something else.) Next, copy the contents of this Gist into a file named “rvm.rb” that lives somewhere in your PATH. I have a “bin” directory right in my home folder so I put it there; you could also use /usr/local/bin or even /usr/bin. Make the file executable using chmod u+x rvm.rb.

Finally, put this line in your tcshrc:

alias rvm 'eval `~/bin/rvm.rb \!*`'

Change the path to “rvm.rb” to reflect where you put the file. Once you restart your shell, the “rvm” command will work exactly the way it’s supposed to.

Ruby

Now that RVM is installed, we can install the latest version of Ruby (as of this writing) with

rvm install 2.1

Now you can type

rvm use 2.1

to enter the “RVM environment” — any Ruby commands you run (ruby, irb, gem, etc.) will take place within the context of this environment. That means they’ll use the version of Ruby you just installed. It also means that any Gems you install within this environment will not be visible to the system-provided Ruby. (Those Gems will still be there, however, when you leave and reenter the RVM environment.)

Note that you’ll need to run “rvm use 2.1” every time you open a new shell if you want to take advantage of the Ruby and Gems you’ve installed. You might want to put that command in your tcshrc so you don’t have to type it every time.

Finally, Jekyll

Phew! We’re finally set up to install Jekyll. Fortunately it’s now as simple as

gem install jekyll

at the command line. With that done, I won’t belabor the care and feeding of Jekyll itself, but its documentation is excellent, and in particular I found it helpful to look at the source code of sites that use Jekyll already.

Setting up S3

Disclaimer: Amazon S3 costs money. There is a free usage tier, but please don’t blindly follow these instructions without knowing what you’re signing up for.

So now you’ve got a website, courtesy of Jekyll, and you need to host it somewhere. As I said, I started at GitHub Pages, but after a while I became accustomed to using my own custom Jekyll plugins, which GitHub Pages won’t run. (For example, I use the SASS stylesheet language, which I highly recommend, and I use the converter from this Gist to have Jekyll run SASS automatically.) A Jekyll-generated site consists only of static files, and the cheapest and most reliable host I found for this purpose was the Amazon Simple Storage Service (S3). (Although some of Amazon’s Web Services offerings have had notable outages in the past, S3 is used by such big names as Dropbox and Tumblr as their underlying storage service.) Thanks to some changes Amazon made at the beginning of 2011, S3 is almost perfect for this purpose.

To set up S3 as your web host you’ll need to do the following:

  1. Create a bucket to hold your website.
  2. Set the bucket permissions so that your website is properly accessible.
  3. Tell S3 that this bucket is supposed to be a website.

To use Amazon S3 you’ll need an Amazon Web Services account, which shares credentials with your regular Amazon account but requires an extra signup procedure. Once you’ve gone through that, log in to the “AWS Management Console” at the preceding link, and then go to the “Amazon S3” tab. Click on “Create Bucket” and create a bucket whose name is your soon-to-be website’s fully qualified domain name: mine is www.bdesham.info, for instance; yours might be www.foo.com or web.xyzzy.com. For reasons we’ll get to later, if you want an “apex” or “naked” domain name like example.com, you need to use the bucket name www.example.com instead; later you’ll be able to get rid of the “www”.

Next you need to set the bucket permissions. By default, the contents of S3 buckets are viewable only by their owners, but obviously a public website needs to be viewable by anyone. With your bucket selected in the Buckets list, click “Properties” and then click on “Edit bucket policy” under the “Permissions” tab that appears. Enter the following text:

{
  "Version": "2008-10-17",
  "Id": "924a2348-de0e-43aa-bb06-83adbcd1db22",
  "Statement": [
    {
      "Sid": "PublicReadForGetBucketObjects",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::www.example.com/*"
    }
  ]
}

where www.example.com is replaced with the name of your bucket. Click “Save” and “Close”.
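If you’d rather not edit the policy by hand, the substitution can be scripted. What follows is only a minimal Python sketch: the function name public_read_policy is made up for illustration, and it leaves out the “Id” field on the assumption that S3 treats it as optional. The rest mirrors the policy text above.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Return bucket-policy JSON that lets anyone read objects in bucket_name."""
    policy = {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "PublicReadForGetBucketObjects",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:GetObject",
                # The trailing /* applies the rule to every object in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Prints policy text for the example bucket, ready to paste into the console.
print(public_read_policy("www.example.com"))
```

Only s3:GetObject is granted, so visitors can fetch individual files but can’t list or change the bucket’s contents.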

Finally you need to tell S3 that you’d like to use this bucket as a website. Click on “Website” at the bottom of the page and click the “Enabled” checkbox. “Index Document” is the name of the page you’d like served when someone requests a directory: for example, I’d like requests for /foobar/ on my site to return the page at /foobar/index.html, so I entered “index.html” in this field. If you have a page appropriate for serving when someone gets e.g. a 404 error, you can enter that page’s name in “Error Document”.
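To make the “Index Document” behavior concrete, here is a rough Python illustration of the mapping described above. It is a simplification rather than S3’s actual logic: the function name object_key_for is hypothetical, and it only handles the site root and paths that end in a slash.

```python
def object_key_for(path: str, index_document: str = "index.html") -> str:
    """Map a request path to the key of the object that would be served."""
    # Object keys in a bucket have no leading slash.
    key = path.lstrip("/")
    # The site root and "directory" requests get the index document appended.
    if key == "" or key.endswith("/"):
        key += index_document
    return key


for path in ["/", "/foobar/", "/foobar/page.html"]:
    print(path, "->", object_key_for(path))
```

So a request for /foobar/ ends up serving the object stored under the key foobar/index.html, while a request that already names a file is left alone.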

That does it for the configuration of your S3 bucket. While you’re in the “Website” tab, take note of the “Endpoint” URL; you’ll need that to configure your DNS later. I’ll also take this opportunity to mention that while you can manage your bucket using the AWS Management Console and upload stuff into it with jekyll-s3 (as we’ll see), the excellent Mac apps Transmit and Cyberduck will also let you explore and manage your S3 data.

Getting your stuff into S3

Now that your bucket is all configured, you need to upload your site into it. I spent hours unsuccessfully trying to do this with FUSE and rsync, but then I found jekyll-s3 (since renamed s3_website), a Gem designed expressly to push your Jekyll-generated site to S3. Follow the installation and setup instructions for that Gem. (Your S3 ID and secret can be found by going to the AWS Management Console and clicking “Account” and then “Security Credentials”.) Running jekyll build will build your site (on versions of Jekyll before 1.0, plain jekyll did this) and then s3_website push will publish it. Slick, huh?
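If you publish often, the build and the push can be chained so that a failed build never gets uploaded. This is just a sketch under a couple of assumptions: the function names are invented for illustration, the build command is assumed to be jekyll build (the form used by Jekyll 1.0 and later), and both tools are assumed to be on your PATH inside the RVM environment.

```python
import subprocess


def publish_commands():
    """Return the commands to run, in order: build the site, then upload it."""
    return [
        ["jekyll", "build"],     # assumption: Jekyll 1.0+, where building is a subcommand
        ["s3_website", "push"],  # the push command provided by the s3_website Gem
    ]


def publish(run=subprocess.check_call):
    """Run each command in turn, stopping at the first failure.

    check_call raises CalledProcessError on a non-zero exit status, so a
    broken build is never followed by a push.
    """
    for command in publish_commands():
        run(command)


# Usage, from the root of the Jekyll project:
#     publish()
```

The run parameter exists only so the command sequence can be checked without actually invoking Jekyll or touching S3.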

Setting up your domain

The final step is to adjust your DNS settings so that your domain name points to your S3 website. (You’ll have to do this on your domain registrar’s website, and interfaces for these vary, so my directions here are general.) The domain you’ll be pointing to is the “Endpoint” listed in the AWS Management Console.[2]

First set a CNAME record for the “www” subdomain — something like

www.example.com CNAME www.example.com.s3-website-us-east-1.amazonaws.com.

Yes, that trailing period is supposed to be there.
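For the curious, the record can be assembled from the bucket name, as in the Python sketch below. The function names are hypothetical, and the endpoint pattern is simply copied from the example above; other regions may format their endpoints differently, so treat the “Endpoint” shown in the AWS Management Console as the authority.

```python
def website_endpoint(bucket_name: str, region: str = "us-east-1") -> str:
    """Build a website endpoint hostname, following the article's example pattern."""
    return f"{bucket_name}.s3-website-{region}.amazonaws.com"


def cname_record(bucket_name: str, region: str = "us-east-1") -> str:
    """Format a CNAME record pointing the bucket's domain at its endpoint."""
    # The trailing dot marks the target as fully qualified; in zone-file
    # syntax a name without it is taken as relative to the current zone.
    return f"{bucket_name} CNAME {website_endpoint(bucket_name, region)}."


print(cname_record("www.example.com"))
```

This works only because the bucket is named after the domain: S3 uses the hostname of the incoming request to decide which bucket to serve.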

Now we need to deal with how to handle the “apex” or “naked” domain, example.com with no “www.” out front. Due to vagaries of DNS that I still don’t understand, you can’t just set a CNAME for this record (which is sometimes called “@”). In the case of Amazon S3, which isn’t designed to be a web host per se, we need to use an outside service to handle the domain apex. I found a service called WWWizer which is extremely easy to use: you don’t even need to sign up; just create a DNS record like

example.com A 174.129.25.170

and the site will automatically redirect requests for example.com to www.example.com. For a small fee, you can also have that site keep your visitors on what looks like example.com while you actually serve content from www.example.com. This seems to be the way to go if you’re set on hosting content from [what looks like] the apex domain with S3.

Obligatory “conclusion” section

At this point I’ve given instructions for most of the things you’ll need to do in order to host a Jekyll site with Amazon S3; the steps I’ve glossed over are things for which a lot of documentation is available. I hope what I have written is straightforward. I make no guarantees that I’ll have the time or knowledge to answer questions, but if you’d like to try your luck, feel free to e-mail me from the link at the bottom of the page. Thanks for reading!

  1. As I said, I’m not a Ruby expert, so I may be missing some of the subtleties of e.g. why RVM was created.

  2. Specifically, be sure you’re using a domain name like www.example.com.s3-website-us-east-1.amazonaws.com rather than just www.example.com.s3.amazonaws.com.