S3 Static Hosting

A static Nuxt website deployed to an AWS S3 bucket

Introduction

NuxtContent lets me write a content-centered blog post or article in Markdown. I can author with a simple text editor without fussing with HTML, CSS, etc. It even handles things like custom SEO and social sharing metadata with front-matter variables at the top of the Markdown file. The Markdown document encapsulates the key programmer principles of simplicity, flexibility, readability, and self-containment. Indeed, this blog post was written this way. Further, with the Nuxt npm run generate command I can turn my fully client-side Vue.js/Nuxt.js web application into a static website that I can deploy to a cheap storage hosting solution such as AWS S3 static website hosting.

Static Files Layout

When I author a page and then generate a static site with NuxtContent (i.e. npm run generate), it converts each Markdown document and the supporting Nuxt/Vue framework files into an HTML file, a payload file, and a directory. There are two options for the file layout:

  1. Directory (default)
  2. Flat

The output layout is controlled by the autoSubfolderIndex key in the nuxt.config.ts file.

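export default defineNuxtConfig({
  nitro: {
    prerender: {
      // false = flat layout (example.html); true (the default) = example/index.html
      autoSubfolderIndex: false,
    },
  },
})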

Directory Index.html

When autoSubfolderIndex: true (and it is true by default), the two files and one directory that get created from example.md are:

example/
example/index.html
example/_payload.json

PageName.html Output

When autoSubfolderIndex: false, the two files and one directory that get created are:

example.html
example/
example/_payload.json
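
The two layouts can be summarized as a small mapping from a page route to the files it produces. This is only an illustrative sketch: outputFiles and the Layout type are hypothetical names, not part of Nuxt or Nitro, and the root route is deliberately not handled.

```typescript
// Hypothetical helper (not a Nuxt API): lists the files one prerendered
// page produces under each autoSubfolderIndex setting described above.
type Layout = "directory" | "flat";

function outputFiles(route: string, layout: Layout): string[] {
  // Normalize "/example" or "example/" to "example".
  // The root route ("/") is out of scope for this sketch.
  const page = route.replace(/^\/+|\/+$/g, "");
  const html = layout === "directory" ? `${page}/index.html` : `${page}.html`;
  return [html, `${page}/_payload.json`];
}

console.log(outputFiles("/example", "directory")); // autoSubfolderIndex: true
console.log(outputFiles("/example", "flat"));      // autoSubfolderIndex: false
```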

S3 Static Host Issues

The AWS S3 bucket service can host static websites directly, although AWS now recommends its newer AWS Amplify Hosting capability for this. To keep hosting costs to a minimum, I use the AWS S3 service. With that choice, the following issues are relevant to static website hosting on AWS S3:

  1. S3 Object Storage
  2. S3 Web Server Behavior
  3. Trailing Slash

Object Storage is not a file system

AWS S3 is an object store, not a file system. The differences between the two are normally negligible, but they matter for static web hosting. Specifically, each file and directory is created as an object with a key name.

Directory Index.html Bucket Objects

example/index.html
example/_payload.json
example/

PageName.html Bucket Objects

example.html
example/_payload.json
example/

S3 Web Server Behavior

The S3 Web Server behavior is 'quirky' when it comes to pages and pages in directories.

Home

For the root-level URL, i.e. home, the trailing slash is optional. AWS S3 static website hosting has a built-in property specifying the HTML file to deliver, usually index.html at the root. This is set up as the 'Index document' property of static website hosting. If the user types http://<root url> or http://<root url>/ it will deliver the index.html file specified by that property.

Pages

However, for all other non root-level URLs there is a different algorithm for determining the index.html for that URL. The behavior is as described in their AWS S3 website hosting documentation 'Configuring an index document'.

Summarizing that behavior:

  1. Trailing Slash - If the non root-level URL has a trailing slash / it will return /index.html.
    • For example, for http://<root url>/photos/ it will return http://<root url>/photos/index.html.
  2. No Trailing Slash - But if the non root-level URL does not have a trailing slash, AWS S3 first looks for a page name object in the bucket (NOTE: it does not look for a page name object with an .html extension, just the page name). If not found, it searches the page name folder for an index.html document, and if that document is found in the folder, AWS S3 returns a 302 "Found but temporarily moved to a new location" message with a location pointing to the page name folder with a trailing slash. If the index document is not found in the folder, the AWS S3 static web server returns an error.
    • For example, for http://<root url>/photos, it will look for a photos object in the root bucket folder. If the photos object is not found in the root bucket folder, it searches for an index document in the page folder, i.e. the photos/index.html object. If that document is found, AWS S3 returns a 302 Found message with a location that points to the photos/ key. If the index document is not found, the AWS S3 static web server returns an error.
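
The lookup rules summarized above can be modeled in a few lines. This is a simplified sketch, not AWS code: the resolve function and Result type are hypothetical names, the bucket is modeled as a plain set of keys, and the "error" case is assumed to be a 404 (the actual error S3 returns can depend on bucket permissions).

```typescript
// Simplified model of the S3 website endpoint's lookup rules summarized above.
// Not AWS code; names and shapes are illustrative assumptions.
type Result =
  | { status: 200; key: string }
  | { status: 302; location: string }
  | { status: 404 }; // assumption: the "error" case is modeled as a 404

function resolve(keys: Set<string>, path: string, indexDoc = "index.html"): Result {
  const key = path.replace(/^\//, ""); // "/photos/" -> "photos/"

  // Root or trailing slash: serve the index document of that folder.
  if (key === "" || key.endsWith("/")) {
    const indexKey = key + indexDoc;
    return keys.has(indexKey) ? { status: 200, key: indexKey } : { status: 404 };
  }

  // No trailing slash: an object with exactly this key wins.
  if (keys.has(key)) return { status: 200, key };

  // Otherwise fall back to the folder's index document via a redirect.
  if (keys.has(`${key}/${indexDoc}`)) return { status: 302, location: `/${key}/` };

  return { status: 404 };
}

const bucket = new Set(["index.html", "photos/index.html", "photos/_payload.json"]);
console.log(resolve(bucket, "/photos/")); // 200, key "photos/index.html"
console.log(resolve(bucket, "/photos"));  // 302, location "/photos/"
console.log(resolve(bucket, "/missing")); // 404
```

Note the second branch: an object whose key is exactly the page name is served directly with a 200, before any folder lookup happens.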

302 vs. 301

Why the AWS S3 static web server uses a 302 redirect instead of a 301 Moved Permanently redirect is not clear. A 301 would help web crawlers like Google or Bing register that photos should be photos/. For example, an HTTP GET at pennockprojects.com/photos returns a 302 status code with a location redirecting to pennockprojects.com/photos/.

One can implement a solution to change 302 to 301 by using a Lambda function for CloudFront Edge requests. See this blog for details.
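
As a rough idea of what such a function might look like (this is not taken from the linked blog), here is a minimal sketch. It assumes a CloudFront distribution in front of the S3 website endpoint and a Lambda@Edge function attached to the origin-response trigger. The two interfaces are hand-written stand-ins that cover only the event fields used here.

```typescript
// Minimal structural types for the parts of the CloudFront event used here.
// These are hand-written assumptions for the sketch, not imported AWS types.
interface CfResponse {
  status: string; // Lambda@Edge represents the status code as a string
  statusDescription: string;
  headers: Record<string, { key?: string; value: string }[]>;
}

interface CfResponseEvent {
  Records: { cf: { response: CfResponse } }[];
}

// Sketch of an origin-response handler: rewrites S3's 302 into a 301.
export const handler = async (event: CfResponseEvent): Promise<CfResponse> => {
  const response = event.Records[0].cf.response;
  if (response.status === "302") {
    response.status = "301";
    response.statusDescription = "Moved Permanently";
  }
  // The Location header set by S3 is passed through unchanged.
  return response;
};
```

Every other response, including 200s and errors, passes through untouched.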

Trailing Slash is Trash

While a site will generally work when you employ the Nuxt Static Site Generation Directory Index.html and host it on Amazon AWS S3, there are significant drawbacks.

  1. All your SEO pages will have a trailing slash, which looks obsolete.
  2. All your direct page URLs load slower if the trailing slash is omitted, because of the 302 redirect.
    • In testing, a page without a trailing slash required 77 network calls, whereas the same page with a trailing slash required only 56.
  3. Social 'share' buttons and users copying the website URL from the browser bar will not include a trailing slash. If that link is embedded into a social media post, the social media platform crawler does not get a 200 response and won't preview the page.
    • For example, an X/Twitter tweet/post/DM will not provide a site preview in the composer.
  4. Sitemap generation, internal NuxtLink and <a href=''></a> elements won't contain the ending slash either, which will confuse crawlers with constant 302 redirects as they crawl.

The lack of easy social share and user url posting (last two items above) is a non-starter for content heavy sites like a blog or articles site.

X Validator Issues

The X/Twitter card validator tool shows the difference:

Twitter Card Validator - Warning

A static site URL without a trailing slash generates warnings

Twitter Card Validator - Clean

A static site URL + trailing slash gives no warning

An Experiment

Not wanting to move to AWS Amplify for a static site, I wrestled with this issue for my own blog and for my JAMStart project until I discovered a way forward.

Trying PageName.html Output

In both Nuxt Static Site Generation output options (Directory Index.html or PageName.html) the HTML document has an .html extension. However, in the SSG PageName.html output option, the HTML document is at the right level (not in a folder) and has the page name. For example, here are the generated files for the projects page using PageName.html output.

projects.html
projects/
projects/_payload.json

Problems with Vanilla PageName.html

There are problems with PageName.html output as well. Specifically, direct page URLs never resolved when hosted on AWS S3. They did resolve if you added the .html extension, i.e. https://pennockprojects.com/projects.html worked whereas https://pennockprojects.com/projects did not.

AWS S3 Object Storage

As I read about how AWS S3 website hosting handles index documents and folders, the first item of what happens for a page without a trailing slash jumped out at me.

if you exclude the trailing slash from the preceding URL, Amazon S3 first looks for an object (emphasis mine) photos in the bucket

What if I renamed the projects.html to projects?

When an HTTP GET for /projects is served, the AWS S3 static web server will find the projects object first.

This is only possible because, unlike a file system where you can't have both a projects file and a projects/ directory, an object store allows both, since they are technically two different key names. I manually renamed the HTML document.

projects
projects/
projects/_payload.json

Amazon S3 Bucket Directory and HTML same name

Amazon S3 Bucket Objects with Page Name and Page Name Folder
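
To make the renaming trick concrete, the bucket can be pictured as a flat set of key strings in which "projects" and "projects/" are simply different names. The sketch below is an illustration only; bucketKeys, hasExactObject and keysUnderPrefix are hypothetical names, not an AWS API.

```typescript
// A bucket modeled as a flat set of keys; there are no real directories.
const bucketKeys = new Set<string>([
  "index.html",
  "projects",               // the renamed HTML document
  "projects/",              // the "folder" object, just another key
  "projects/_payload.json", // payload stored under the "projects/" prefix
]);

// First step of the no-trailing-slash lookup: is there an object
// whose key is exactly the requested page name?
function hasExactObject(keys: Set<string>, urlPath: string): boolean {
  return keys.has(urlPath.replace(/^\//, ""));
}

// Keys sharing the "folder" prefix, which is only a naming convention.
function keysUnderPrefix(keys: Set<string>, prefix: string): string[] {
  return Array.from(keys).filter((k) => k.startsWith(prefix));
}

console.log(hasExactObject(bucketKeys, "/projects"));  // exact key exists
console.log(keysUnderPrefix(bucketKeys, "projects/")); // keys under the prefix
```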

Success

This Worked!

  • Direct page URLs were clean with no trailing slash.
  • An HTTP GET returned a 200.
  • The full-content Nuxt app loaded perfectly and quickly.
  • The X/Twitter validator and social posting were clean, with SEO and social meta tags intact.

Twitter Card Validator No Trailing Slash Clean

A static site URL without a trailing slash returns a clean 200

Next Steps

To operationalize this experiment, the manual steps need to be removed. Specifically:

  1. Use the Nuxt SSG PageName.html layout option for the build step.
  2. Remove the .html extensions after the deploy step, once the files have been converted to S3 objects.
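
One possible shape for the second step is a small mapping from each generated file path to the S3 key it should end up under. This is a sketch under stated assumptions, not a finished deploy script. toUploadTarget, UploadTarget and KEEP_EXTENSION are hypothetical names. It assumes the root index.html keeps its name, because the bucket's 'Index document' property points at it. It also assumes an extensionless object needs an explicit text/html content type, since upload tools typically guess the type from the file extension.

```typescript
// Hypothetical mapping from a generated file path to the S3 key and
// content type to store it under. Not part of Nuxt or the AWS SDK.
interface UploadTarget {
  key: string;
  contentType?: string; // only set when it can no longer be guessed from the key
}

// Assumption: the root index document keeps its extension, because the
// bucket's 'Index document' property refers to index.html by name.
const KEEP_EXTENSION = new Set<string>(["index.html"]);

function toUploadTarget(relativePath: string): UploadTarget {
  if (relativePath.endsWith(".html") && !KEEP_EXTENSION.has(relativePath)) {
    return {
      key: relativePath.slice(0, -".html".length), // "projects.html" -> "projects"
      contentType: "text/html", // assumption: must be explicit without an extension
    };
  }
  return { key: relativePath }; // payloads, assets, and index.html are unchanged
}

console.log(toUploadTarget("projects.html"));
console.log(toUploadTarget("projects/_payload.json"));
console.log(toUploadTarget("index.html"));
```

The same mapping could drive either an upload under the new key or a copy-and-delete of objects that are already in the bucket.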

Stay Tuned!