Nuxt Static Site S3 Fix

nuxtss-s3-fix is a CICD tool that optimizes the HTML page objects of a Nuxt.js static site to work with Amazon AWS S3 bucket static web site hosting. These optimizations improve:
- Social Media sharing shows article images and meta data.
- SEO search crawlers record the page meta data
- Direct page loading times
In my testing, I built my sites using the Nuxt.js flat static layout, then ran the commands generated by the tool, which arranged the HTML page objects into the same as directory layout. After applying the fix, I verified that social media sharing and SEO crawlers worked as expected, and I found that direct page loading was faster. I observed no negative effects of this arrangement, but your mileage may vary.
The tool can be used in a local CLI environment or in a CICD pipeline script. It generates AWS S3 CLI commands that you can run in your AWS S3 CLI context. The tool does not make any changes to your S3 bucket; it only generates the commands for you to run.
Prerequisites
For the tool to produce appropriate AWS CLI S3 commands, you must provide the following:
- An AWS S3 bucket containing all the objects as built by the Nuxt Static Site generator using npm run generate and the autoSubfolderIndex setting in the nuxt.config.js file.
  - autoSubfolderIndex: true = index
  - autoSubfolderIndex: false = flat
- By default, it expects an accurate sitemap.xml object in the bucket root describing the page HTML objects. If you use a sitemap Nuxt.js module, such as @nuxtjs/sitemap, this file gets generated automatically.
  - Optionally, you can specify a different sitemap.xml file using the --sitemap-file option. This file can be a local file, an https: file, or an object in another S3 bucket.
- The tool needs an inherited AWS S3 CLI context with the getObject permission for the site S3 bucket.
  - Important: nuxtss-s3-fix exclusively uses the AWS S3 CLI context in its execution space and does not store or request AWS S3 permissions.
  - If you are running the tool in a CICD pipeline action, ensure that the action has the appropriate AWS S3 context permissions.
Static Site HTML Page Layouts
When building a Nuxt Static Site for AWS S3 bucket hosting, the HTML object for each page (we'll use a page named example) will be in one of two layout arrangements depending on the autoSubfolderIndex setting in the nuxt.config.js file.
- In the flat layout (autoSubfolderIndex: false) the HTML page object has a .html extension and is a peer next to a directory with the page name.
  - example/
  - example/_payload.json
  - example.html - flat HTML page object
- In the index layout (autoSubfolderIndex: true) the HTML page object is placed in the <page> directory with the name index.html. The page generates these files and bucket objects:
  - example/
  - example/_payload.json
  - example/index.html - index HTML page object
- In the same as directory layout, which is the ideal arrangement for S3 static web site hosting, the HTML page object has no extension and is named the same as the directory object. A directory and an object with the same name are only legal within an AWS S3 bucket; this is not allowed in most file systems. The ideal arrangement of the S3 bucket objects for each page looks like this:
  - example/
  - example/_payload.json
  - example - same as directory HTML page object
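The three layouts above can be summarized as a small mapping from a page name to its bucket object keys. This is only an illustrative sketch: the function name and the layout labels ("flat", "index", "same-as-directory") are hypothetical and are not part of the tool. The example/ directory is omitted because in S3 it is only a key prefix.

```python
def page_objects(page: str, layout: str) -> list[str]:
    """Return the bucket object keys for one page under a given layout.

    The layout labels are hypothetical names used only for this sketch.
    """
    html_key = {
        "flat": f"{page}.html",          # autoSubfolderIndex: false
        "index": f"{page}/index.html",   # autoSubfolderIndex: true
        "same-as-directory": page,       # arrangement after applying the fix
    }[layout]
    # The payload object sits under the page prefix in every layout.
    return [f"{page}/_payload.json", html_key]


for layout in ("flat", "index", "same-as-directory"):
    print(layout, page_objects("example", layout))
```

Only the HTML page object key differs between the layouts; the payload object stays where it is.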
Flat or Index Problems
When using a Nuxt static site on an Amazon S3 bucket, the flat and index arrangements have the following issues:
- Social Media sharing of a page URL does not show the article image or meta data.
- SEO search crawlers do not record the page meta data.
- Direct page loading times are slower than they could be.
More details about this can be found at: AWS S3 Configuring an index document and Blog on S3 Static Hosting Issues
S3 Web Server Quirks
The key points of the S3 Web Server quirky behavior are:
- For each page URL request with a trailing slash, i.e. photo/, the AWS S3 Web Server will return
  1. 200 OK if s3://bucket/photo/index.html exists
  2. 404 Not Found if not found
- For each page URL request without a trailing slash, i.e. photo, the AWS S3 Web Server will return
  1. 200 OK if s3://bucket/photo exists
  2. 302 Temporarily Moved if s3://bucket/photo/index.html exists (it will redirect to the URL with the trailing slash, i.e. http://example.com/photo/, which causes a reload)
  3. 404 Not Found if not found
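The rules above can be modeled in a few lines to check how a given bucket layout responds to a direct page request. This is a simplified sketch of the behavior as described in this section, not an implementation of the S3 web server: the function name is hypothetical, the index document is assumed to be index.html, and home page requests and error documents are not modeled.

```python
from typing import Optional


def s3_website_response(keys: set[str], path: str) -> tuple[int, Optional[str]]:
    """Return (status code, redirect target) for a page request.

    `keys` is the set of object keys in the bucket; `path` is the request
    path without the leading slash, e.g. "photo" or "photo/".
    """
    if path.endswith("/"):
        # Trailing slash: only the index document can satisfy the request.
        if f"{path}index.html" in keys:
            return 200, None
        return 404, None
    if path in keys:
        # An object with exactly this key exists.
        return 200, None
    if f"{path}/index.html" in keys:
        # Redirect to the URL with the trailing slash.
        return 302, f"{path}/"
    return 404, None


flat = {"photo.html", "photo/_payload.json"}
index = {"photo/index.html", "photo/_payload.json"}
same_as_directory = {"photo", "photo/_payload.json"}

for name, keys in [("flat", flat), ("index", index), ("same as directory", same_as_directory)]:
    print(name, s3_website_response(keys, "photo"), s3_website_response(keys, "photo/"))
```

Only the same as directory layout answers the photo request with a 200 OK and no redirect.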
Router Navigation vs Direct Load
Once the Nuxt.js web app is loaded from the home page, the Vue.js router will handle any further navigation within the app and update the URL directly. This means that if a user starts at the home page http://example.com and then clicks on a link to the photo page, the URL will be updated to http://example.com/photo and the Nuxt.js app knows how to load that page HTML and content without interacting with the S3 site web hosting quirkiness. It works as expected.
However, when the page URL is directly requested, i.e. http://example.com/photo or http://example.com/photo/, the S3 web server quirkiness comes into play. The page HTML object has all the information it needs to load the Nuxt.js web app and display the proper page information, but the S3 web server has to deliver that page HTML object first. This is where the problems with the flat and index layouts arise.
Flat Direct Load Issues
When the Nuxt static site is built using the flat layout, each S3 bucket HTML page object is named <page>.html. This means that any URL request (except home) that doesn't include the .html extension, regardless of the trailing slash, will get a 404. If the URL request includes the .html extension, it will work, but that is not a user-friendly URL.
For example:
- If a photo URL request arrives, the AWS S3 Web Server will not find a photo HTML object or a photo/index.html object and will return a 404 Not Found.
- If a photo/ URL request arrives, the AWS S3 Web Server will not find a photo/index.html HTML object and will return a 404 Not Found.
- If a photo.html URL request arrives, the AWS S3 Web Server will find the photo.html HTML object and return a 200 OK.
Index Direct Load Issues
When the Nuxt static site is built using the index layout, each S3 bucket HTML page object is named <page>/index.html. If the URL request does not include the trailing slash, it will work, but it will receive a 302 Temporarily Moved redirect to the page with the trailing slash, which causes a reload. If the URL request includes the trailing slash, it will receive a 200 OK.
For example:
- If a photo URL request arrives, the AWS S3 Web Server will not find a photo HTML object, but will find the photo/index.html object and return a 302 Temporarily Moved to photo/.
- If a photo/ URL request arrives, the AWS S3 Web Server will find the photo/index.html HTML object and return a 200 OK.
Generally the index layout is better than the flat layout, because the Nuxt app will usually load. The 302 redirect is not ideal, though: it makes your site less useful to users for sharing, and the redirect can slow down the page load time.
Same As Directory Direct Load
When the Nuxt static site is optimized by the nuxtss-s3-fix tool into the same as directory layout, each S3 bucket HTML page object is named <page>. If the URL request does not include the trailing slash, it will get a 200 OK. If the URL request includes the trailing slash, it will get a 404 Not Found, but that is a logical response.
For example:
- If a photo URL request arrives, the AWS S3 Web Server will find a photo HTML object and return a 200 OK.
- If a photo/ URL request arrives, the AWS S3 Web Server will not find a photo/index.html HTML object and will return a 404 Not Found.
The natural user expectation is that photo is the page and photo/ is a directory, and the 404 Not Found response is logical.
Avoiding 404 Not Found for URL with trailing slash
If you want to avoid the 404 Not Found response for URLs with trailing slashes in the Same As Directory layout, you can duplicate the HTML page object so that it exists at both photo and photo/index.html. This seems like overkill, in my opinion, but if you need or prefer this, the procedure is:
- Generate an index Nuxt.js web app and deploy it to your bucket. This will contain <page>/index.html HTML bucket objects.
- Run the tool just once for the copy commands to copy them to the <page> HTML bucket object and never remove the <page>/index.html HTML bucket objects.
This way either URL request will work, but you are duplicating each HTML page object.
Tool Operation
The nuxtss-s3-fix tool will generate commands to copy HTML Page objects into the Same as Directory layout and generate commands to remove duplicate Index or Flat HTML Page objects.
The sequence that the tool was designed to operate in is:
- Deploy a Nuxt.js static site (either generated with the Index (recommended) or Flat layout) to an AWS S3 bucket
- Run the tool to generate the copy commands.
- Execute the copy commands in your AWS S3 CLI context.
- Verify that the site works as expected.
- Run the tool to generate the remove commands.
- Execute the remove commands in your AWS S3 CLI context.
- Verify that the site works as expected.
The tool will not generate a copy command for a page if a Same as Directory HTML page object already exists. This allows you to re-run the tool to generate copy commands for any pages that may have failed to copy in a previous run. For example, if you have a large number of pages to copy and your network connection is unstable, you can run the tool again to generate copy commands for only the pages that were not copied.
The tool will not generate a remove command for the Index or Flat HTML pages unless the Same as Directory HTML page object already exists. This allows you to re-run the tool to generate remove commands for any pages that may have failed to be removed. For safety, the tool will not generate remove commands unless HTML page duplication exists. If only one copy of the page exists, it will not be removed.
This means that you can run the tool multiple times to ensure that all pages are copied and removed successfully.
For example, if you have 100 pages to copy, you run the tool to generate the copy commands and execute them. You verify that the site works as expected. You then run the tool again to generate the remove commands and execute them. If there were any errors in the remove commands, you can re-run the tool to generate remove commands for only the pages that were not removed.
The tool will read the sitemap.xml file in the root of the S3 bucket to determine the paths that need to be fixed. It will then check for the existence of the current page HTML object and the new <page> object. Based on this information it will generate the appropriate AWS S3 CLI commands to perform the copy and remove operations.
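As an illustration of the first step, the sketch below extracts page paths from a sitemap.xml document using only the Python standard library. It is not the tool's actual code: the function name is hypothetical, the sitemap is assumed to use the standard sitemaps.org namespace, and skipping the home page is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_page_paths(sitemap_xml: str) -> list[str]:
    """Return the page paths (no leading or trailing slash) listed in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    paths = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        path = urlparse((loc.text or "").strip()).path.strip("/")
        if path:  # assumption: the home page (empty path) needs no fix
            paths.append(path)
    return paths


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/photo</loc></url>
  <url><loc>https://example.com/blog/post/</loc></url>
</urlset>"""

print(sitemap_page_paths(SAMPLE))
```

Each returned path can then be checked against the bucket for the current page HTML object and the new <page> object.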
Copy Process
The Copy process will generate copy commands for either Index <page>/index.html or Flat <page>.html S3 page objects to Same as Directory S3 page object <page>.
It will only generate copy commands for paths found in the sitemap.xml that correspond to a Nuxt bucket object in the flat or index arrangements when there is no existing object at the new <page> location. It skips paths that do not have a corresponding HTML object in the S3 bucket. This means that if you have a path in the sitemap.xml that is not a page, such as an image or other asset, no copy command will be generated.
If there were errors when executing the copy commands, you can rerun the tool and generate a new set of copy commands that only apply to the ones that were not copied.
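The copy rules can be sketched as a small decision function. This is an illustration under stated assumptions, not the tool's implementation: the function name and the bucket name are hypothetical, the set of existing keys is passed in rather than read from S3, and checking the index object before the flat object is an arbitrary choice for this sketch.

```python
from typing import Optional


def copy_command(bucket: str, page: str, keys: set[str]) -> Optional[str]:
    """Return the copy command for one sitemap path, or None if none applies."""
    if page in keys:
        # The same as directory object already exists: nothing to copy.
        return None
    # Assumption: prefer the index object when both arrangements exist.
    for source in (f"{page}/index.html", f"{page}.html"):
        if source in keys:
            return f"aws s3 cp s3://{bucket}/{source} s3://{bucket}/{page}"
    # No flat or index HTML object: the path is not a page, so skip it.
    return None


keys = {"example.html", "example/_payload.json", "photo", "photo.html"}
for page in ("example", "photo", "missing"):
    print(page, "->", copy_command("my-bucket", page, keys))
```

Because an existing <page> object suppresses the command, running this again after a partially failed copy yields commands only for the pages that still need copying.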
Remove Process
The Remove Process will only generate remove commands for
- paths found in the sitemap.xml that correspond to a Nuxt bucket object in the flat or index arrangements
- when there is an existing object at the new <page> location.
If there were errors when executing the remove commands, you can re-run the tool and generate a new set of remove commands that only apply to the ones that were not removed.
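The remove rules mirror the copy rules and can be sketched the same way. As before, this is only an illustration: the function name and bucket name are hypothetical, and the set of existing keys is passed in rather than read from S3.

```python
def remove_commands(bucket: str, page: str, keys: set[str]) -> list[str]:
    """Return the remove commands for one sitemap path (possibly none)."""
    if page not in keys:
        # Safety rule: never remove the only copy of a page.
        return []
    return [
        f"aws s3 rm s3://{bucket}/{source}"
        for source in (f"{page}/index.html", f"{page}.html")
        if source in keys
    ]


keys = {"example", "example.html", "photo.html"}
print(remove_commands("my-bucket", "example", keys))  # duplicate exists
print(remove_commands("my-bucket", "photo", keys))    # not copied yet
```

A page that has not been copied yet produces no remove command, so the remove step cannot delete the last remaining HTML object for a page.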
Flat Example
The new arrangement of the S3 bucket objects uses the feature of Amazon S3 objects where a directory object and a file object can have the same name at the same level. The ideal arrangement of bucket objects for each page looks like this:
- example
- example/
- example/_payload.json
Flat Site commands
The AWS S3 CLI commands created for each flat page:
- copy command: aws s3 cp s3://<bucket>/example.html s3://<bucket>/example
- delete command: aws s3 rm s3://<bucket>/example.html
Index in Folder Commands
The AWS S3 CLI commands created for each 'Index in Folder' page:
- copy command: aws s3 cp s3://<bucket>/example/index.html s3://<bucket>/example
- delete command: aws s3 rm s3://<bucket>/example/index.html
Delete considerations
Using only the tool's copy commands, the Nuxt static site will operate normally. The existing .html file does not cause any problems that I have found, but you are duplicating each page HTML file.
The delete commands remove the original page .html that was copied. They should only be run if the copy was successful.
Program Requirements
- Input S3 bucket location
- AWS S3 context with getObject permission to the S3 bucket
- sitemap.xml in root of bucket.
- CLI tool for command-line or from CICD script
- For Nuxt Static Flat site <page>.html => <page>
- For Nuxt Static Subfolder site <page>/index.html => <page>
- Option for removing the original .html file
- Validate each sitemap path entry to a distinct <page>.html file
- delete command only generated if there is a new <page> object