How to Speed Up WordPress Asset Delivery Using DigitalOcean Spaces CDN

Introduction

Implementing a CDN, or Content Delivery Network, to deliver your WordPress site’s static assets can greatly decrease your servers’ bandwidth usage as well as speed up page load times for geographically dispersed users. WordPress static assets include images, CSS stylesheets, and JavaScript files. Leveraging a system of edge servers distributed worldwide, a CDN caches copies of your site’s static assets across its network to reduce the distance between end users and this bandwidth-intensive content.

In a previous Solutions guide, How to Store WordPress Assets on DigitalOcean Spaces, we covered offloading a WordPress site’s Media Library (where images and other site content gets stored) to DigitalOcean Spaces, a highly redundant object storage service. We did this using the DigitalOcean Spaces Sync plugin, which automatically syncs WordPress uploads to your Space, allowing you to delete these files from your server and free up disk space.

In this Solutions guide, we’ll extend this procedure by rewriting Media Library asset URLs. This forces users’ browsers to download static assets directly from the DigitalOcean Spaces CDN, a geographically distributed set of cache servers optimized for delivering static content. We’ll go over how to enable the CDN for Spaces, how to rewrite links to serve your WordPress assets from the CDN, and finally how to test that your website’s assets are being correctly delivered by the CDN.

Additionally, we’ll demonstrate how to implement Media Library offload and link rewriting using two popular paid WordPress plugins: WP Offload Media and Media Library Folders Pro. You should choose the plugin that suits your production needs best.

Prerequisites

Before you begin this tutorial, you should have a running WordPress installation on top of a LAMP or LEMP stack. You should also have WP-CLI installed on your WordPress server, which you can learn to set up by following these instructions.

To offload your Media Library, you’ll need a DigitalOcean Space and an access key pair:

  • To learn how to create a Space, consult the Spaces product documentation.
  • To learn how to create an access key pair and upload files to your Space using the open source s3cmd tool, consult s3cmd 2.x Setup, also on the DigitalOcean product documentation site.

There are a few WordPress plugins that you can use to offload your WordPress assets:

  • DigitalOcean Spaces Sync is a free and open-source WordPress plugin for offloading your Media Library to a DigitalOcean Space. You can learn how to do this in How To Store WordPress Assets on DigitalOcean Spaces.
  • WP Offload Media is a paid plugin that copies files from your WordPress Media Library to DigitalOcean Spaces and rewrites URLs to serve the files from the CDN. With the Assets Pull addon, it can identify assets (CSS, JS, images, etc) used by your site (for example by WordPress themes) and also serve these from CDN.
  • Media Library Folders Pro is another paid plugin that helps you organize your Media Library assets, as well as offload them to DigitalOcean Spaces.

For testing purposes, be sure to have a modern web browser such as Google Chrome or Firefox installed on your client (e.g. laptop) computer.

Once you have a running WordPress installation and have created a DigitalOcean Space, you’re ready to enable the CDN for your Space and begin with this guide.

Enabling Spaces CDN

We’ll begin this guide by enabling the CDN for your DigitalOcean Space. This will not affect the availability of existing objects. With the CDN enabled, objects in your Space will be “pushed out” to edge caches across the content delivery network, and a new CDN endpoint URL will be made available to you. To learn more about how CDNs work, consult Using a CDN to Speed Up Static Content Delivery.

First, enable the CDN for your Space by following How to Enable the Spaces CDN.

Navigate back to your Space and reload the page. You should see a new Endpoints link under your Space name:

Endpoints Link

These endpoints should contain your Space name. We’re using wordpress-offload in this tutorial.

Notice the addition of the new Edge endpoint. This endpoint routes requests for Spaces objects through the CDN, serving them from the edge cache as much as possible. Note down this Edge endpoint, which you’ll use to configure your WordPress plugin in future steps.

Now that you have enabled the CDN for your Space, you’re ready to begin configuring your asset offload and link rewriting plugin.

If you’re using DigitalOcean Spaces Sync and continuing from How to Store WordPress Assets on DigitalOcean Spaces, begin reading from the following section. If you’re not using Spaces Sync, skip to either the WP Offload Media section or the Media Library Folders Pro section, depending on the plugin you choose to use.

Spaces Sync Plugin

If you’d like to use the free and open-source DigitalOcean Spaces Sync and CDN Enabler plugins to serve your files from the CDN’s edge caches, follow the steps outlined in this section.

We’ll begin by ensuring that our WordPress installation and Spaces Sync plugin are configured correctly and are serving assets from DigitalOcean Spaces.

Modifying Spaces Sync Plugin Configuration

Continuing from How To Store WordPress Assets on DigitalOcean Spaces, your Media Library should be offloaded to your DigitalOcean Space and your Spaces Sync plugin settings should look as follows:

Sync Cloud Only

We are going to make some minor changes to ensure that our configuration allows us to offload WordPress themes and other directories, beyond the wp-content/uploads Media Library folder.

First, we’re going to modify the Full URL-path to files field so that the Media Library files are served from our Space’s CDN and not locally from the server. This setting essentially rewrites links to Media Library assets, changing them from file links hosted locally on your WordPress server, to file links hosted on the DigitalOcean Spaces CDN.
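
For example, assuming a site hosted at the hypothetical domain https://example.com, a Media Library link such as:

https://example.com/wp-content/uploads/2018/09/photo.jpg

would be rewritten to point at the CDN instead:

https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com/wp-content/uploads/2018/09/photo.jpg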

Recall the Edge endpoint you noted down in the Enabling Spaces CDN step.

In this tutorial, the Space’s name is wordpress-offload and the Space’s CDN endpoint is:

https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com 

Now, in the Spaces Sync plugin settings page, replace the URL in the Full URL-path to files field with your Spaces CDN endpoint, followed by /wp-content/uploads.

In this tutorial, using the above Spaces CDN endpoint, the full URL would be:

https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com/wp-content/uploads 

Next, for the Local path field, enter the full path to the wp-content/uploads directory on your WordPress server. In this tutorial, the path to the WordPress installation on the server is /var/www/html/, so the full path to uploads would be /var/www/html/wp-content/uploads.

Note: If you’re continuing from How To Store WordPress Assets on DigitalOcean Spaces, this guide will slightly modify the path to files in your Space to enable you to optionally offload themes and other wp-content assets. You should clear out your Space before doing this, or alternatively you can transfer existing files into the correct wp-content/uploads Space directory using s3cmd.

In the Storage prefix field, we’re going to enter /wp-content/uploads, which will ensure that we build the correct wp-content directory hierarchy so that we can offload other WordPress directories to this Space.

Filemask can remain wildcarded with *, unless you’d like to exclude certain files.

It’s not necessary to check the Store files only in the cloud and delete… option; only check this box if you’d like to delete the Media Library assets from your server after they’ve been successfully uploaded to your DigitalOcean Space.

Your final settings should look something like this:

Final Spaces Sync Settings

Be sure to replace the above values with the values corresponding to your WordPress installation and Spaces configuration.

Finally, hit Save Changes.

You should see a Settings saved box appear at the top of your screen, confirming that the Spaces Sync plugin settings have successfully been updated.

Future WordPress Media Library uploads should now be synced to your DigitalOcean Space, and served using the Spaces Content Delivery Network.

In this step, we did not offload the WordPress theme or other wp-content assets. To learn how to transfer these assets to Spaces and serve them using the Spaces CDN, skip to Offload Additional Assets.

To verify and test that your Media Library uploads are being delivered from the Spaces CDN, skip to Test CDN Caching.

WordPress Offload Media Plugin

The DeliciousBrains WordPress Offload Media plugin allows you to quickly and automatically upload your Media Library assets to DigitalOcean Spaces and rewrite links to these assets so that you can deliver them directly from Spaces or via the Spaces CDN. In addition, the Assets Pull addon allows you to quickly offload additional WordPress assets like JS, CSS, and font files in combination with a pull CDN. Setting up this addon is beyond the scope of this guide but to learn more you can consult the DeliciousBrains documentation.

We’ll begin by installing and configuring the WP Offload Media plugin for a sample WordPress site.

Installing WP Offload Media Plugin

To begin, you must purchase a copy of the plugin on the DeliciousBrains plugin site. Choose the appropriate version depending on the number of assets in your Media Library, and support and feature requirements for your site.

After going through checkout, you’ll be brought to a post-purchase site with a download link for the plugin and a license key. The download link and license key will also be sent to you at the email address you provided when purchasing the plugin.

Download the plugin and navigate to your WordPress site’s admin interface (https://your_site_url/wp-admin). Log in if necessary. From here, hover over Plugins and click on Add New.

Click Upload Plugin at the top of the page, then Choose File, and select the zip archive you just downloaded.

Click Install Now, and then Activate Plugin. You’ll be brought to WordPress’s plugin admin interface.

From here, navigate to the WP Offload Media plugin’s settings page by clicking Settings under the plugin name.

You’ll be brought to the following screen:

WP Offload Media Configuration

Click the radio button next to DigitalOcean Spaces. You’ll now be prompted to either configure your Spaces Access Key in the wp-config.php file (recommended), or directly in the web interface (the latter will store your Spaces credentials in the WordPress database).

We’ll configure our Spaces Access Key in wp-config.php.

Log in to your WordPress server via the command line, and navigate to your WordPress root directory (in this tutorial, this is /var/www/html). From here, open up wp-config.php in your favorite editor:

  • sudo nano wp-config.php

Scroll down to the line that says /* That's all, stop editing! Happy blogging. */, and before it insert the following lines containing your Spaces Access Key pair (to learn how to generate an access key pair, consult the Spaces product docs):

wp-config.php
. . .

define( 'AS3CF_SETTINGS', serialize( array(
    'provider' => 'do',
    'access-key-id' => 'your_access_key_here',
    'secret-access-key' => 'your_secret_key_here',
) ) );

/* That's all, stop editing! Happy blogging. */
. . .

Once you’re done editing, save and close the file. The changes will take effect immediately.

Back in the WordPress Offload Media plugin admin interface, select the radio button next to Define access keys in wp-config.php and hit Save Changes.

You should be brought to the following interface:

WP Offload Bucket Selection

On this configuration page, select the appropriate region for your Space using the Region dropdown and enter your Space name next to Bucket (in this tutorial, our Space is called wordpress-offload).

Then, hit Save Bucket.

You’ll be brought to the main WP Offload Media configuration page. At the top you should see the following warning box:

WP Offload License

Click on enter your license key, and on the subsequent page enter the license key found in your email receipt or on the checkout page and hit Activate License.

If you entered your license key correctly, you should see License activated successfully.

Now, navigate back to the main WP Offload Media configuration page by clicking on Media Library at the top of the window.

At this point, WP Offload Media has successfully been configured for use with your DigitalOcean Space. You can now begin offloading assets and delivering them using the Spaces CDN.

Configuring WP Offload Media

Now that you’ve linked WP Offload Media with your DigitalOcean Space, you can begin offloading assets and configuring URL rewriting to deliver media from the Spaces CDN.

You should see the following configuration options on the main WP Offload Media configuration page:

WP Offload Main Nav

These defaults should work fine for most use cases. If your Media Library exists at a nonstandard path within your WordPress directory, enter the path in the text box under the Path option.

If you’d like to change asset URLs so that they are served directly from Spaces and not your WordPress server, ensure the toggle is set to On next to Rewrite Media URLs.

To deliver Media Library assets using the Spaces CDN, ensure you’ve enabled the CDN for your Space (see Enabling Spaces CDN to learn how) and have noted down the URL for the Edge endpoint. Hit the toggle next to Custom Domain (CNAME), and in the text box that appears, enter the CDN Edge endpoint URL, without the https:// prefix.

In this guide the Spaces CDN endpoint is:

https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com 

So here we enter:

 wordpress-offload.nyc3.cdn.digitaloceanspaces.com 

To improve security, we’ll force HTTPS for requests to Media Library assets (now served using the CDN) by setting the toggle to On.

You can optionally clear out files that have been offloaded to Spaces from your WordPress server to free up disk space. To do this, hit On next to Remove Files From Server.

Once you’ve finished configuring WP Offload Media, hit Save Changes at the bottom of the page to save your settings.

The URL Preview box should display a URL containing your Spaces CDN endpoint. It should look something like the following:

https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com/wp-content/uploads/2018/09/21211354/photo.jpg

This URL indicates that WP Offload Media has been successfully configured to deliver Media Library assets using the Spaces CDN. If the path doesn’t contain cdn, ensure that you correctly entered the Edge endpoint URL and not the Origin URL.

At this point, WP Offload Media has been set up to deliver your Media Library using Spaces CDN. Any future uploads to your Media Library will be automatically copied over to your DigitalOcean Space and served using the CDN.

You can now bulk offload existing assets in your Media Library using the built-in upload tool.

Offloading Media Library

We’ll use the plugin’s built-in “Upload Tool” to offload existing files in our WordPress Media Library.

On the right-hand side of the main WP Offload Media configuration page, you should see the following box:

WP Offload Upload Tool

Click Offload Now to upload your Media Library files to your DigitalOcean Space.

If the upload procedure gets interrupted, the box will change to display the following:

WP Offload Upload Tool 2

Hit Offload Remaining Now to transfer the remaining files to your DigitalOcean Space.

Once you’ve offloaded the remaining items from your Media Library, you should see the following new boxes:

WP Offload Success

At this point you’ve offloaded your Media Library to your Space and are delivering the files to users using the Spaces CDN.

At any point in time, you can download the files back to your WordPress server from your Space by hitting Download Files.

You can also clear out your DigitalOcean Space by hitting Remove Files. Before doing this, ensure that you’ve first downloaded the files back to your WordPress server from Spaces.

In this step, we learned how to offload our WordPress Media Library to DigitalOcean Spaces and rewrite links to these Library assets using the WP Offload Media plugin.

To offload additional WordPress assets like themes and JavaScript files, you can use the Asset Pull addon or consult the Offload Additional Assets section of this guide.

To verify and test that your Media Library uploads are being delivered from the Spaces CDN, skip to Test CDN Caching.

Media Library Folders Pro and CDN Enabler Plugins

The MaxGalleria Media Library Folders Pro plugin helps you better organize your WordPress Media Library assets. In addition, the free Spaces addon allows you to bulk offload your Media Library assets to DigitalOcean Spaces, and rewrite URLs to those assets to serve them directly from object storage. You can then enable the Spaces CDN and use the Spaces CDN endpoint to serve your library assets from the distributed delivery network. To accomplish this last step, you can use the CDN Enabler plugin to rewrite CDN endpoint URLs for your Media Library assets.

We’ll begin by installing and configuring the Media Library Folders Pro (MLFP) plugin, as well as the MLFP Spaces addon. We’ll then install and configure the CDN Enabler plugin to deliver Media Library assets using the Spaces CDN.

Installing MLFP Plugin

After purchasing the MLFP plugin, you should have received an email containing your MaxGalleria account credentials as well as a plugin download link. Click on the plugin download link to download the MLFP plugin zip archive to your local computer.

Once you’ve downloaded the archive, log in to your WordPress site’s administration interface (https://your_site_url/wp-admin), and navigate to Plugins and then Add New in the left-hand sidebar.

From the Add Plugins page, click Upload Plugin and then select the zip archive you just downloaded.

Click Install Now to complete the plugin installation, and from the Installing Plugin screen, click Activate Plugin to activate MLFP.

You should then see a Media Library Folders Pro menu item appear in the left-hand sidebar. Click it to go to the Media Library Folders Pro interface. Covering the plugin’s various features is beyond the scope of this guide, but to learn more, you can consult the MaxGalleria site and forums.

We’ll now activate the plugin’s license. Click into Settings under the MLFP menu item, and enter your license key in the License Key text box. You can find your MLFP license key in the email sent to you when you purchased the plugin. Hit Save Changes and then Activate License. Next, hit Update Settings.

Your MLFP plugin is now active, and you can use it to organize existing or new Media Library assets for your WordPress site.

We’ll now install and configure the Spaces addon plugin so that you can offload and serve these assets from DigitalOcean Spaces.

Installing MLFP Spaces Addon Plugin and Offload Media Library

To install the Spaces Addon, log in to your MaxGalleria account. You can find your account credentials in an email sent to you when you purchased the MLFP plugin.

Navigate to the Addons page in the top menu bar and scroll down to Media Sources. From here, click into the Media Library Folders Pro S3 and Spaces option.

From this page, scroll down to the Pricing section and select the option that suits the size of your WordPress Media Library (for Media Libraries with 3000 images or less, the addon is free).

After completing the addon “purchase,” you can navigate back to your account page (by clicking the Account link in the top menu bar), from which the addon plugin will now be available.

Click on the Media Library Folders Pro S3 image and the plugin download should begin.

Once the download completes, navigate back to your WordPress administration interface, and install the downloaded plugin using the same method as above, by clicking Upload Plugin. Once again, hit Activate Plugin to activate the plugin.

You will likely receive a warning about configuring access keys in your wp-config.php file. We’ll configure these now.

Log in to your WordPress server using the console or SSH, and navigate to your WordPress root directory (in this tutorial, this is /var/www/html). From here, open up wp-config.php in your favorite editor:

  • sudo nano wp-config.php

Scroll down to the line that says /* That's all, stop editing! Happy blogging. */, and before it insert the following lines containing your Spaces Access Key pair and a plugin configuration option (to learn how to generate an access key pair, consult the Spaces product docs):

wp-config.php
. . .

define( 'MF_AWS_ACCESS_KEY_ID', 'your_access_key_here' );
define( 'MF_AWS_SECRET_ACCESS_KEY', 'your_secret_key_here' );
define( 'MF_CLOUD_TYPE', 'do' );

/* That's all, stop editing! Happy blogging. */
. . .

Once you’re done editing, save and close the file.

Now, navigate to your DigitalOcean Space from the Cloud Control Panel, and create a folder called wp-content by clicking on New Folder.

From here, navigate back to the WordPress administration interface, and click into Media Library Folders Pro and then S3 & Spaces Settings in the sidebar.

The warning banner about configuring access keys should now have disappeared. If it’s still present, you should double check your wp-config.php file for any typos or syntax errors.

In the License Key text box, enter the license key that was emailed to you after purchasing the Spaces addon. Note that this license key is different from the MLFP license key. Hit Save Changes and then Activate License.

Once activated, you should see the following configuration pane:

MLFP Spaces Addon Configuration

From here, click Select Image Bucket & Region to select your DigitalOcean Space. Then select the correct region for your Space and hit Save Bucket Selection.

You’ve now successfully connected the Spaces offload plugin to your DigitalOcean Space. You can begin offloading your WordPress Media Library assets.

The Use files on the cloud server checkbox allows you to specify where Media Library assets will be served from. If you check the box, assets will be served from DigitalOcean Spaces, and URLs to images and other Media Library objects will be correspondingly rewritten. If you plan on using the Spaces CDN to serve your Media Library assets, do not check this box, as the plugin will use the Spaces Origin endpoint and not the CDN Edge endpoint. We will configure CDN link rewriting in a future step.

Click the Remove files from local server box to delete local Media Library assets once they’ve been successfully uploaded to DigitalOcean Spaces.

The Remove individual downloaded files from the cloud server checkbox should be used when bulk downloading files from Spaces to your WordPress server. If checked, these files will be deleted from Spaces after successfully downloading to your WordPress server. We can ignore this option for now.

Since we’re configuring the plugin for use with the Spaces CDN, leave the Use files on the cloud server box unchecked, and hit Copy Media Library to the cloud server to sync your site’s WordPress Media Library to your DigitalOcean Space.

You should see a progress box appear, followed by an Upload complete. message, indicating that the Media Library sync has finished successfully.

Navigate to your DigitalOcean Space to confirm that your Media Library files have been copied to your Space. They should be available in the uploads subdirectory of the wp-content directory you created earlier in this step.

Once your files are available in your Space, you’re ready to move on to configuring the Spaces CDN.

Installing CDN Enabler Plugin to Deliver Assets from Spaces CDN

To use the Spaces CDN to serve your now offloaded files, first ensure that you’ve enabled the CDN for your Space.

Once the CDN has been enabled for your Space, you can install and configure the CDN Enabler WordPress plugin, which rewrites links to your Media Library assets so that they are served from the Spaces CDN endpoint.

To install CDN Enabler, you can either use the Plugins menu from the WordPress administration interface, or install the plugin directly from the command line. We’ll demonstrate the latter procedure here.

First, log in to your WordPress server. Then, navigate to your plugins directory:

  • cd /var/www/html/wp-content/plugins

Be sure to replace the above path with the path to your WordPress installation.

From the command line, use the wp-cli interface to install the plugin:

  • wp plugin install cdn-enabler

Now, activate the plugin:

  • wp plugin activate cdn-enabler

Back in the WordPress Admin Area, under Settings, you should see a new link to CDN Enabler settings. Click into CDN Enabler.

You should see the following settings screen:

CDN Enabler Settings

Modify the displayed fields as follows:

  • CDN URL: Enter the Spaces Edge endpoint, which you can find from the Spaces Dashboard. In this tutorial, this is https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com
  • Included Directories: Enter wp-content/uploads. We’ll learn how to serve other wp-content directories in the Offload Additional Assets section.
  • Exclusions: Leave the default .php
  • Relative Path: Leave the box checked
  • CDN HTTPS: Enable it by checking the box
  • Leave the remaining two fields blank

Then, hit Save Changes to save these settings and enable them for your WordPress site.

At this point you’ve successfully offloaded your WordPress site’s Media Library to DigitalOcean Spaces and are serving them to end users using the CDN.

In this step, we did not offload the WordPress theme or other wp-content assets. To learn how to transfer these assets to Spaces and serve them using the Spaces CDN, skip to Offload Additional Assets.

To verify and test that your Media Library uploads are being delivered from the Spaces CDN, skip to Test CDN Caching.

Offloading Additional Assets (Optional)

In previous sections of this guide, we’ve learned how to offload our site’s WordPress Media Library to Spaces and serve these files using the Spaces CDN. In this section, we’ll cover offloading and serving additional WordPress assets like themes, JavaScript files, and fonts.

Most of these static assets live inside of the wp-content directory (which contains the themes subdirectory). To offload and rewrite URLs for this directory, we’ll use CDN Enabler, an open-source plugin developed by KeyCDN.

If you’re using the WP Offload Media plugin, you can use the Asset Pull addon to serve these files using a pull CDN. Installing and configuring this addon is beyond the scope of this guide. To learn more, consult the DeliciousBrains product page.

First, we’ll install CDN Enabler. We’ll then copy our WordPress themes over to Spaces, and finally configure CDN Enabler to deliver these using the Spaces CDN.

If you’ve already installed CDN Enabler in a previous step, skip to Step 2.

Step 1 — Installing CDN Enabler

To install CDN Enabler, log in to your WordPress server. Then, navigate to your plugins directory:

  • cd /var/www/html/wp-content/plugins

Be sure to replace the above path with the path to your WordPress installation.

From the command line, use the wp-cli interface to install the plugin:

  • wp plugin install cdn-enabler

Now, activate the plugin:

  • wp plugin activate cdn-enabler

Back in the WordPress Admin Area, under Settings, you should see a new link to CDN Enabler settings. Click into CDN Enabler.

You should see the following settings screen:

CDN Enabler Settings

At this point you’ve successfully installed CDN Enabler. We’ll now upload our WordPress themes to Spaces.

Step 2 — Uploading Static WordPress Assets to Spaces

In this tutorial, to demonstrate a basic plugin configuration, we’re only going to serve wp-content/themes, the WordPress directory containing WordPress themes’ PHP, JavaScript, HTML, and image files. You can optionally extend this process to other WordPress directories, like wp-includes, and even the entire wp-content directory.

The theme used by the WordPress installation in this tutorial is twentyseventeen, the default theme for a fresh WordPress installation at the time of writing. You can repeat these steps for any other theme or WordPress content.

First, we’ll upload our theme to our DigitalOcean Space using s3cmd. If you haven’t yet configured s3cmd, consult the DigitalOcean Spaces Product Documentation.

Navigate to your WordPress installation’s wp-content directory:

  • cd /var/www/html/wp-content

From here, upload the themes directory to your DigitalOcean Space using s3cmd. Note that at this point you can choose to upload only a single theme, but for simplicity and to offload as much content as possible from our server, we will upload all the themes in the themes directory to our Space.

We’ll use find to build a list of non-PHP (therefore cacheable) files, which we’ll then pipe to s3cmd to upload to Spaces. We’ll exclude CSS stylesheets as well in this first command as we need to set the text/css MIME type when uploading them.

  • find themes/ -type f -not \( -name '*.php' -or -name '*.css' \) | xargs -I{} s3cmd put --acl-public {} s3://wordpress-offload/wp-content/{}

Here, we instruct find to search for files within the themes/ directory, and ignore .php and .css files. We then use xargs -I{} to iterate over this list, executing s3cmd put for each file, and set the file’s permissions in Spaces to public using --acl-public.

Next, we’ll do the same for CSS stylesheets, adding the --mime-type="text/css" flag to set the text/css MIME type for the stylesheets on Spaces. This will ensure that Spaces serves your theme’s CSS files using the correct Content-Type: text/css HTTP header:

  • find themes/ -type f -name '*.css' | xargs -I{} s3cmd put --acl-public --mime-type="text/css" {} s3://wordpress-offload/wp-content/{}

Again, be sure to replace wordpress-offload in the above command with your Space name.
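
To double-check that the stylesheets were stored with the correct MIME type, you can request one and inspect its Content-Type response header. Below is a minimal sketch using Python’s standard library; the Space name and theme path match this tutorial, but the exact stylesheet URL is an assumption, so substitute a file you actually uploaded:

import urllib.request

# Hypothetical URL of an uploaded stylesheet; adjust to match your Space
url = ('https://wordpress-offload.nyc3.digitaloceanspaces.com'
       '/wp-content/themes/twentyseventeen/style.css')

with urllib.request.urlopen(url) as response:
    # Should print text/css if the MIME type was set correctly
    print(response.headers.get('Content-Type'))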

Now that we’ve uploaded our theme, let’s verify that it can be found at the correct path in our Space. Navigate to your Space using the DigitalOcean Cloud Control Panel.

Enter the wp-content directory, followed by the themes directory. You should see your theme’s directory here. If you don’t, verify your s3cmd configuration and re-upload your theme to your Space.

Now that our theme lives in our Space, and we’ve set the correct metadata, we can begin serving its files using CDN Enabler and the DigitalOcean Spaces CDN.

Navigate back to the WordPress Admin Area and click into Settings and then CDN Enabler.

Here, modify the displayed fields as follows:

  • CDN URL: Enter the Spaces Edge endpoint, as done in Step 1. In this tutorial, this is https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com
  • Included Directories: If you’re not using the MLFP plugin, this should be wp-content/themes. If you are, this should be wp-content/uploads,wp-content/themes
  • Exclusions: Leave the default .php
  • Relative Path: Leave the box checked
  • CDN HTTPS: Enable it by checking the box
  • Leave the remaining two fields blank

Your final settings should look something like this:

CDN Enabler Final Settings

Hit Save Changes to save these settings and enable them for your WordPress site.

At this point you’ve successfully offloaded your WordPress site’s theme assets to DigitalOcean Spaces and are serving them to end users using the CDN. We can confirm this using Chrome’s DevTools, following the procedure described below.

Using the CDN Enabler plugin, you can repeat this process for other WordPress directories, like wp-includes, and even the entire wp-content directory.

Testing CDN Caching

In this section, we’ll demonstrate how to determine where your WordPress assets are being served from (e.g. your host server or the CDN) using Google Chrome’s DevTools.

Step 1 — Adding Sample Image to Media Library to Test Syncing

To begin, we’ll first upload a sample image to our Media Library, and verify that it’s being served from the DigitalOcean Spaces CDN servers. You can upload an image using the WordPress Admin web interface, or using the wp-cli command-line tool. In this guide, we’ll use wp-cli to upload the sample image.

Log in to your WordPress server using the command line, and navigate to the home directory for the non-root user you’ve configured. In this tutorial, we’ll use the user sammy.

  • cd

From here, use curl to download the DigitalOcean logo to your Droplet (if you already have an image you’d like to test with, skip this step):

  • curl https://assets.digitalocean.com/logos/DO_Logo_horizontal_blue.png > do_logo.png

Now, use wp-cli to import the image to your Media Library:

  • wp media import --path=/var/www/html/ /home/sammy/do_logo.png

Be sure to replace /var/www/html with the correct path to the directory containing your WordPress files.

You may see some warnings, but the output should end in the following:

Output
Imported file '/home/sammy/do_logo.png' as attachment ID 10.
Success: Imported 1 of 1 items.

This output indicates that our test image has been successfully copied to the WordPress Media Library, and also uploaded to our DigitalOcean Space by your preferred offload plugin.

Navigate to your DigitalOcean Space to confirm:

Spaces Upload Success

This indicates that your offload plugin is functioning as expected and automatically syncing WordPress uploads to your DigitalOcean Space. Note that the exact path to your Media Library uploads in the Space will depend on the plugin you’re using to offload your WordPress files.
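
If you’d rather confirm the sync from code than from the Control Panel, you can list the Space’s contents with any S3-compatible client. Here is a minimal sketch using the boto3 library (an assumption; the tutorial itself uses s3cmd), with the region and Space name from this tutorial and placeholder credentials:

import boto3

# Placeholder credentials; substitute your own Spaces access key pair
session = boto3.session.Session()
client = session.client(
    's3',
    region_name='nyc3',
    endpoint_url='https://nyc3.digitaloceanspaces.com',
    aws_access_key_id='your_access_key_here',
    aws_secret_access_key='your_secret_key_here',
)

# List objects under the wp-content prefix; the exact upload path
# depends on which offload plugin you used
response = client.list_objects_v2(Bucket='wordpress-offload', Prefix='wp-content/')
for obj in response.get('Contents', []):
    print(obj['Key'])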

Next, we will verify that this file is being served using the Spaces CDN, and not from the server running WordPress.

Step 2 — Inspecting Asset URL

From the WordPress admin area (https://your_domain/wp-admin), navigate to Pages in the left-hand side navigation menu.

We will create a sample page containing our uploaded image to determine where it’s being served from. You can also run this test by adding the image to an existing page on your WordPress site.

From the Pages screen, click into Sample Page, or any existing page. You can alternatively create a new page.

In the page editor, click on Add Media, and select the DigitalOcean logo (or other image you used to test this procedure).

An Attachment Details pane should appear on the right-hand side of your screen. From this pane, add the image to the page by clicking on Insert into page.

Now, back in the page editor, click on either Publish (if you created a new sample page) or Update (if you added the image to an existing page) in the Publish box on the right-hand side of your screen.

Now that the page has successfully been updated to contain the image, navigate to it by clicking on the Permalink under the page title. You’ll be brought to this page in your web browser.

For the purposes of this tutorial, the following steps will assume that you’re using Google Chrome, but you can use most modern web browsers to run a similar test.

From the rendered page preview in your browser, right click on the image and click on Inspect:

Inspect Menu

A DevTools window should pop up, highlighting the img asset in the page’s HTML:

DevTools Output

You should see the CDN endpoint for your DigitalOcean Space in this URL (in this tutorial, our Spaces CDN endpoint is https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com), indicating that the image asset is being served from the DigitalOcean Spaces CDN edge cache.

This confirms that your Media Library uploads are being synced to your DigitalOcean Space and served using the Spaces CDN.

Step 3 — Inspecting Asset Response Headers

From the DevTools window, we’ll run one final test. Click on Network in the toolbar at the top of the window.

Once in the blank Network window, follow the displayed instructions to reload the page.

The page assets should populate in the window. Locate your test image in the list of page assets:

Chrome DevTools Asset List

Once you’ve located your test image, click into it to open an additional information pane. Within this pane, click on Headers to show the response headers for this asset:

Response Headers

You should see the Cache-Control HTTP header in the response. Although origin servers can also set this header, its presence here, combined with the CDN endpoint in the asset URL, confirms that this image was served from the Spaces CDN.
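
You can run the same check without a browser by fetching the asset directly from the CDN and printing its response headers. The following is a minimal sketch using Python’s standard library; the upload path shown is hypothetical and depends on which offload plugin you used:

import urllib.request

# Substitute the CDN URL of an image you actually uploaded
url = ('https://wordpress-offload.nyc3.cdn.digitaloceanspaces.com'
       '/wp-content/uploads/do_logo.png')

with urllib.request.urlopen(url) as response:
    # Print the caching-related response header set by the CDN
    print(response.headers.get('Cache-Control'))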

Step 4 — Inspecting URLs for Theme Assets (Optional)

If you offloaded your themes (or other wp-content) directories as described in Offload Additional Assets, you should perform the following brief check to verify that your theme’s assets are being served from the Spaces CDN.

Navigate to your WordPress site in Google Chrome, and right-click anywhere in the page. In the menu that appears, click on Inspect.

You’ll once again be brought to the Chrome DevTools interface.

Chrome DevTools Interface

From here, click into Sources.

In the left-hand pane, you should see a list of your WordPress site’s assets. Scroll down to your CDN endpoint, and expand the list by clicking the small arrow next to the endpoint name:

DevTools Site Asset List

Observe that your WordPress theme’s header image, JavaScript, and CSS stylesheet are now being served from the Spaces CDN.

Conclusion

In this tutorial, we’ve shown how to offload static content from your WordPress server to DigitalOcean Spaces, and serve this content using the Spaces CDN. In most cases, this should reduce bandwidth on your host infrastructure and speed up page loads for end users, especially those located further away geographically from your WordPress server.

We demonstrated how to offload and serve both Media Library and theme assets using the Spaces CDN, but these steps can be extended to offload the entire wp-content directory, as well as wp-includes.

Implementing a CDN to deliver static assets is just one way to optimize your WordPress installation. Other plugins like W3 Total Cache can further speed up page loads and improve the SEO of your site. A helpful tool to measure your page load speed and improve it is Google’s PageSpeed Insights. Another helpful tool that provides a waterfall breakdown of request and response times as well as suggested optimizations is Pingdom.

To learn more about Content Delivery Networks and how they work, consult Using a CDN to Speed Up Static Content Delivery.

Stack Abuse: Handling Unix Signals in Python

UNIX/Linux systems offer special mechanisms for communication between individual processes. One of these mechanisms is signals, which belong to the different methods of Inter Process Communication (IPC).

In short, signals are software interrupts that are sent to the program (or the process) to notify the program of significant events or requests to the program in order to run a special code sequence. A program that receives a signal either stops or continues the execution of its instructions, terminates either with or without a memory dump, or even simply ignores the signal.

Although it is defined in the POSIX standard, the reaction actually depends on how the developer wrote the script and implemented the handling of signals.

In this article we explain what signals are, show you how to send a signal to another process from the command line, and how to process received signals. Among other modules, the program code is mainly based on the signal module. This module connects the corresponding C headers of your operating system with the Python world.

An Introduction to Signals

On UNIX-based systems, there are three categories of signals:

  • System signals (hardware and system errors): SIGILL, SIGTRAP, SIGBUS, SIGFPE, SIGKILL, SIGSEGV, SIGXCPU, SIGXFSZ, SIGIO

  • Device signals: SIGHUP, SIGINT, SIGPIPE, SIGALRM, SIGCHLD, SIGCONT, SIGSTOP, SIGTTIN, SIGTTOU, SIGURG, SIGWINCH, SIGIO

  • User-defined signals: SIGQUIT, SIGABRT, SIGUSR1, SIGUSR2, SIGTERM

Each signal is represented by an integer value, and the list of available signals is comparatively long and not consistent between the different UNIX/Linux variants. On a Debian GNU/Linux system, the command kill -l displays the list of signals as follows:

$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
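
You can also enumerate the signals available on your platform from Python itself. The following is a minimal sketch, assuming Python 3.8 or newer (signal.valid_signals() does not exist in older releases):

import signal

# Print each valid signal's number and name, similar to kill -l
for sig in sorted(signal.valid_signals()):
    print(int(sig), getattr(sig, 'name', 'unnamed'))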

The signals 1 to 15 are roughly standardized, and have the following meaning on most of the Linux systems:

  • 1 (SIGHUP): terminate a connection, or reload the configuration for daemons
  • 2 (SIGINT): interrupt the session from the dialogue station
  • 3 (SIGQUIT): terminate the session from the dialogue station
  • 4 (SIGILL): illegal instruction was executed
  • 5 (SIGTRAP): do a single instruction (trap)
  • 6 (SIGABRT): abnormal termination
  • 7 (SIGBUS): error on the system bus
  • 8 (SIGFPE): floating point error
  • 9 (SIGKILL): immediately terminate the process
  • 10 (SIGUSR1): user-defined signal
  • 11 (SIGSEGV): segmentation fault due to illegal access of a memory segment
  • 12 (SIGUSR2): user-defined signal
  • 13 (SIGPIPE): writing into a pipe, and nobody is reading from it
  • 14 (SIGALRM): the timer terminated (alarm)
  • 15 (SIGTERM): terminate the process in a soft way

In order to send a signal to a process in a Linux terminal you invoke the kill command with both the signal number (or signal name) from the list above and the id of the process (pid). The following example command sends the signal 15 (SIGTERM) to the process that has the pid 12345:

$ kill -15 12345

An equivalent way is to use the signal name instead of its number:

$ kill -SIGTERM 12345

Which way you choose depends on what is more convenient for you; both have the same effect. As a result the process receives the SIGTERM signal and, unless it catches or ignores the signal, terminates.
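
You can send signals from Python as well, which is useful when one process needs to signal another programmatically. A minimal sketch using the standard library, with a hypothetical target pid:

import os
import signal

# Send SIGTERM to the process with pid 12345 (substitute a real pid)
os.kill(12345, signal.SIGTERM)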

Using the Python signal Library

Since Python 1.4, the signal library has been a regular component of every Python release. To use it, first import the library into your Python program:

import signal   

Capturing and reacting properly on a received signal is done by a callback function – a so-called signal handler. A rather simple signal handler named receiveSignal() can be written as follows:

def receiveSignal(signalNumber, frame):
    print('Received:', signalNumber)
    return

This signal handler does nothing more than report the number of the received signal. The next step is registering the signals that are caught by the signal handler. For Python programs, all signals except SIGKILL (9) and SIGSTOP (19) can be caught in your script:

if __name__ == '__main__':
    # register the signals to be caught
    signal.signal(signal.SIGHUP, receiveSignal)
    signal.signal(signal.SIGINT, receiveSignal)
    signal.signal(signal.SIGQUIT, receiveSignal)
    signal.signal(signal.SIGILL, receiveSignal)
    signal.signal(signal.SIGTRAP, receiveSignal)
    signal.signal(signal.SIGABRT, receiveSignal)
    signal.signal(signal.SIGBUS, receiveSignal)
    signal.signal(signal.SIGFPE, receiveSignal)
    # signal.signal(signal.SIGKILL, receiveSignal)  # SIGKILL cannot be caught
    signal.signal(signal.SIGUSR1, receiveSignal)
    signal.signal(signal.SIGSEGV, receiveSignal)
    signal.signal(signal.SIGUSR2, receiveSignal)
    signal.signal(signal.SIGPIPE, receiveSignal)
    signal.signal(signal.SIGALRM, receiveSignal)
    signal.signal(signal.SIGTERM, receiveSignal)

Next, we add the process information for the current process, and detect the process id using the method getpid() from the os module. In an endless while loop we wait for incoming signals. We implement this using two more Python modules, os and time, which we import at the beginning of our Python script, too:

import os
import time

In the while loop of our main program the print statement outputs “Waiting…”. The time.sleep() function call makes the program wait for three seconds.

    # output current process id
    print('My PID is:', os.getpid())

    # wait in an endless loop for signals
    while True:
        print('Waiting...')
        time.sleep(3)

Finally, we have to test our script. Having saved the script as signal-handling.py we can invoke it in a terminal as follows:

$ python3 signal-handling.py
My PID is: 5746
Waiting...
...

In a second terminal window we send a signal to the process. We identify our first process – the Python script – by the process id as printed on screen, above.

$ kill -1 5746

The signal event handler in our Python program receives the signal we have sent to the process. It reacts accordingly, and simply confirms the received signal:

...
Received: 1
...

Ignoring Signals

The signal module defines ways to ignore received signals. In order to do that the signal has to be connected with the predefined function signal.SIG_IGN. The example below demonstrates that, and as a result the Python program cannot be interrupted by CTRL+C anymore. To stop the Python script an alternative way has been implemented in the example script – the signal SIGUSR1 terminates the Python script. Furthermore, instead of an endless loop we use the method signal.pause(). It just waits for a signal to be received.

import signal
import os
import time

def receiveSignal(signalNumber, frame):
    print('Received:', signalNumber)
    raise SystemExit('Exiting')

if __name__ == '__main__':
    # register the signal to be caught
    signal.signal(signal.SIGUSR1, receiveSignal)

    # register the signal to be ignored
    signal.signal(signal.SIGINT, signal.SIG_IGN)

    # output current process id
    print('My PID is:', os.getpid())

    signal.pause()

Handling Signals Properly

The signal handler we have used up to now is rather simple, and just reports a received signal. This shows us that the interface of our Python script is working fine. Let’s improve it.

Catching the signal is already a good basis, but some improvement is needed to comply with the rules of the POSIX standard. Each signal calls for a proper reaction (see the list above), which means the signal handler in our Python script needs a specific routine per signal. This works best if we understand what a signal does, and what a common reaction is. A process that receives signal 1, 2, 9, or 15 is expected to terminate; for several others, such as SIGQUIT, it is expected to write a core dump as well.

Up to now we have implemented a single routine that covers all the signals, and handles them in the same way. The next step is to implement an individual routine per signal. The following example code demonstrates this for the signals 1 (SIGHUP) and 15 (SIGTERM).

def readConfiguration(signalNumber, frame):
    print('(SIGHUP) reading configuration')
    return

def terminateProcess(signalNumber, frame):
    print('(SIGTERM) terminating the process')
    sys.exit()

The two functions above are connected with the signals as follows:

    signal.signal(signal.SIGHUP, readConfiguration)
    signal.signal(signal.SIGTERM, terminateProcess)

Running the Python script, and sending it signal 1 (SIGHUP) followed by signal 15 (SIGTERM) with the UNIX commands kill -1 16640 and kill -15 16640, results in the following output:

$ python3 daemon.py
My PID is: 16640
Waiting...
Waiting...
(SIGHUP) reading configuration
Waiting...
Waiting...
(SIGTERM) terminating the process

The script receives the signals, and handles them properly. For clarity, this is the entire script:

import signal
import os
import time
import sys

def readConfiguration(signalNumber, frame):
    print('(SIGHUP) reading configuration')
    return

def terminateProcess(signalNumber, frame):
    print('(SIGTERM) terminating the process')
    sys.exit()

def receiveSignal(signalNumber, frame):
    print('Received:', signalNumber)
    return

if __name__ == '__main__':
    # register the signals to be caught
    signal.signal(signal.SIGHUP, readConfiguration)
    signal.signal(signal.SIGINT, receiveSignal)
    signal.signal(signal.SIGQUIT, receiveSignal)
    signal.signal(signal.SIGILL, receiveSignal)
    signal.signal(signal.SIGTRAP, receiveSignal)
    signal.signal(signal.SIGABRT, receiveSignal)
    signal.signal(signal.SIGBUS, receiveSignal)
    signal.signal(signal.SIGFPE, receiveSignal)
    # signal.signal(signal.SIGKILL, receiveSignal)  # SIGKILL cannot be caught
    signal.signal(signal.SIGUSR1, receiveSignal)
    signal.signal(signal.SIGSEGV, receiveSignal)
    signal.signal(signal.SIGUSR2, receiveSignal)
    signal.signal(signal.SIGPIPE, receiveSignal)
    signal.signal(signal.SIGALRM, receiveSignal)
    signal.signal(signal.SIGTERM, terminateProcess)

    # output current process id
    print('My PID is:', os.getpid())

    # wait in an endless loop for signals
    while True:
        print('Waiting...')
        time.sleep(3)

Further Reading

Using the signal module and a corresponding event handler, it is relatively easy to catch signals. Knowing the meaning of the different signals, and reacting properly as defined in the POSIX standard, is the next step. It requires that the event handler distinguishes between the different signals, and has a separate routine for each of them.

How To Build a Neural Network to Recognize Handwritten Digits with TensorFlow

Introduction

Neural networks are used as a method of deep learning, one of the many subfields of artificial intelligence. They were first proposed around 70 years ago as an attempt at simulating the way the human brain works, though in a much more simplified form. Individual ‘neurons’ are connected in layers, with weights assigned to determine how the neuron responds when signals are propagated through the network. Previously, neural networks were limited in the number of neurons they were able to simulate, and therefore the complexity of learning they could achieve. But in recent years, due to advancements in hardware development, we have been able to build very deep networks, and train them on enormous datasets to achieve breakthroughs in machine intelligence.

These breakthroughs have allowed machines to match and exceed the capabilities of humans at performing certain tasks. One such task is object recognition. Though machines have historically been unable to match human vision, recent advances in deep learning have made it possible to build neural networks which can recognize objects, faces, text, and even emotions.

In this tutorial, you will implement a small subsection of object recognition—digit recognition. Using TensorFlow, an open-source Python library developed by the Google Brain labs for deep learning research, you will take hand-drawn images of the numbers 0-9 and build and train a neural network to recognize and predict the correct label for the digit displayed.

While you won’t need prior experience in practical deep learning or TensorFlow to follow along with this tutorial, we’ll assume some familiarity with machine learning terms and concepts such as training and testing, features and labels, optimization, and evaluation. You can learn more about these concepts in An Introduction to Machine Learning.

Prerequisites

To complete this tutorial, you’ll need:

  • A local or remote Python 3 development environment that includes pip for installing Python packages and venv for creating virtual environments.

Step 1 — Configuring the Project

Before you can develop the recognition program, you’ll need to install a few dependencies and create a workspace to hold your files.

We’ll use a Python 3 virtual environment to manage our project’s dependencies. Create a new directory for your project and navigate to the new directory:

  • mkdir tensorflow-demo
  • cd tensorflow-demo

Execute the following commands to set up the virtual environment for this tutorial:

  • python3 -m venv tensorflow-demo
  • source tensorflow-demo/bin/activate

Next, install the libraries you’ll use in this tutorial. We’ll use specific versions of these libraries by creating a requirements.txt file in the project directory which specifies the requirement and the version we need. Create the requirements.txt file:

  • touch requirements.txt

Open the file in your text editor and add the following lines to specify the Image, NumPy, and TensorFlow libraries and their versions:

requirements.txt
image==1.5.20
numpy==1.14.3
tensorflow==1.4.0

Save the file and exit the editor. Then install these libraries with the following command:

  • pip install -r requirements.txt

With the dependencies installed, we can start working on our project.

Step 2 — Importing the MNIST Dataset

The dataset we will be using in this tutorial is called the MNIST dataset, and it is a classic in the machine learning community. This dataset is made up of images of handwritten digits, 28×28 pixels in size. Here are some examples of the digits included in the dataset:

Examples of MNIST images

Let’s create a Python program to work with this dataset. We will use one file for all of our work in this tutorial. Create a new file called main.py:

  • touch main.py

Now open this file in your text editor of choice and add this line of code to the file to import the TensorFlow library:

main.py
import tensorflow as tf 

Add the following lines of code to your file to import the MNIST dataset and store the image data in the variable mnist:

main.py
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  # y labels are one-hot-encoded

When reading in the data, we are using one-hot-encoding to represent the labels (the actual digit drawn, e.g. “3”) of the images. One-hot-encoding uses a vector of binary values to represent numeric or categorical values. As our labels are for the digits 0-9, the vector contains ten values, one for each possible digit. One of these values is set to 1, to represent the digit at that index of the vector, and the rest are set to 0. For example, the digit 3 is represented using the vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]. As the value at index 3 is stored as 1, the vector therefore represents the digit 3.
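
To make the encoding concrete, here is a small illustrative sketch (separate from the tutorial’s main.py) that builds a one-hot vector for a digit with NumPy:

import numpy as np

# Build a one-hot vector for the digit 3 out of 10 possible classes
digit = 3
vector = np.zeros(10)
vector[digit] = 1
print(vector)  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]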

To represent the actual images themselves, the 28×28 pixels are flattened into a 1D vector which is 784 pixels in size. Each of the 784 pixels making up the image is stored as a value between 0 and 255. This determines the grayscale of the pixel, as our images are presented in black and white only. So a black pixel is represented by 255, and a white pixel by 0, with the various shades of gray somewhere in between.
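
Likewise, here is an illustrative sketch of flattening a 28×28 image into a 784-pixel vector, the transformation the dataset loader performs for us:

import numpy as np

# A hypothetical 28x28 grayscale image with pixel values between 0 and 255
image = np.random.randint(0, 256, size=(28, 28))

flat = image.reshape(784)  # flatten into a 1D vector of 784 pixels
print(flat.shape)          # (784,)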

We can use the mnist variable to find out the size of the dataset we have just imported. Looking at the num_examples for each of the three subsets, we can determine that the dataset has been split into 55,000 images for training, 5000 for validation, and 10,000 for testing. Add the following lines to your file:

main.py
n_train = mnist.train.num_examples  # 55,000
n_validation = mnist.validation.num_examples  # 5000
n_test = mnist.test.num_examples  # 10,000

Now that we have our data imported, it’s time to think about the neural network.

Step 3 — Defining the Neural Network Architecture

The architecture of the neural network refers to elements such as the number of layers in the network, the number of units in each layer, and how the units are connected between layers. As neural networks are loosely inspired by the workings of the human brain, here the term unit is used to represent what we would biologically think of as a neuron. Like neurons passing signals around the brain, units take some values from previous units as input, perform a computation, and then pass on the new value as output to other units. These units are layered to form the network, starting at a minimum with one layer for inputting values, and one layer to output values. The term hidden layer is used for all of the layers in between the input and output layers, i.e. those “hidden” from the real world.

Different architectures can yield drastically different results, as the performance can be thought of as a function of the architecture among other things, such as the parameters, the data, and the duration of training.

Add the following lines of code to your file to store the number of units per layer in global variables. This allows us to alter the network architecture in one place, and at the end of the tutorial you can test for yourself how different numbers of layers and units will impact the results of our model:

main.py
n_input = 784    # input layer (28x28 pixels)
n_hidden1 = 512  # 1st hidden layer
n_hidden2 = 256  # 2nd hidden layer
n_hidden3 = 128  # 3rd hidden layer
n_output = 10    # output layer (0-9 digits)

The following diagram shows a visualization of the architecture we’ve designed, with each layer fully connected to the surrounding layers:

Diagram of a neural network

The term “deep neural network” relates to the number of hidden layers, with “shallow” usually meaning just one hidden layer, and “deep” referring to multiple hidden layers. Given enough training data, a shallow neural network with a sufficient number of units should theoretically be able to represent any function that a deep neural network can. But it is often more computationally efficient to use a smaller deep neural network to achieve the same task that would require a shallow network with exponentially more hidden units. Shallow neural networks also often encounter overfitting, where the network essentially memorizes the training data that it has seen, and is not able to generalize the knowledge to new data. This is why deep neural networks are more commonly used: the multiple layers between the raw input data and the output label allow the network to learn features at various levels of abstraction, making the network itself better able to generalize.

Other elements of the neural network that need to be defined here are the hyperparameters. Unlike the parameters that will get updated during training, these values are set initially and remain constant throughout the process. In your file, set the following variables and values:

main.py
learning_rate = 1e-4
n_iterations = 1000
batch_size = 128
dropout = 0.5

The learning rate represents how much the parameters will adjust at each step of the learning process. These adjustments are a key component of training: after each pass through the network we tune the weights slightly to try and reduce the loss. Larger learning rates can converge faster, but also have the potential to overshoot the optimal values as they are updated. The number of iterations refers to how many times we go through the training step, and the batch size refers to how many training examples we are using at each step. The dropout variable represents a threshold at which we eliminate some units at random. We will be using dropout in our final hidden layer to give each unit a 50% chance of being eliminated at every training step. This helps prevent overfitting.

We have now defined the architecture of our neural network, and the hyperparameters that impact the learning process. The next step is to build the network as a TensorFlow graph.

Step 4 — Building the TensorFlow Graph

To build our network, we will set up the network as a computational graph for TensorFlow to execute. The core concept of TensorFlow is the tensor, a data structure similar to an array or list. Tensors are initialized, manipulated as they are passed through the graph, and updated through the learning process.

We’ll start by defining three tensors as placeholders, which are tensors that we’ll feed values into later. Add the following to your file:

main.py
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
keep_prob = tf.placeholder(tf.float32)

The only parameter that needs to be specified at its declaration is the size of the data we will be feeding in. For X we use a shape of [None, 784], where None represents any amount, as we will be feeding in an undefined number of 784-pixel images. The shape of Y is [None, 10] as we will be using it for an undefined number of label outputs, with 10 possible classes. The keep_prob tensor is used to control the dropout rate, and we initialize it as a placeholder rather than an immutable variable because we want to use the same tensor both for training (when dropout is set to 0.5) and testing (when dropout is set to 1.0).

The parameters that the network will update in the training process are the weight and bias values, so for these we need to set an initial value rather than an empty placeholder. These values are essentially where the network does its learning, as they are used in the activation functions of the neurons, representing the strength of the connections between units.

Since the values are optimized during training, we could set them to zero for now. But the initial value actually has a significant impact on the final accuracy of the model. We’ll use random values from a truncated normal distribution for the weights. We want them to be close to zero, so they can adjust in either a positive or negative direction, and slightly different, so they generate different errors. This will ensure that the model learns something useful. Add these lines:

main.py
weights = {
    'w1': tf.Variable(tf.truncated_normal([n_input, n_hidden1], stddev=0.1)),
    'w2': tf.Variable(tf.truncated_normal([n_hidden1, n_hidden2], stddev=0.1)),
    'w3': tf.Variable(tf.truncated_normal([n_hidden2, n_hidden3], stddev=0.1)),
    'out': tf.Variable(tf.truncated_normal([n_hidden3, n_output], stddev=0.1)),
}

For the bias, we use a small constant value to ensure that the tensors activate in the initial stages and therefore contribute to the propagation. The weights and bias tensors are stored in dictionary objects for ease of access. Add this code to your file to define the biases:

main.py
biases = {
    'b1': tf.Variable(tf.constant(0.1, shape=[n_hidden1])),
    'b2': tf.Variable(tf.constant(0.1, shape=[n_hidden2])),
    'b3': tf.Variable(tf.constant(0.1, shape=[n_hidden3])),
    'out': tf.Variable(tf.constant(0.1, shape=[n_output]))
}

Next, set up the layers of the network by defining the operations that will manipulate the tensors. Add these lines to your file:

main.py
layer_1 = tf.add(tf.matmul(X, weights['w1']), biases['b1'])
layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
layer_3 = tf.add(tf.matmul(layer_2, weights['w3']), biases['b3'])
layer_drop = tf.nn.dropout(layer_3, keep_prob)
output_layer = tf.matmul(layer_drop, weights['out']) + biases['out']

Each hidden layer will execute matrix multiplication on the previous layer’s outputs and the current layer’s weights, and add the bias to these values. At the last hidden layer, we will apply a dropout operation using our keep_prob value of 0.5.

The final step in building the graph is to define the loss function that we want to optimize. A popular choice of loss function in TensorFlow programs is cross-entropy, also known as log-loss, which quantifies the difference between two probability distributions (the predictions and the labels). A perfect classification would result in a cross-entropy of 0, with the loss completely minimized.
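To make this concrete, here is a standalone sketch (with made-up numbers) of the cross-entropy calculation for a single prediction against a one-hot label:

import numpy as np

label = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0])  # true digit: 3
prediction = np.array([0.01, 0.01, 0.04, 0.9, 0.01,
                       0.01, 0.005, 0.005, 0.005, 0.005])  # softmax output

# cross-entropy: -sum(label * log(prediction)); with a one-hot label,
# only the predicted probability of the true class contributes
loss = -np.sum(label * np.log(prediction))
print(loss)  # ~0.105; a perfect prediction of 1.0 at index 3 would give 0.0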

We also need to choose the optimization algorithm which will be used to minimize the loss function. A process named gradient descent optimization is a common method for finding the (local) minimum of a function by taking iterative steps along the gradient in a negative (descending) direction. There are several choices of gradient descent optimization algorithms already implemented in TensorFlow, and in this tutorial we will be using the Adam optimizer. This extends upon gradient descent optimization by using momentum to speed up the process through computing an exponentially weighted average of the gradients and using that in the adjustments. Add the following code to your file:

main.py
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        labels=Y, logits=output_layer
        ))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

We’ve now defined the network and built it out with TensorFlow. The next step is to feed data through the graph to train it, and then test that it has actually learnt something.

Step 5 — Training and Testing

The training process involves feeding the training dataset through the graph and optimizing the loss function. Every time the network iterates through a batch of more training images, it updates the parameters to reduce the loss in order to more accurately predict the digits shown. The testing process involves running our testing dataset through the trained graph, and keeping track of the number of images that are correctly predicted, so that we can calculate the accuracy.

Before starting the training process, we will define our method of evaluating the accuracy so we can print it out on mini-batches of data while we train. These printed statements will allow us to check that from the first iteration to the last, loss decreases and accuracy increases; they will also allow us to track whether or not we have run enough iterations to reach a consistent and optimal result:

main.py
correct_pred = tf.equal(tf.argmax(output_layer, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

In correct_pred, we use the argmax function to compare which images are being predicted correctly by looking at the output_layer (predictions) and Y (labels), and we use the equal function to return this as a list of [Booleans](https://www.digitalocean.com/community/tutorials/understanding-data-types-in-python-3#booleans). We can then cast this list to floats and calculate the mean to get a total accuracy score.
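The same logic can be sketched in plain NumPy to see what these operations do (illustrative values only):

import numpy as np

predicted_digits = np.array([2, 0, 9, 1])     # argmax of each prediction row
true_digits = np.array([2, 0, 4, 1])          # argmax of each label row

correct = predicted_digits == true_digits     # [ True  True False  True]
accuracy = correct.astype(np.float32).mean()  # cast Booleans to floats, average
print(accuracy)                               # 0.75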

We are now ready to initialize a session for running the graph. In this session we will feed the network with our training examples, and once trained, we feed the same graph with new test examples to determine the accuracy of the model. Add the following lines of code to your file:

main.py
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

The essence of the training process in deep learning is to optimize the loss function. Here we are aiming to minimize the difference between the predicted labels of the images, and the true labels of the images. The process involves four steps which are repeated for a set number of iterations:

  • Propagate values forward through the network
  • Compute the loss
  • Propagate values backward through the network
  • Update the parameters

At each training step, the parameters are adjusted slightly to try and reduce the loss for the next step. As the learning progresses, we should see a reduction in loss, and eventually we can stop training and use the network as a model for testing our new data.

Add this code to the file:

main.py
# train on mini batches
for i in range(n_iterations):
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    sess.run(train_step, feed_dict={X: batch_x, Y: batch_y, keep_prob: dropout})

    # print loss and accuracy (per minibatch)
    if i % 100 == 0:
        minibatch_loss, minibatch_accuracy = sess.run(
            [cross_entropy, accuracy],
            feed_dict={X: batch_x, Y: batch_y, keep_prob: 1.0}
            )
        print("Iteration", str(i), "\t| Loss =", str(minibatch_loss), "\t| Accuracy =", str(minibatch_accuracy))

Every 100 iterations of the training loop, in which we feed a mini-batch of images through the network, we print out the loss and accuracy of that batch. Note that we should not be expecting a steadily decreasing loss and increasing accuracy here, as the values are per batch, not for the entire model. We use mini-batches of images rather than feeding them through individually to speed up the training process and allow the network to see a number of different examples before updating the parameters.

Once the training is complete, we can run the session on the test images. This time we are using a keep_prob dropout rate of 1.0 to ensure all units are active in the testing process.

Add this code to the file:

main.py
test_accuracy = sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1.0})
print("\nAccuracy on test set:", test_accuracy)

It’s now time to run our program and see how accurately our neural network can recognize these handwritten digits. Save the main.py file and execute the following command in the terminal to run the script:

  • python3 main.py

You’ll see an output similar to the following, although individual loss and accuracy results may vary slightly:

Output
Iteration 0     | Loss = 3.67079    | Accuracy = 0.140625
Iteration 100   | Loss = 0.492122   | Accuracy = 0.84375
Iteration 200   | Loss = 0.421595   | Accuracy = 0.882812
Iteration 300   | Loss = 0.307726   | Accuracy = 0.921875
Iteration 400   | Loss = 0.392948   | Accuracy = 0.882812
Iteration 500   | Loss = 0.371461   | Accuracy = 0.90625
Iteration 600   | Loss = 0.378425   | Accuracy = 0.882812
Iteration 700   | Loss = 0.338605   | Accuracy = 0.914062
Iteration 800   | Loss = 0.379697   | Accuracy = 0.875
Iteration 900   | Loss = 0.444303   | Accuracy = 0.90625

Accuracy on test set: 0.9206

To try and improve the accuracy of our model, or to learn more about the impact of tuning hyperparameters, we can test the effect of changing the learning rate, the dropout threshold, the batch size, and the number of iterations. We can also change the number of units in our hidden layers, and change the amount of hidden layers themselves, to see how different architectures increase or decrease the model accuracy.

To demonstrate that the network is actually recognizing the hand-drawn images, let’s test it on a single image of our own.

First either download this sample test image or open up a graphics editor and create your own 28×28 pixel image of a digit.

Open the main.py file in your editor and add the following lines of code to the top of the file to import two libraries necessary for image manipulation.

main.py
import numpy as np
from PIL import Image
...

Then at the end of the file, add the following line of code to load the test image of the handwritten digit:

main.py
img = np.invert(Image.open("test_img.png").convert('L')).ravel()  

The open function of the Image library loads the test image as a four-channel (RGBA) array: the three RGB color channels plus the alpha transparency channel. This is not the same representation we used previously when reading in the dataset with TensorFlow, so we’ll need to do some extra work to match the format.

First, we use the convert function with the L parameter to reduce the four-channel RGBA representation to a single grayscale color channel. We store this as a numpy array and invert it using np.invert, because the current matrix represents black as 0 and white as 255, whereas we need the opposite. Finally, we call ravel to flatten the array.
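If you want to sanity-check the result before running the prediction, you can temporarily add a couple of print statements to main.py after the img line; the shape should match the n_input placeholder defined earlier:

print(img.shape)             # (784,)
print(img.min(), img.max())  # pixel values should fall in the 0-255 range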

Now that the image data is structured correctly, we can run a session in the same way as previously, but this time only feeding in the single image for testing. Add the following code to your file to test the image and print the outputted label.

main.py
prediction = sess.run(tf.argmax(output_layer, 1), feed_dict={X: [img]})
print("Prediction for test image:", np.squeeze(prediction))

The np.squeeze function is called on the prediction to return the single integer from the array (i.e. to go from [2] to 2). The resulting output demonstrates that the network has recognized this image as the digit 2.

Output
Prediction for test image: 2

You can try testing the network with more complex images (digits that look like other digits, for example, or digits that have been drawn poorly or incorrectly) to see how well it fares.

Conclusion

In this tutorial you successfully trained a neural network to classify the MNIST dataset with around 92% accuracy and tested it on an image of your own. Current state-of-the-art research achieves around 99% on this same problem, using more complex network architectures involving convolutional layers. These use the 2D structure of the image to better represent the contents, unlike our method which flattened all the pixels into one vector of 784 units. You can read more about this topic on the TensorFlow website, and see the research papers detailing the most accurate results on the MNIST website.

Now that you know how to build and train a neural network, you can try and use this implementation on your own data, or test it on other popular datasets such as the Google StreetView House Numbers, or the CIFAR-10 dataset for more general image recognition.


How To Back Up and Restore a Kubernetes Cluster on DigitalOcean Using Heptio Ark

Introduction

Heptio Ark is a convenient backup tool for Kubernetes clusters that compresses and backs up Kubernetes objects to object storage. It also takes snapshots of your cluster’s Persistent Volumes using your cloud provider’s block storage snapshot features, and can then restore your cluster’s objects and Persistent Volumes to a previous state.

StackPointCloud’s DigitalOcean Ark Plugin allows you to use DigitalOcean block storage to snapshot your Persistent Volumes, and Spaces to back up your Kubernetes objects. When running a Kubernetes cluster on DigitalOcean, this allows you to quickly back up your cluster’s state and restore it should disaster strike.

In this tutorial we’ll set up and configure the Ark client on a local machine, and deploy the Ark server into our Kubernetes cluster. We’ll then deploy a sample Nginx app that uses a Persistent Volume for logging, and simulate a disaster recovery scenario.

Prerequisites

Before you begin this tutorial, you should have the following available to you:

On your local computer:

  • The kubectl command-line tool, configured to connect to your cluster
  • The git command-line utility

In your DigitalOcean account:

  • A DigitalOcean Kubernetes cluster, or a Kubernetes cluster (version 1.7.5 or later) on DigitalOcean Droplets
  • A DNS server running inside of your cluster. If you are using DigitalOcean Kubernetes, this is running by default. To learn more about configuring a Kubernetes DNS service, consult Customizing DNS Service from the official Kubernetes documentation.
  • A DigitalOcean Space that will store your backed-up Kubernetes objects. To learn how to create a Space, consult the Spaces product documentation.
  • An access key pair for your DigitalOcean Space. To learn how to create a set of access keys, consult How to Manage Administrative Access to Spaces.
  • A personal access token for use with the DigitalOcean API. To learn how to create a personal access token, consult How to Create a Personal Access Token.

Once you have all of this set up, you’re ready to begin with this guide.

Step 1 — Installing the Ark Client

The Heptio Ark backup tool consists of a client installed on your local computer and a server that runs in your Kubernetes cluster. To begin, we’ll install the local Ark client.

In your web browser, navigate to the Ark GitHub repo releases page, find the latest release corresponding to your OS and system architecture, and copy the link address. For the purposes of this guide, we’ll use an Ubuntu 18.04 server on an x86-64 (or AMD64) processor as our local machine.

Then, from the command line on your local computer, navigate to the temporary /tmp directory and cd into it:

  • cd /tmp

Use wget and the link you copied earlier to download the release tarball:

  • wget https://link_copied_from_release_page

Once the download completes, extract the tarball using tar (note the filename may differ depending on the current release version and your OS):

  • tar -xvzf ark-v0.9.6-linux-amd64.tar.gz

The /tmp directory should now contain the extracted ark binary as well as the tarball you just downloaded.

Verify that you can run the ark client by executing the binary:

  • ./ark --help

You should see the following help output:

Output
Heptio Ark is a tool for managing disaster recovery, specifically for Kubernetes
cluster resources. It provides a simple, configurable, and operationally robust
way to back up your application state and associated data.

If you're familiar with kubectl, Ark supports a similar model, allowing you to
execute commands such as 'ark get backup' and 'ark create schedule'. The same
operations can also be performed as 'ark backup get' and 'ark schedule create'.

Usage:
  ark [command]

Available Commands:
  backup      Work with backups
  client      Ark client related commands
  completion  Output shell completion code for the specified shell (bash or zsh)
  create      Create ark resources
  delete      Delete ark resources
  describe    Describe ark resources
  get         Get ark resources
  help        Help about any command
  plugin      Work with plugins
  restic      Work with restic
  restore     Work with restores
  schedule    Work with schedules
  server      Run the ark server
  version     Print the ark version and associated image
. . .

At this point you should move the ark executable out of the temporary /tmp directory and add it to your PATH. To add it to your PATH on an Ubuntu system, simply copy it to /usr/local/bin:

  • sudo mv ark /usr/local/bin/ark

You’re now ready to configure the Ark server and deploy it to your Kubernetes cluster.

Step 2 — Installing and Configuring the Ark Server

Before we deploy Ark into our Kubernetes cluster, we’ll first create Ark’s prerequisite objects. Ark’s prerequisites consist of:

  • A heptio-ark Namespace

  • The ark Service Account

  • Role-based access control (RBAC) rules to grant permissions to the ark Service Account

  • Custom Resource Definitions (CRDs) for the Ark-specific resources: Backup, Schedule, Restore, Config

A YAML file containing the specs for the above Kubernetes objects can be found in the official Ark Git repository. While still in the /tmp directory, download the Ark repo using git:

  • git clone https://github.com/heptio/ark.git

Once downloaded, navigate into the ark directory:

  • cd ark

The prerequisite resources listed above can be found in the examples/common/00-prereqs.yaml YAML file. We’ll create these resources in our Kubernetes cluster by using kubectl apply and passing in the file:

  • kubectl apply -f examples/common/00-prereqs.yaml

You should see the following output:

Output
customresourcedefinition.apiextensions.k8s.io/backups.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/schedules.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/restores.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/configs.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/downloadrequests.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/deletebackuprequests.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/podvolumebackups.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/podvolumerestores.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/resticrepositories.ark.heptio.com created
customresourcedefinition.apiextensions.k8s.io/backupstoragelocations.ark.heptio.com created
namespace/heptio-ark created
serviceaccount/ark created
clusterrolebinding.rbac.authorization.k8s.io/ark created

Now that we’ve created the necessary Ark Kubernetes objects in our cluster, we can download and install the Ark DigitalOcean Plugin, which will allow us to use DigitalOcean Spaces as a backupStorageProvider (for Kubernetes objects), and DigitalOcean Block Storage as a persistentVolumeProvider (for Persistent Volume backups).

Move back out of the ark directory and fetch the plugin from StackPointCloud’s repo using git:

  • cd ..
  • git clone https://github.com/StackPointCloud/ark-plugin-digitalocean.git

Move into the plugin directory:

  • cd ark-plugin-digitalocean

We’ll now save the access keys for our DigitalOcean Space as a Kubernetes Secret. First, open up the examples/credentials-ark file using your favorite editor:

  • nano examples/credentials-ark

Replace <AWS_ACCESS_KEY_ID> and <AWS_SECRET_ACCESS_KEY> with your Spaces access key and secret key:

examples/credentials-ark
[default]
aws_access_key_id=your_spaces_access_key_here
aws_secret_access_key=your_spaces_secret_key_here

Save and close the file.

Now, create the cloud-credentials Secret using kubectl, inserting your Personal Access Token using the digitalocean_token data item:

  • kubectl create secret generic cloud-credentials \
  • --namespace heptio-ark \
  • --from-file cloud=examples/credentials-ark \
  • --from-literal digitalocean_token=your_personal_access_token

You should see the following output:

Output
secret/cloud-credentials created

To confirm that the cloud-credentials Secret was created successfully, you can describe it using kubectl:

  • kubectl describe secrets/cloud-credentials --namespace heptio-ark

You should see the following output describing the cloud-credentials secret:

Output
Name:         cloud-credentials
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
cloud:               115 bytes
digitalocean_token:  64 bytes

We can now move on to creating an Ark Config object named default. To do this, we’ll edit a YAML configuration file and then create the object in our Kubernetes cluster.

Open examples/10-ark-config.yaml in your favorite editor:

  • nano examples/10-ark-config.yaml

Insert your Space’s name and region in the highlighted fields:

examples/10-ark-config.yaml
---
apiVersion: ark.heptio.com/v1
kind: Config
metadata:
  namespace: heptio-ark
  name: default
persistentVolumeProvider:
  name: digitalocean
backupStorageProvider:
  name: aws
  bucket: space_name_here
  config:
    region: space_region_here
    s3ForcePathStyle: "true"
    s3Url: https://space_region_here.digitaloceanspaces.com
backupSyncPeriod: 30m
gcSyncPeriod: 30m
scheduleSyncPeriod: 1m
restoreOnlyMode: false

persistentVolumeProvider sets DigitalOcean Block Storage as the provider for Persistent Volume backups. These will be Block Storage Volume Snapshots.

backupStorageProvider sets DigitalOcean Spaces as the provider for Kubernetes object backups. Ark will create a tarball of all your Kubernetes objects (or some, depending on how you execute it), and upload this tarball to Spaces.

When you’re done, save and close the file.

Create the object in your cluster using kubectl apply:

  • kubectl apply -f examples/10-ark-config.yaml

You should see the following output:

Output
config.ark.heptio.com/default created

At this point, we’ve finished configuring the Ark server and can create its Kubernetes deployment, found in the examples/20-deployment.yaml configuration file. Let’s take a quick look at this file:

  • cat examples/20-deployment.yaml

You should see the following text:

examples/20-deployment.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  namespace: heptio-ark
  name: ark
spec:
  replicas: 1
  template:
    metadata:
      labels:
        component: ark
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8085"
        prometheus.io/path: "/metrics"
    spec:
      restartPolicy: Always
      serviceAccountName: ark
      containers:
        - name: ark
          image: gcr.io/heptio-images/ark:latest
          command:
            - /ark
          args:
            - server
          volumeMounts:
            - name: cloud-credentials
              mountPath: /credentials
            - name: plugins
              mountPath: /plugins
            - name: scratch
              mountPath: /scratch
          env:
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /credentials/cloud
            - name: ARK_SCRATCH_DIR
              value: /scratch
            - name: DIGITALOCEAN_TOKEN
              valueFrom:
                secretKeyRef:
                  key: digitalocean_token
                  name: cloud-credentials
      volumes:
        - name: cloud-credentials
          secret:
            secretName: cloud-credentials
        - name: plugins
          emptyDir: {}
        - name: scratch
          emptyDir: {}

We observe here that we’re creating a Deployment called ark that consists of a single replica of the gcr.io/heptio-images/ark:latest container. The Pod is configured using the cloud-credentials secret we created earlier.

Create the Deployment using kubectl apply:

  • kubectl apply -f examples/20-deployment.yaml

You should see the following output:

Output
deployment.apps/ark created

We can double-check that the Deployment has been successfully created using kubectl get on the heptio-ark Namespace:

  • kubectl get deployments --namespace=heptio-ark

You should see the following output:

Output
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
ark       1         1         1            0           8m

The Ark server Pod may not start correctly until you install the Ark DigitalOcean plugin. To install the ark-blockstore-digitalocean plugin, use the ark client we installed earlier:

  • ark plugin add quay.io/stackpoint/ark-blockstore-digitalocean:latest

You can specify the kubeconfig to use with the --kubeconfig flag. If you don’t use this flag, ark will check the KUBECONFIG environment variable and then fall back to the kubectl default (~/.kube/config).
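For example, to run the plugin installation against a specific cluster configuration (the path below is a hypothetical placeholder):

  • ark plugin add quay.io/stackpoint/ark-blockstore-digitalocean:latest --kubeconfig=/path/to/kubeconfig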

At this point Ark is running and fully configured, and ready to back up and restore your Kubernetes cluster objects and Persistent Volumes to DigitalOcean Spaces and Block Storage.

In the next section, we’ll run a quick test to make sure that the backup and restore functionality works as expected.

Step 3 — Testing Backup and Restore Procedure

Now that we’ve successfully installed and configured Ark, we can create a test Nginx Deployment and Persistent Volume, and run through a backup and restore drill to ensure that everything is working properly.

The ark-plugin-digitalocean repository contains a sample Nginx deployment called nginx-pv.yaml.

Let’s take a quick look:

  • cat examples/nginx-pv.yaml

You should see the following text:

Output
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-logs
  namespace: nginx-example
  labels:
    app: nginx
spec:
  storageClassName: do-block-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: nginx-logs
          persistentVolumeClaim:
            claimName: nginx-logs
      containers:
        - image: nginx:1.7.9
          name: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/var/log/nginx"
              name: nginx-logs
              readOnly: false
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer

In this file, we observe specs for:

  • An Nginx Deployment consisting of a single replica of the nginx:1.7.9 container image
  • A 5Gi Persistent Volume Claim (called nginx-logs), using the do-block-storage StorageClass
  • A LoadBalancer Service that exposes port 80

Create the deployment using kubectl apply:

  • kubectl apply -f examples/nginx-pv.yaml

You should see the following output:

Output
namespace/nginx-example created persistentvolumeclaim/nginx-logs created deployment.apps/nginx-deployment created service/my-nginx created

Check that the Deployment succeeded:

  • kubectl get deployments --namespace=nginx-example

You should see the following output:

Output
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1         1         1            1           1h

Once Available reaches 1, fetch the Nginx load balancer’s external IP using kubectl get:

  • kubectl get services --namespace=nginx-example

You should see both the internal CLUSTER-IP and EXTERNAL-IP for the my-nginx Service:

Output
NAME       TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
my-nginx   LoadBalancer   10.32.27.0   203.0.113.0   80:30754/TCP   3m

Note the EXTERNAL-IP and navigate to it using your web browser.

You should see the following NGINX welcome page:

Nginx Welcome Page

This indicates that your Nginx Deployment and Service are up and running.

Before we simulate our disaster scenario, let’s first check the Nginx access logs (stored on a Persistent Volume attached to the Nginx Pod):

Fetch the Pod’s name using kubectl get:

  • kubectl get pods --namespace nginx-example
Output
NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment-77d8f78fcb-zt4wr   1/1       Running   0          29m

Now, exec into the running Nginx container to get a shell inside of it:

  • kubectl exec -it nginx-deployment-77d8f78fcb-zt4wr --namespace nginx-example -- /bin/bash

Once inside the Nginx container, cat the Nginx access logs:

  • cat /var/log/nginx/access.log

You should see some Nginx access entries:

Output
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET /favicon.ico HTTP/1.1" 404 570 "http://203.0.113.0/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"

Note these down (especially the timestamps), as we will use them to confirm the success of the restore procedure.

We can now perform the backup procedure to copy all nginx Kubernetes objects to Spaces and take a Snapshot of the Persistent Volume we created when deploying Nginx.

We’ll create a backup called nginx-backup using the ark client:

  • ark backup create nginx-backup --selector app=nginx

The --selector app=nginx instructs the Ark server to only back up Kubernetes objects with the app=nginx Label Selector.

You should see the following output:

Output
Backup request "nginx-backup" submitted successfully. Run `ark backup describe nginx-backup` for more details.

Running ark backup describe nginx-backup should provide the following output after a short delay:

Output
Name:         nginx-backup
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=nginx

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2018-09-26 00:14:30 -0400 EDT
Completed:  2018-09-26 00:14:34 -0400 EDT

Expiration:  2018-10-26 00:14:30 -0400 EDT

Validation errors:  <none>

Persistent Volumes:
  pvc-e4862eac-c2d2-11e8-920b-92c754237aeb:
    Snapshot ID:        2eb66366-c2d3-11e8-963b-0a58ac14428b
    Type:               ext4
    Availability Zone:
    IOPS:               <N/A>

This output indicates that nginx-backup completed successfully.

From the DigitalOcean Cloud Control Panel, navigate to the Space containing your Kubernetes backup files.

You should see a new directory called nginx-backup containing the Ark backup files.
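If you'd prefer to verify this programmatically rather than through the Control Panel, here is a minimal Python sketch using the boto3 library (an assumption on our part; any S3-compatible client works against Spaces). It assumes a Space in the nyc3 region, a bucket name matching your Ark config, and Spaces keys exported in the hypothetical SPACES_KEY and SPACES_SECRET environment variables:

import os

import boto3

# Spaces is S3-compatible, so boto3's standard S3 client can talk to it.
client = boto3.client(
    "s3",
    region_name="nyc3",  # assumption: replace with your Space's region
    endpoint_url="https://nyc3.digitaloceanspaces.com",
    aws_access_key_id=os.environ["SPACES_KEY"],
    aws_secret_access_key=os.environ["SPACES_SECRET"],
)

# List all objects in the Space; Ark backups appear under the backup's name.
response = client.list_objects_v2(Bucket="space_name_here")
for obj in response.get("Contents", []):
    print(obj["Key"])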

Using the left-hand navigation bar, go to Images and then Snapshots. Within Snapshots, navigate to Volumes. You should see a Snapshot corresponding to the PVC listed in the above output.

We can now test the restore procedure.

Let’s first delete the nginx-example Namespace. This will delete everything in the Namespace, including the Load Balancer and Persistent Volume:

  • kubectl delete namespace nginx-example

Verify that you can no longer access Nginx at the Load Balancer endpoint, and that the nginx-example Deployment is no longer running:

  • kubectl get deployments --namespace=nginx-example
Output
No resources found.

We can now perform the restore procedure, once again using the ark client:

  • ark restore create --from-backup nginx-backup

Here we use create to create an Ark Restore object from the nginx-backup object.

You should see the following output:

Output
Restore request "nginx-backup-20180926143828" submitted successfully.
Run `ark restore describe nginx-backup-20180926143828` for more details.

Check the status of the restored Deployment:

  • kubectl get deployments --namespace=nginx-example
Output
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1         1         1            1           1m

Check for the creation of a Persistent Volume:

  • kubectl get pvc --namespace=nginx-example
Output
NAME         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
nginx-logs   Bound     pvc-e4862eac-c2d2-11e8-920b-92c754237aeb   5Gi        RWO            do-block-storage   3m

Navigate to the Nginx Service’s external IP once again to confirm that Nginx is up and running.

Finally, check the logs on the restored Persistent Volume to confirm that the log history has been preserved post-restore.

To do this, once again fetch the Pod’s name using kubectl get:

  • kubectl get pods --namespace nginx-example
Output
NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment-77d8f78fcb-zt4wr   1/1       Running   0          29m

Then exec into it:

  • kubectl exec -it nginx-deployment-77d8f78fcb-zt4wr --namespace nginx-example -- /bin/bash

Once inside the Nginx container, cat the Nginx access logs:

  • cat /var/log/nginx/access.log
Output
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"
10.244.17.1 - - [01/Oct/2018:21:47:01 +0000] "GET /favicon.ico HTTP/1.1" 404 570 "http://203.0.113.0/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/203.0.113.11 Safari/537.36" "-"

You should see the same pre-backup access attempts (note the timestamps), confirming that the Persistent Volume restore was successful. Note that there may be additional attempts in the logs if you visited the Nginx landing page after you performed the restore.

At this point, we’ve successfully backed up our Kubernetes objects to DigitalOcean Spaces, and our Persistent Volumes using Block Storage Volume Snapshots. We simulated a disaster scenario, and restored service to the test Nginx application.

Conclusion

In this guide we installed and configured the Ark Kubernetes backup tool on a DigitalOcean-based Kubernetes cluster. We configured the tool to back up Kubernetes objects to DigitalOcean Spaces, and back up Persistent Volumes using Block Storage Volume Snapshots.

Ark can also be used to schedule regular backups of your Kubernetes cluster. To do this, you can use the ark schedule command. It can also be used to migrate resources from one cluster to another. To learn more about these two use cases, consult the official Ark documentation.
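For example, a daily backup of the Nginx objects labeled earlier might look like the following (standard cron syntax; verify the flags against your Ark version's help output):

  • ark schedule create nginx-daily --schedule="0 1 * * *" --selector app=nginx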

To learn more about DigitalOcean Spaces, consult the official Spaces documentation. To learn more about Block Storage Volumes, consult the Block Storage Volume documentation.

This tutorial builds on the README found in StackPointCloud’s ark-plugin-digitalocean GitHub repo.


Python Engineering at Microsoft: Python in Visual Studio Code – November 2018 Release

We are pleased to announce that the November 2018 release of the Python Extension for Visual Studio Code is now available. You can download the Python extension from the Marketplace, or install it directly from the extension gallery in Visual Studio Code. You can learn more about Python support in Visual Studio Code in the documentation.

This was a quality-focused release: we closed a total of 28 issues, improving startup performance and fixing various bugs related to interpreter detection and Jupyter support. Keep on reading to learn more!

Improved Python Extension Load Time

We have started using webpack to bundle the TypeScript files in the extension for faster load times. This has significantly improved the extension’s download size, installation time, and load time. You can see the startup time of the extension by running the Developer: Startup Performance command. Below are the before and after times of extension loading (measured in milliseconds):

One downside to this approach is that reporting and troubleshooting issues with the extension is harder, as the call stacks output by the Python extension are minified. To address this we have added the Python: Enable source map support for extension debugging command. This command will load source maps for better error log output. This slows down the load time of the extension, so we provide a helpful reminder to disable it every time the extension loads with source maps enabled:

These download, install, and startup performance improvements will help you get to writing your Python code faster, and we have even more improvements planned for future releases.

Other Changes and Enhancements

We have also added small enhancements and fixed issues requested by users that should improve your experience working with Python in Visual Studio Code. The full list of improvements is listed in our changelog; some notable changes include:

  • Update Jedi to 0.13.1 and parso 0.3.1. (#2667)
  • Make diagnostic message actionable when opening a workspace with no currently selected Python interpreter. (#2983)
  • Fix problems with virtual environments not matching the loaded python when running cells. (#3294)
  • Make nbconvert in an installation not prevent notebooks from starting. (#3343)

Be sure to download the Python extension for Visual Studio Code now to try out the above improvements. If you run into any problems, please file an issue on the Python VS Code GitHub page.
