The Importance of Monitoring

Last updated November, 2016
tags: monitoring, uptime, IDS, configuration

If you have recently set up a new site, or are contemplating doing so, and you have not already given thought to monitoring, please read this; it is for you.

There are practical upsides to monitoring that take very little time to get started with and have tremendous business value. Many excellent posts on the subject tend to veer towards the technical, which is understandable given the technical nature of the web, but getting a sense of the importance of monitoring needn't be an overly technical activity.

In fact, for many of these points, there is a clear "build" or "buy" trade-off, and with monitoring there are such economies of scale available that in 90% of circumstances the answer is to buy—for very little cost you gain tremendous infrastructure and expertise.

The Basics: Access Logs

Access logs are the lines of text dumped from your server into a file with the date, time, address, request, and other information. The bare minimum required to monitor a website is to keep access logs. There's nothing controversial in saying so. Every significant web server we've ever seen includes logging capabilities, and getting comfortable finding, reading, and searching through those logs is the foundation of being a responsible system administrator.

Access logs are the primary source of truth for who has requested what and when, and should be of self-evident importance to anyone who goes through the trouble of authoring a web site in the first place.
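To make "reading the logs" concrete, here is a minimal sketch of pulling the address, time, request, and status out of one log line. It assumes the "combined" format that Apache and nginx commonly write by default; the sample line and the names LOG_PATTERN and parse_line are ours, for illustration only, and your server may be configured differently.

```python
import re
from datetime import datetime

# Assumed format: Apache/nginx "combined" log line. Adjust the pattern if
# your server writes something different.
LOG_PATTERN = re.compile(
    r'(?P<address>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Timestamps look like: 10/Nov/2016:13:55:36 +0000
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


# A made-up sample line using a documentation-only IP address.
SAMPLE = ('203.0.113.7 - - [10/Nov/2016:13:55:36 +0000] '
          '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"')

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Lines that do not match come back as None rather than raising, so one odd entry does not stop a scan of the whole file.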

If you know your way around a shell, the only thing stopping you from having baseline logging and monitoring is your time management skill set.

However, when either the security or DevOps communities start with logging, we take for granted the amount of time people have to actually review these logs. Further, not all teams have a dedicated system administrator.

In truth, getting to the point where you see your logs in aggregate on a graph is really what's important for most businesses.

Set up a spare monitor and have those graphs showing somewhere you check every day.

Your next priority should be an alerting setup that tells you when those lines start acting strangely.
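As a rough illustration of what "acting strangely" might mean, the sketch below groups requests by minute and flags any minute where the share of 5xx responses crosses a threshold. The 5% threshold and the function names are assumptions for the example, not a recommendation; a hosted service will do this with far more nuance.

```python
from collections import Counter


def error_rate_by_minute(records):
    """records: iterable of (minute_label, http_status) pairs.

    Returns {minute_label: fraction of requests with a 5xx status}.
    """
    totals = Counter()
    errors = Counter()
    for minute, status in records:
        totals[minute] += 1
        if 500 <= status <= 599:
            errors[minute] += 1
    return {minute: errors[minute] / totals[minute] for minute in totals}


def minutes_to_alert_on(records, threshold=0.05):
    """Return the sorted minutes whose 5xx rate exceeds the threshold."""
    rates = error_rate_by_minute(records)
    return sorted(minute for minute, rate in rates.items() if rate > threshold)


if __name__ == "__main__":
    # Hypothetical data: two healthy requests, then one failure out of two.
    sample = [("13:55", 200), ("13:55", 200), ("13:56", 500), ("13:56", 200)]
    print(minutes_to_alert_on(sample))
```

The per-minute rates are the same numbers you would put on the graph; the alert is just a rule applied on top of them.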

If you have at least a programmer on hand, you should be able to start doing something more dashboard-friendly with those logs right away. Services like Loggly or Elastic let you get up and running fairly quickly, and have pricing tiers that should be reasonable for most small teams. If you have too much traffic for the lower tiers, you hopefully have some means of converting that traffic to revenue to pay for the higher tiers.

If technical muscle isn't your restriction, setting up your own ELK stack is certainly a possibility, though really we are of the opinion that the maintenance of such a stack falls squarely in the "buy" category for most organizations.

NB: Although user analytics tracking also tells you who accessed your site and from where (e.g. see Google Analytics below), that is really not the same thing as access logs, and you should not confuse the two. Useful analytics systems show you a heavily filtered version of the world, and won't alert you to many problems.

Uptime and Performance

So you've gone to the trouble of creating a web site. Perhaps you take it for granted that the site will always be up and available? If you've paid a hosting provider, much of that responsibility falls on their heads. However, you still have to hold them accountable to their SLA, and if you have a backup plan for when your hosting provider eventually does go down, you'll need to be alerted when it actually happens.

If you are hosting your own site, we trust that you know well enough the risks of a crash and already see the benefit of uptime alerting.

Uptime alerting is as simple as pinging your site every minute to make sure it is still responding. If you are thinking, "That's easy! I can do that with a small cron job", you are correct! If you don't know what cron is, or come to realize that you might care about whether people in other parts of the world can also ping your site, you'll likely come across a monitoring service like Pingdom, or maybe PagerDuty.
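For the cron-inclined, a minimal sketch of such a check might look like the following. It uses only the Python standard library; the function name is_up and the ten-second timeout are our own choices, and the opener parameter exists only so the check can be exercised without touching the network.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx or 3xx status within the timeout.

    urlopen raises HTTPError (a URLError subclass) for 4xx and 5xx responses,
    and OSError covers timeouts and refused connections.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False
```

A script that calls is_up on your home page and sends a message when it returns False, scheduled once a minute, is the whole idea. What it cannot tell you is whether the site is reachable from anywhere other than the machine running the check.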

There are also services that can report how well the site is performing, network-wise, which can matter a great deal to anyone in e-commerce, where customers' perception of performance goes a long way towards influencing their buying behaviour.

The good news is that these services can be quick to set up, and do pretty much exactly what they say on the box. Both Pingdom and PagerDuty have starter tiers that are comfortably in the small-business range, and you get tremendous value from this very simple monitor.

The bad news... isn't really that bad. For highly specific types of configuration it can take some mucking about to make sure you are monitoring what you need to be, but we all need to be monitoring uptime, and these services shine there.

Configuration Monitoring

Ours is a configuration and security monitoring service, so clearly we have some thoughts here. If you've gone to the trouble of setting up HTTPS correctly and securely, then you should care that it stays that way. Monitoring is realistically the only way to do that, for a few reasons:

Certificate Expiry: Thankfully most CAs will send you an email letting you know this is happening, based on the time you registered and when their records say your time is approaching. Hopefully you also put it on your calendar. However, knowing that a certificate has expired doesn't tell you where that certificate has been deployed, and which services are using it. Monitoring your domains lets you be alerted when each is at risk.

Header Misconfiguration: If you have active development going on with your web server, or if you're just using the administrative console to "turn on secure headers", you are making changes that may impact visitors, and you need to know what those are. Also consider scenarios where other team members make these changes without proper consultation; digging into the headers returned from a web server is beyond the cursory scan most people give their site, and it might be too late by the time you notice.

Cipher Misconfiguration: Your team is less likely to change these than your hosting provider is, but if you run your own boxes this should definitely be on your radar. Cipher choice matters for a variety of reasons, but simply being alerted to changes and knowing what yours are is a good first step. Also, when new vulnerabilities emerge, having a monitoring tool that alerts you if you have suddenly become vulnerable is important.
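To give a feel for the first two checks, here is a small sketch using only the Python standard library. It reads a certificate's expiry date from a live TLS connection, converts it to days remaining, and reports which of a set of expected response headers are absent. The header list and all function names are assumptions for illustration; which headers you actually require is your call.

```python
import socket
import ssl
import time

# Assumed baseline; choose the headers that matter for your own site.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
)


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days from now until a notAfter string such as
    'Jan  5 09:34:43 2018 GMT'. A negative result means already expired."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return int((expires - now) // 86400)


def missing_headers(headers, expected=EXPECTED_HEADERS):
    """Return the expected header names not present (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in expected if name.lower() not in present]
```

Running days_until_expiry(fetch_not_after(host)) for every domain you serve is what answers the "where is this certificate deployed" question. Note that the default context verifies the certificate, so a certificate that has already expired shows up as a connection error rather than a negative number.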

Of course, security configuration is not the only type you should be monitoring. SEO and actual user behaviour are key to knowing that your site is working for the users you spend most of your time thinking about.

If you haven't already, signing up for a service like Google Analytics will take a great deal of stress off of your shoulders. Of course the learning curve is a bit steep, and there are professionals in the world who do nothing but set up and interpret GA for companies, but even with a minimal effort you can get useful user behaviour data, which is clearer than what you can derive from access logs (though again, they serve different purposes and should not be confused with each other).

The importance of setting up something like GA really comes to the fore when we talk about link configuration, by which we mean that annoying agreement of the web that says that if you publish a link /this/is/a/cool/article/ you will agree to keep it there while the rest of the world merrily links to it. Should you move it, you should do so with a redirect to where it or related material may be found (e.g. 301 /archive/this/was/a/cool/article/). A shocking number of web administrators seem to have forgotten this lesson lately, but if you care about either your users or your SEO you will find a way of being alerted to this type of slip.

GA can alert you to such a change. It's really not the perfect tool for the job, but we haven't found a better or more robust one. Still, it is a start.
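A low-tech complement is to keep a list of every path you have ever published and check periodically that each one still resolves or redirects. The sketch below is one way to do that with the Python standard library; the set of acceptable statuses and the function names are assumptions for the example.

```python
import http.client


def head_status(host, path, timeout=10):
    """Return the raw status code for a HEAD request.

    http.client does not follow redirects, so a 301 is reported as 301.
    """
    connection = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status
    finally:
        connection.close()


def broken_links(paths, get_status):
    """Return (path, status) pairs that neither resolve nor redirect permanently."""
    acceptable = {200, 301, 308}  # assumed policy: OK or permanent redirect
    results = []
    for path in paths:
        status = get_status(path)
        if status not in acceptable:
            results.append((path, status))
    return results
```

In practice you would call it as broken_links(paths, lambda p: head_status("example.com", p)), with your own host in place of example.com, and alert on anything it returns.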

Parting Remarks

Great artists may ship, but masters make it float.* Sadly, with most things in life we all seem to have an attention budget, and design and launch eat up most of it. Once a site is live, most of us conclude that all the value is there and everything else is window dressing. This is true precisely until it isn't. Hopefully you don't wait until your first unnoticed outage to get started.

*Intentionally mixed metaphors

