This article first appeared in issue 222 of .net magazine - the world's best-selling magazine for web designers and developers.
Over the past three issues we’ve learnt about the nuts and bolts of hosting in the cloud. We’ve seen how you can spin up instances in a script, deal with failure and start building a really resilient and scalable architecture with some pretty cheap services.
But things do get complex quickly. Once you’re running just a few instances, you need to manage them and understand how to make changes to the architecture when you’re in production.
Very quickly, the simple turn-off-and-on-able cloud starts to feel like a big, complex server and you wonder why you ever bothered. But there is a better way.
This month we’ll look at how to move to the cloud, what things you have to do and how to get the right tools and the right help.
Where is the cloud going?
Last year saw a rise in interest in migration to the cloud, and this year is seeing a rise in actual migrations. But many of these migrations are starting slowly rather than leaping in because companies aren’t willing to throw everything they have into something they don’t know about.
But there’s a push for it. An increasing number of people from website managers through to designers, IT support staff and developers are being asked by their boss: so, how do we use this cloud thing?
Some people are spinning things up themselves, and doing quite nicely but it’s easy to fall into one of the bear traps. An instance is not a server, and treating it as one will only lead to tears.
You don’t simply replace each physical box with a cloud instance. You have to be a little bit clever.
But there’s still this push. If you’ve been in the web dev world for more than a couple of years you’ll recognise this phase. Remember questions coming from managers: “So, how do we use a CMS then?” and “What’s your plan for HTML5?” The cloud question is now circulating more and more.
The first step is to know what cloud hosting is good at and what difference it makes to your site and your end users. (Your manager would call this business benefits!)
What’s it for? What does it give you?
Let’s answer the first question that people throw: my server is running fine and hasn’t crashed in five years. What’s the point?
Your first deployment to a cloud instance isn’t going to wow anyone – it runs just the same. But as you push further into the cloud and use more cloud technologies, you get more out of it.
It’s a big collaborative world out there these days. Everyone is crawling all over everyone else’s apps, working together to make the web a much better place.
In this kind of world, being able to copy your entire site, assets, DB and all, for another agency to work on in a few clicks makes all the difference. We’ve frequently created clones of a live web environment for a new developer or agency to get something working, saving us hours of configuration and fiddling around.
And what about all the dev servers that sit around all night long? Kill ’em. We wrote a tool that puts all our dev servers to sleep after 7pm, unless someone is working on them. Cloud management tools such as Scalr allow you to schedule servers to come up and go down.
Deploying a simple Facebook app can be done in minutes, and resized as your app gets more users. Just fire up an EC2 instance and resize with the traffic. Or, if it’s a little more complex, use another of the platform-as-a-service (PaaS) bits of the cloud.
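As a rough illustration of the idea, here is a minimal Python sketch of the decision such a tool has to make. It is not the tool described above: the server records, the field names and the 7am wake-up time are all assumptions made up for this example, and the actual stopping of instances is left to whichever provider API or management tool you use.

```python
from datetime import datetime, time

SLEEP_AFTER = time(19, 0)  # "after 7pm", as in the article
WAKE_AT = time(7, 0)       # assumed morning start; not from the article


def servers_to_stop(servers, now):
    """Return the names of dev servers that can be put to sleep.

    `servers` is a list of dicts with 'name', 'role', 'running' and
    'active_users' keys. This shape is hypothetical.
    """
    after_hours = now.time() >= SLEEP_AFTER or now.time() < WAKE_AT
    if not after_hours:
        return []
    return [
        server["name"]
        for server in servers
        if server["role"] == "dev"
        and server["running"]
        and server["active_users"] == 0  # leave it up if someone is working
    ]
```

Run from a scheduled job every few minutes, a function like this gives you the list of instances to stop; live servers and anything with an active user are left alone.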
The fact is that as web agencies and companies adopt these tools, you’ll be left behind unless you adopt them too.
There are two typical scenarios we see: you have a big legacy system and the servers are grinding to a halt; and, you’re building a new site and want the hosting to be the best.
If it’s a legacy system, you can start by simply firing up a few instances with GoGrid, Amazon, OVH or whoever you prefer. If you want to run like-for-like on the cloud with a legacy system that hasn’t changed in years, then it’s well worth “bundling” the app into an “image”.
This means creating a snapshot of a cloud server, which can be started in just one click and will contain everything the app needs to be able to run. It’s the fastest and easiest way to launch servers, so it really suits legacy apps that can’t be tweaked.
If you can tweak the app a little, then start with tools such as New Relic, xhprof for PHP or any similar profiling tool. When you need to scale, bottlenecks don’t always make the app fall over: sometimes they just cost you a hell of a lot, because you end up paying for more instances than the work really needs. Profiling first can save you a fortune.
A reasonable, but by no means aggressive, benchmark is that the page should take no longer to render than 500ms. Anything over this tends to feel sluggish and the user starts noticing the page load times.
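If you don’t have a profiler set up yet, a crude check against that 500ms budget is easy to sketch. The Python below is only an illustration: `timed` and `over_budget` are names invented for this example, and a real profiler such as the ones mentioned above will tell you far more about where the time goes.

```python
import time

BUDGET_SECONDS = 0.5  # the 500ms benchmark from the article


def timed(render, *args, **kwargs):
    """Call `render` and return (result, seconds taken)."""
    start = time.perf_counter()
    result = render(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def over_budget(elapsed, budget=BUDGET_SECONDS):
    """True if a render took longer than the budget allows."""
    return elapsed > budget
```

Wrapping your page-rendering function in `timed` and logging every request where `over_budget` is true gives you a first list of pages worth profiling properly.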
Shaving time off your app can be done by anything from rewriting slow methods to completely replacing entire sections of the page – how often is there something on the page that’s never clicked? What a waste of computing cycles.
If you don’t feel comfortable at this level, make sure your devs do or wheel in someone to help. We’ve been partners with a couple of cloud providers for a while so we get the chance to solve these problems all the time ... it’s amazing what improvements can be made to third-party code in a matter of hours.
If the app is new, the rule is: don’t think about the hosting last. Hosting and the cloud services you use are part of the app in the same way that Google Analytics is. If you’re dealing with uploaded files, stick them on static hosting. If it’s emailing, use a cloud email service.
Cloud services should always be “up”. Should, but they might not be. If you want absolute 100% uptime you need to guarantee it by handing over a little cash.
How “up” do you want it?
Don’t mistake “my server has been up for three years” for “it will be up for three years”. That’s the turkey farmer fallacy: the turkey gets fed every day and thinks that this will last forever, then one day the farmer kills the turkey.
Cloud instances are designed to be cheap to run and cheap to fail, which means that cloud providers are less concerned by each individual instance than the overall cloud.
This sounds a bit strange, but the upshot is simple: to guarantee uptime you have to build in failover and redundancy, and ensure that every point in your system will carry on just fine if something fails.
A simple way of designing this is to draw a diagram of the architecture you’re planning and point to everything and ask “What happens when that fails?”. Point to each instance, point to each service so that you have a plan.
Here’s an example: What happens if the S3 bucket is deleted? The answer: we need to detect this and recover all the assets from somewhere else.
What happens if the load balancer dies? The answer: either have two or detect it and bring up a new one.
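The same exercise can be written down and checked automatically. This is a minimal Python sketch, assuming you keep the architecture as a simple mapping from each component to its failure plan; the component names and the `missing_plans` helper are made up for illustration.

```python
def missing_plans(architecture):
    """Return the components with no answer to "What happens when that fails?".

    `architecture` maps a component name to a short description of the
    recovery plan, or to None or an empty string if nobody has written one.
    """
    return sorted(
        name
        for name, plan in architecture.items()
        if not plan or not plan.strip()
    )


# Example diagram: the first two plans come from the article, the rest are
# placeholders standing in for your own components.
architecture = {
    "S3 bucket": "detect the deletion and recover all assets from elsewhere",
    "load balancer": "run two, or detect the failure and bring up a new one",
    "web instance": "",
    "database": None,
}
```

Here `missing_plans(architecture)` returns the database and the web instance, which are the two boxes on the diagram that still need an answer.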
And so we move to the next stage – reworking your application to be better in the cloud.
Cheating on the cloud
This stage starts easy. We cheat a little. Said another way, you need to use some services that do what you do, just better.
That might not sound like a huge boon, but I’ve seen sites with a large number of sizable assets shift everything static to S3 and reduce the load on the server dramatically.
Many CMSes have support for uploading to static hosts built in, but if they don’t it can be built in with very little coding. Magento, for example, has a third-party plug-in, which we recently modified to ensure each frontend web server didn’t do any image resizing; it’s all done on upload. That drops the work of the web servers even further.
A second cheat you can do is for the database. Third-party database services such as Xeround, database.com and Amazon’s RDS give you a managed database with guaranteed uptimes and scalability. Database admin is fun but not for everyone, so if this isn’t your bag then use a service that does it better.
Xeround is a cloud database server that handles the scaling and failover for you. These aspects of hosting can be the hardest to achieve if you don’t have some serious, hardcore database admins working for you.
By simply signing up to Xeround, pointing your database at them and starting your app, one of the most common bottlenecks is no longer a problem.
A smaller, different shape
Big servers aren’t always best. Sometimes smaller is much more beautiful, and in hosting it pays to have more small servers than fewer big ones.
Here’s how it works: if you can afford one server that’s big or four that are small, which do you go for? You go for the four small ones because then if one server fails it doesn’t matter as much.
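The arithmetic behind that choice is simple enough to sketch in Python. It assumes all servers are the same size and share the load evenly, which is a simplification, and `capacity_left_after_failure` is a name invented for this example.

```python
def capacity_left_after_failure(server_count, failed=1):
    """Fraction of total capacity still available after `failed` servers die.

    Assumes identical servers sharing the load evenly.
    """
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    failed = min(failed, server_count)
    return (server_count - failed) / server_count
```

With one big server, `capacity_left_after_failure(1)` is 0.0: the site is down. With four small ones, `capacity_left_after_failure(4)` is 0.75, so you are slower but still serving pages while a replacement starts.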
So once you have your app on small, stable servers, what comes next? Restructuring it.
This involves moving things, looking at what your app is really doing and sharing the work around better so it’s more efficient. See ‘Lazy Web Servers’ left, for an example that’s far too common.
The upshot is that the web servers run faster because their requests run through more quickly and the slow work of connecting to APIs and doing any data crunching doesn’t slow down the live website.
This works in reverse. If you’re importing from Flickr, YouTube, Facebook, Twitter or any one of the billions of places you might be grabbing data from then don’t run this on web servers because they have enough to do. And here’s the thing that will swing it: it often means you can spend less on the servers as a result. Much less.
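To make the pattern concrete, here is a small Python sketch of a web handler that hands slow work to a background worker instead of doing it during the request. It is an in-process stand-in only: in a real deployment the queue would be a hosted queue service and the worker would run on its own instance, and `slow_import`, `worker` and `run_demo` are all names invented for this example.

```python
import queue
import threading


def slow_import(source):
    # Stand-in for a slow call to a third-party API.
    return f"imported from {source}"


def worker(jobs, results):
    """Background worker: pull jobs off the queue until told to stop."""
    while True:
        source = jobs.get()
        try:
            if source is None:  # sentinel value: no more work
                return
            results.append(slow_import(source))
        finally:
            jobs.task_done()


def run_demo(sources):
    jobs = queue.Queue()
    results = []
    background = threading.Thread(target=worker, args=(jobs, results))
    background.start()

    def handle_request(source):
        # The web-facing code only enqueues the job and replies at once.
        jobs.put(source)
        return "accepted"

    replies = [handle_request(source) for source in sources]
    jobs.put(None)      # tell the worker to finish
    background.join()   # wait for the queued work to complete
    return replies, results
```

The point of the design is that `handle_request` never waits on the third-party API, so a slow or failing import cannot hold up page requests.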
Build it up, knock it down
Another essential step is good build management. This isn’t strictly part of good cloud hosting, it’s just something that becomes more apparent if it takes you less time to start a server than make a build of your site.
To get your house in order you need a fully buildable app, which can be run in one line:
This has to set up the code, grab external libraries and install the database from a snapshot. Do this in one hit because as your cloud server farm scales up and down, and reacts to any failures by launching new instances, every single instance has to start up with your app running on it.
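What that single command looks like depends entirely on your project. As a hedged sketch only, a Python build script covering the three steps above might look like the following; every command in `BUILD_STEPS`, including the `restore_db.sh` script and the file names, is a hypothetical placeholder rather than anything from a real project.

```python
import subprocess

# Hypothetical commands: swap in whatever your project actually uses.
BUILD_STEPS = [
    ("set up the code", ["git", "pull", "--ff-only"]),
    ("grab external libraries", ["pip", "install", "-r", "requirements.txt"]),
    ("install the database from a snapshot", ["sh", "restore_db.sh", "snapshot.sql"]),
]


def build(steps=BUILD_STEPS, run=subprocess.run):
    """Run every build step in order and stop at the first failure.

    `run` is injectable so the script can be tested without touching
    the machine it runs on.
    """
    completed = []
    for label, command in steps:
        run(command, check=True)  # check=True raises if the command fails
        completed.append(label)
    return completed
```

Calling `build()` from the instance’s start-up script means a freshly launched server ends up in the same state as every other one, with no manual steps.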
Who’s in charge?
Well you are, but you don’t want to sit up all night checking on the servers and starting new ones if the old ones fail. You choose a tool to do that for you. Tools such as RightScale, Scalr, Enstratus and Kaavo allow you to design a cloud through a GUI, set up scaling rules and automate the behaviour of the cloud.
Batch updates can be run across the cloud instances without logging into each one, and detailed monitoring tells you what’s going on in your cloud farm.
But that isn’t the only way. Amazon offers CloudFormation, which encapsulates all your instances, load balancers, storage and scaling rules in one file so you can edit this in version control.
And there’s more. You can write your own scripts that start and stop instances, monitor the user traffic and scale according to demand. Scripts can deal with unusual cases better than out-of-the-box systems do simply because they can do anything.
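The heart of such a script is usually a small rule that turns traffic into an instance count. This Python sketch assumes you already know roughly how many requests per second one instance can handle; the limits of two and ten instances are arbitrary example values, and starting or stopping the instances is left to your provider’s API.

```python
import math


def desired_instances(requests_per_second, per_instance_capacity,
                      minimum=2, maximum=10):
    """Return how many instances should be running for the current traffic.

    `minimum` keeps a spare for failover; `maximum` caps the bill.
    Both defaults are illustrative assumptions.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    needed = math.ceil(requests_per_second / per_instance_capacity)
    return max(minimum, min(maximum, needed))
```

A monitoring loop can compare this number with the instances actually running and launch or terminate the difference. Unusual cases, such as a planned TV campaign, can be handled by adding extra rules around it.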
Now, setting this up takes a little time. The administration of servers is often the most costly component when you start out, and it’s the cost that people forget to budget for. You can launch 10 instances in under a minute, but it takes somewhat longer to set up the management of them.
The rule here: don’t forget the management.
A lot of people who have used some cloud services tend to stick with what they’ve tried. Any cloud architect worth their salt will be able to prototype new methods alongside old ones and give you a quantitative reason to choose one, such as:
- It makes the site faster, and that gets us more revenue
- It’s more resilient
- It scales faster
Cloud hosting and computing is moving at such a pace that you should always hook up with a company that does this day to day. Maybe at a local co-working place, or the local user group, or simply on a retainer.
Hosting is beyond being a big box of disks and silicon. Hosting is software now, and it always pays to be working on the latest and best version.