Sunday 24 February 2013

Azure downtime

Azure's storage system took a turn for the worse this weekend, reportedly because Microsoft forgot to renew the SSL certificate for the storage services.

The comments section of that article alone is full of people chastising Microsoft for making such an amateur mistake (and rightly so), but there are also many who use the incident as a reason to write off cloud computing as a whole.

Letting a certificate expire is a massive cock-up, and as a customer I fully expect Microsoft to publish a report on the whys, the hows, and what they'll do to stop it happening again. However, let's not kid ourselves that cock-ups don't happen when we host and maintain these services ourselves. I work for a small company that couldn't afford to build and maintain equivalents of the Azure services in-house. While it is frustrating and laughable that mistakes like this happen, at least when they happen on Azure there is a whole team of highly intelligent, well-paid developers and administrators working on solving the problem and preventing it from occurring again. In the self-hosted scenario, there's me. I am certain that if we were self-hosting I'd be able to solve any problem that came my way, although it would almost certainly take longer to fix purely because of the man hours available. What I don't always have is the time and resources to put in the work needed to stop a problem happening again, especially if it is something that is unlikely to recur.

If anything, incidents like this only serve to reinforce the message that cloud computing is not the silver bullet it is often portrayed to be. It is not a suitable platform for everything, nor is it devoid of fault or error. Just as you accept the added expense of hardware, staffing, and management when hosting on-site, you also need to accept that things are, to an extent, out of your control when you move to the cloud. Either way downtime will happen, and it will happen because someone made a stupid mistake. But at least if you're on Azure, you've got a team that most companies could never afford on hand to fix things when they go wrong, along with the resources to put processes and systems in place to prevent idiotic cock-ups like this from occurring again.

Going Cloudy Part 3 - Configuring your endpoints


In my previous post, I outlined the new architecture. You may have noticed that we no longer have URLs containing relative paths. This is because all of the components will be hosted on the same web role, and I don't want to deal with complicated startup scripts to configure IIS, so I want to use the Azure-provided methods as much as possible. When deploying multiple sites to a single web role in Azure, there are essentially two choices:
  • Virtual directories
  • Sites
Using virtual directories involves nesting all of the secondary services in the hierarchy of the main web role project (in this case the website). The upside of this is that I could retain my current URL structure. The main downside is the inheritance of web.config settings. I initially prototyped this setup and spent a lot of time overriding entries from the main web role's config in the web.configs of the web and REST services to prevent missing-assembly and conflicting-configuration errors.
An example of this is the following snippet from the assemblies section in the web.config of the main website.

          <assemblies>
            <add assembly="System.Web.Helpers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
            <add assembly="System.Web.Mvc, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35, processorArchitecture=MSIL" />
            <add assembly="System.Web.Abstractions, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
            <add assembly="System.Web.Routing, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
            <add assembly="System.Web.WebPages, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
            <add assembly="System.Data.Linq, Version=4.0.0.0, Culture=neutral, PublicKeyToken=B77A5C561934E089" />
          </assemblies>

Looks harmless enough, right? All of these assembly imports get inherited down to the monitoring site's child projects: the ASMX web service, the Site, and the REST service. The latter two are also ASP.NET MVC, so for them it's not a problem. The ASMX web service, however, doesn't ship with any of the System.Web.* assemblies listed above, so the inherited entries caused runtime errors post-deployment.
There were other items that caused me problems, but this is the most obvious example. In the end, I decided the virtual directory setup was too brittle.
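If you do go down the virtual-directory route, the usual workarounds are either to remove (or clear) the inherited entries in each child application's web.config, or to wrap the relevant sections of the parent's config in a <location path="." inheritInChildApplications="false"> element so they are never inherited in the first place. Below is a minimal sketch of the first approach; the child config shown is hypothetical, and the exact entries you need to remove depend on what the parent adds.

          <!-- Hypothetical web.config for the ASMX service (child application). -->
          <!-- Each <remove> must match an <add> inherited from the parent site's config; -->
          <!-- a single <clear /> can be used instead to drop every inherited entry at once. -->
          <configuration>
            <system.web>
              <compilation debug="false" targetFramework="4.0">
                <assemblies>
                  <remove assembly="System.Web.Mvc, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35, processorArchitecture=MSIL" />
                  <remove assembly="System.Web.Helpers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
                  <remove assembly="System.Web.WebPages, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
                </assemblies>
              </compilation>
            </system.web>
          </configuration>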
Using sites has the benefit of keeping the individual applications on the server separate, so a failure or change in one won't affect another. Because the applications are separate, we can't use paths to work out which application a request is for when it hits the server, so instead we use host headers.
The host header tells us which URL the user followed to reach us. By giving each service its own subdomain, we can tell from the host header which service the user is trying to access.
The ServiceDefinition.csdef for the project looks like this.
  <WebRole name="domain.Monitoring" vmsize="ExtraSmall">
    <Sites>
      <Site name="Web">
        <Bindings>
          <Binding name="Endpoint1" endpointName="Endpoint3" />
        </Bindings>
      </Site>
      <Site name="Site" physicalDirectory="../../../../../AzPublish/Site-Release">
        <Bindings>
          <Binding name="Endpoint1" endpointName="Endpoint1" hostHeader="domain.com" />
          <Binding name="Endpoint2" endpointName="Endpoint2" hostHeader="domain.com" />
        </Bindings>
      </Site>
      <Site name="Gateway" physicalDirectory="../../../../../AzPublish/webservice-Release">
        <Bindings>
          <Binding name="Endpoint1" endpointName="Endpoint1" hostHeader="webservice.domain.com" />
          <Binding name="Endpoint2" endpointName="Endpoint2" hostHeader="webservice.domain.com" />
        </Bindings>
      </Site>
      <Site name="RESTService" physicalDirectory="../../../../../AzPublish/Rest-Release">
        <Bindings>
          <Binding name="Endpoint1" endpointName="Endpoint1" hostHeader="rest.domain.com" />
          <Binding name="Endpoint2" endpointName="Endpoint2" hostHeader="rest.domain.com" />
        </Bindings>
      </Site>
    </Sites>
    <Endpoints>
      <InputEndpoint name="Endpoint1" protocol="http" port="80" />
      <InputEndpoint name="Endpoint2" protocol="https" port="443" certificate="DomainCertificate" />
      <InputEndpoint name="Endpoint3" protocol="http" port="22202" />
    </Endpoints>
  </WebRole>


The Binding elements within each Site define the endpoints on which that site can be reached, and the hostHeader attribute maps the application to the URL the user followed to get to us.
Notice how the domain.Monitoring project has no host header. This means that the project will respond on Endpoint3 (mapped to port 22202) for all traffic. Having it on a separate port means there is no chance of it ever grabbing traffic intended for the main projects.

You’ll notice that the Monitoring project is the main project in this deployment (I’ll go into the monitoring project itself when I cover the Traffic Manager). The reason for this is that I had some initial trouble getting the host header setup to work with the Azure Traffic Manager and thought this might help. That later turned out to be incorrect, but I see no harm in keeping the monitoring project as the main one here. Feel free to let me know if there are caveats to doing this that I am not aware of.

Only the main project in the web role is compiled when publishing, so the ServiceDefinition needs to point at locations containing the compiled output of each sub-project. To achieve this, I use file system deployment to publish the sub-projects to the AzPublish folder. It’s not ideal, and I intend to get all of this scripted in PowerShell before we go live. When I do, I will of course publish the scripts on this blog. If there is a better way, please use the comments.
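In the meantime, if you’re on Visual Studio 2012, a file-system publish like this can at least be captured in a publish profile so it is repeatable from the Publish dialog or from MSBuild. A minimal sketch follows; the file name and output path are purely illustrative and not taken from my project.

          <?xml version="1.0" encoding="utf-8"?>
          <!-- Hypothetical Site-Release.pubxml: publishes the Site project to the folder -->
          <!-- referenced by the physicalDirectory attribute in ServiceDefinition.csdef. -->
          <Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
            <PropertyGroup>
              <WebPublishMethod>FileSystem</WebPublishMethod>
              <LastUsedBuildConfiguration>Release</LastUsedBuildConfiguration>
              <ExcludeApp_Data>True</ExcludeApp_Data>
              <publishUrl>..\AzPublish\Site-Release</publishUrl>
              <DeleteExistingFiles>True</DeleteExistingFiles>
            </PropertyGroup>
          </Project>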

Tuesday 19 February 2013

Going Cloudy Part 2 - Out with the old, in with the new


In my previous post, I briefly went over the current situation and the rationale for moving to Microsoft’s Azure service.
I also promised details of the current architecture and what I have determined the new architecture should be. In this post I will attempt to explain how I got from old to new and why I made the decisions I did.

The old architecture

Here’s a quick diagram:
As stated in my previous post, the current server is a single VM with 2 GB of RAM and a couple of GHz of CPU. With those resources, the live environment runs:
  • An ASP.NET MVC website.
  • An ASMX web service.
  • An ASP.NET MVC-based REST service.
  • A SQL Server database.

All of this is duplicated in the staging environment, which lives on the same server. On top of that, there is SQL Server itself, an FTP server running through IIS, and an in-house import/export agent which processes all incoming and outgoing data.
There are also a number of support tools: LINQPad, SSMS, Notepad++, off-site file backup, and various other minor services.
What this all means is that whenever I RDP on to change anything, it puts a noticeable strain on the system.
Bear in mind that the current service doesn’t have many users, say a dozen or so. That number will increase tenfold this year, so clearly the current setup just will not do.

Key requirements
The new infrastructure has a number of key requirements. Beyond the obvious ones of speed and reliability, there are the following.
Speed must be the same regardless of where the user is.
This means hosting the website and web services in both the EU and the US, which in turn means two databases that need to be kept in sync.
It needs to respond to rises and falls in demand.
This is what the cloud does best: we can add and remove instances in a matter of minutes. There is also the possibility of using the Autoscaling Application Block to drop a region down to a single instance during its night time, halving the service costs during those hours.

The new infrastructure

The new infrastructure is distributed, fairly fault tolerant, and should be easy enough to scale - all those cloudy terms that get overused. Over the next few posts, I will go over my reasoning for laying it out this way and the problems I encountered while creating it, roughly in this order:
  • Changes to URL scheme.
  • FTP server and REST service configuration.
  • Instance configuration, the Monitoring project and the Azure Traffic Manager.
  • SQL Azure and using Data Sync.

Tuesday 12 February 2013

Going Cloudy Part 1 - The Beginning

If you believe the hype, "Cloud" is the future and everything will end up there eventually.
It's easy to think that "Cloud" is just another term for deploying your applications and services on virtual servers hosted in someone else's data center; the mainstream media make this mistake all the time.

Making full use of the cloud, in my opinion at least, means architecting and designing your system to take advantage of the availability and scalability that the cloud gives you.

Designing for the cloud is one thing; migrating an existing application to the cloud and taking advantage of all it has to offer is another entirely. Recently, I have been tasked with planning and carrying out such a migration.

The application in question is currently hosted on a VM with just 2 GB of RAM. This VM runs IIS, which hosts the application's three web-based components. It also hosts the database on SQL Server, along with a number of additional tools and services. The user base is currently fairly small, limited to a dozen or so users located in the UK, China, the USA, and South Africa.

The customer base is expected to expand massively this year, with the number of users from the US pushing over a hundred and a number of users coming on board from other countries around the world.

Clearly, the current server won't handle this, and Cloud is the way forward for this customer.

Amazon or Azure?

Amazon has a myriad of services that fill different niches and markets but don't seem very joined up. In my opinion, while Amazon's cloud is much more fully featured and mature than Azure, it is a big bucket of disparate services trying to be everything to everyone.
Azure, on the other hand, while newer and not as feature-complete as AWS, does what it does very well. All of Azure's services are aimed at letting developers build scalable, highly available applications; the developer experience is first class, and you can get started in minutes.
Microsoft finally recognizes that enticing developers is the key to success, and they have embraced other languages in Azure, creating SDKs for Node, PHP, Java, and Python. Azure is a great platform to develop for, and not just for .NET developers.

So Azure it is.

Next time, we'll study the current architecture of the application and look at how it needs to change in order to make the move and meet the customer's requirements.

Disclaimer

I'm not a cloud expert, so don't take what I am doing as best practice. This series is intended to help others learn from my experiences. If something can be done better, please use the comments to impart your wisdom.