Thursday, 9 October 2014

Copying records between tables in different Azure accounts : The Next Generation

A few months back I posted a small piece of code for copying objects between Azure Table Storage accounts. I have two bugbears with the code I previously posted:

  1. It's all very custom: you need to give it the type that is in the table and add lambdas to make sure it doesn't try to select the wrong object.
  2. Due to a change in the Azure Storage Library, it no longer works.

I hadn't used this script in a while, but I now have an impending need for something like it, only better: something that will let me copy the contents of all the tables, or a defined subset of tables, in a given account. Enter the DynamicTableEntity, which allows you to get an entry from a table as a dynamic object without knowing its type up front.
To run the code, open up LINQPad and add a reference to the Windows Azure Storage library; the best way to do this is via NuGet.

void Main()
{
  var srcClient = CreateClient("source account connection string");
  var destClient = CreateClient("destination account connection string");
  
  var mappings = new List<Tuple<string,string>>();
  
  //Manually setup mappings.
  //mappings.Add(new Tuple<string,string>("table1","table1copy"));
  //mappings.Add(new Tuple<string,string>("table2","table2copy"));
  
  //Copy all tables from the src account in to identically named tables in the destination account.
  var tables = srcClient.ListTables(null, new TableRequestOptions(){PayloadFormat = TablePayloadFormat.JsonNoMetadata});
  mappings = tables.Select (t => new Tuple<string,string>(t.Name,t.Name)).ToList();

  Copy(srcClient,destClient,mappings);
}

public void Copy(CloudTableClient src, CloudTableClient dest, List<Tuple<string,string>> mappings) {

  mappings.ForEach(x=>{
    var st = src.GetTableReference(x.Item1);
    var dt = dest.GetTableReference(x.Item2);
    dt.CreateIfNotExists();
    
    var query = new TableQuery<DynamicTableEntity>();

    foreach (var entity in st.ExecuteQuery(query))
    {
      dt.Execute(TableOperation.InsertOrReplace(entity));
    }
  });
}

public CloudTableClient CreateClient(string connString){
  var account = CloudStorageAccount.Parse(connString);
  return account.CreateCloudTableClient();
}

In the above code, we create a CloudTableClient for each of the source and destination accounts, then build a mapping of source tables to destination tables.
We can do this manually if we only want to copy some tables, or if we want the destination tables to have different names from the source tables.
Alternatively, we can get a list of all the tables from the source and use that to build a 1:1 map, which has the effect of copying all items in all tables from the source account to the destination.
The Copy method simply takes the clients and the mappings, iterates over each source table, reads its entities, and saves them to the corresponding destination table.

Note: The code above is horribly inefficient for copying large amounts of data as it inserts each entity individually. In a follow-up post, I'll make this more efficient by making use of the TableBatchOperation.
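For the impatient, the batched version will look roughly like this. This is only a sketch (I haven't run it against a real account yet): a batch can only contain entities that share a partition key and at most 100 operations, and grouping like this buffers each table's entities in memory.

public void CopyBatched(CloudTableClient src, CloudTableClient dest, List<Tuple<string,string>> mappings) {

  mappings.ForEach(x=>{
    var st = src.GetTableReference(x.Item1);
    var dt = dest.GetTableReference(x.Item2);
    dt.CreateIfNotExists();

    var query = new TableQuery<DynamicTableEntity>();

    //Group by partition key; a batch cannot span partitions.
    foreach (var partition in st.ExecuteQuery(query).GroupBy(e => e.PartitionKey))
    {
      var batch = new TableBatchOperation();
      foreach (var entity in partition)
      {
        batch.Add(TableOperation.InsertOrReplace(entity));
        //A batch is limited to 100 operations, so flush and start a new one.
        if (batch.Count == 100)
        {
          dt.ExecuteBatch(batch);
          batch = new TableBatchOperation();
        }
      }
      if (batch.Count > 0)
      {
        dt.ExecuteBatch(batch);
      }
    }
  });
}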

Sunday, 31 August 2014

Using Custom Model Binders in ASP.Net MVC

I answered a question on Reddit this week from someone starting out in MVC who had read an article about model binding which was mostly correct, but made using custom binders look like they require more code than they actually do, so I thought it was worth a post to clear that up.

What is (Custom) Model Binding?

Model Binding is the process through which MVC takes a form post and maps all of the form values into a custom object, allowing you to have a POST action method which takes in a ViewModel and have it automagically populated for you. Custom Model Binders allow you to insert your own binders for particular scenarios where the default binding won't quite cut it.

Creating our custom binder

We have the following typical example ViewModel:
    public class MyViewModel
    {
        public string MyStringProperty { get; set; }
    }
It's just a class, nothing special about it at all. Now we want to manually handle the binding of this model because we want to add some text to the end of MyStringProperty when it gets bound. This is unlikely to be something you would want to do in real life, but we're just proving the point here.
This is our binder:
    public class MyViewModelBinder:IModelBinder
    {
        protected System.Collections.Specialized.NameValueCollection Form { get; set; }

        private void Initialise(ControllerContext controllerContext, ModelBindingContext bindingContext)
        {
            Form = controllerContext.HttpContext.Request.Form;
        }

        public object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
        {
            Initialise(controllerContext, bindingContext);
            var msp = Form["MyStringProperty"];
            return new MyViewModel {MyStringProperty = msp + " from my custom binder"};
        }
    }
Model Binders need to implement IModelBinder, which means having a BindModel method. That gives you access to the controllerContext, from which you can get at the HttpContext, and to the bindingContext, which admittedly I have never had to use.
In our binder, we just manually pick up the MyStringProperty value from the form, assign it to a new instance of our object and return it, adding our incredibly important piece of text to the end of the retrieved value.
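As an aside, you don't have to go straight to Request.Form; the bindingContext's ValueProvider aggregates form, route and query string values for you. A variant of the same binder using it would look something like this (a sketch, not the binder used in the rest of this post):

    public class MyViewModelValueProviderBinder : IModelBinder
    {
        public object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
        {
            // ValueProvider looks across form, route data and query string values.
            var result = bindingContext.ValueProvider.GetValue("MyStringProperty");
            var msp = result == null ? string.Empty : result.AttemptedValue;
            return new MyViewModel { MyStringProperty = msp + " from my custom binder" };
        }
    }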

Using our Custom Binder

There are 2 ways we can use our custom binder, which one we use depends on each scenario. If we need to override the binding of a class for a particular Action method, we can use the ModelBinder attribute on the relevant parameter of the Action Method:
        [HttpPost]
        public ActionResult Index([ModelBinder(typeof(MyViewModelBinder))]MyViewModel model)
        {
            return View(model);
        }
This will apply our custom binder to this parameter (of type MyViewModel) for this action only; no other actions or controllers will be affected.
Alternatively, if we want to apply our custom binder to MyViewModel globally within the application, we can add the following line to Application_Start in global.asax.cs:
        protected void Application_Start()
        {
            AreaRegistration.RegisterAllAreas();
            FilterConfig.RegisterGlobalFilters(GlobalFilters.Filters);
            RouteConfig.RegisterRoutes(RouteTable.Routes);
            BundleConfig.RegisterBundles(BundleTable.Bundles);

            ModelBinders.Binders[typeof(MyViewModel)] = new MyViewModelBinder();
        }
Using this method, everywhere a parameter of type MyViewModel is encountered on an action method, our custom binder will be invoked instead of the standard one. Because this applies globally, we do not need the ModelBinder attribute on our action method, so the overridden behaviour is completely transparent to the controller, promoting code reuse and keeping model binding logic where it belongs.

Wednesday, 6 August 2014

API Head-to-head: AWS S3 Vs Windows Azure Table Storage

Recently, I was experimenting with using S3 as a tertiary backup for my photos, an honour which eventually went to Azure because it was cheaper and I am more familiar with the Azure APIs, as I use them in my day job.

I thought I’d take a deeper look at both APIs and see how they compare. I’ll go through some standard operations, comparing the amount of code required to perform the operation.

If you want a comparison of features, there are plenty of blog posts on the subject; just Bingle it.

All the code in this test is being run in Linqpad, using the AWS SDK for .Net and Windows Azure Storage Nuget packages.

Create the client

Both Azure and S3 have the concept of a client; this represents the service itself and is where you provide credentials for accessing the service.

Azure

var account = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse("connectionstring");
var client = account.CreateCloudBlobClient();

S3

var client = AWSClientFactory.CreateAmazonS3Client("accessKey", "secret",RegionEndpoint.EUWest1);

S3 wins on lines of code, but I don't like having to declare the datacenter the account is in. In my opinion, the application shouldn't be aware of this. 1 point to Azure.

Creating a container

This is a folder: Azure refers to it as a container, S3 calls it a bucket.

Azure

var container = client.GetContainerReference("test-container");
container.CreateIfNotExists();

S3

try
{         
 client.PutBucket(new PutBucketRequest { BucketName = "my-testing-bucket-123456", UseClientRegion = true});
}
catch (AmazonS3Exception ex)
{
 if(ex.ErrorCode != "BucketAlreadyOwnedByYou") {
  throw;
 }
}

S3 loses big time on simplicity here. To my knowledge, this is the only way to do a blind create of a container, that is, creating it without knowing up front whether it already exists. Azure makes this trivial with CreateIfNotExists. 2 points to Azure.

Uploading a file

Azure

var container = client.GetContainerReference("test-container");
var blob = container.GetBlockBlobReference("testfile");
blob.UploadFromFile(@"M:\testfile1.txt",FileMode.OpenOrCreate);

S3

var putObjectRequest = new PutObjectRequest {BucketName = "my-testing-bucket-123456", FilePath = @"M:\testfile.txt", Key = "testfile", GenerateMD5Digest = true, Timeout=-1};
var upload = client.PutObject(putObjectRequest);
They’re pretty much equal here, but the S3 code is more verbose. I like the idea of getting a reference to a blob while not knowing if it actually exists or not.

List Blobs

Azure

var container = client.GetContainerReference("test-container");
var blobs = container.ListBlobs(null, true, BlobListingDetails.Metadata);
blobs.OfType<CloudBlockBlob>().Select (cbb => cbb.Name).Dump();

S3

var listRequest = new ListObjectsRequest(){ BucketName = "my-testing-bucket-123456"};
client.ListObjects(listRequest).S3Objects.Select (so => so.Key).Dump();

In terms of complexity, they're pretty even here too. Azure has one more line, but it's not a difficult one. Notice that whereas with Azure we get a reference to a container and then perform operations against that reference, with AWS every request is self-contained, so you end up explicitly telling the client the bucket name for every operation. Point to Azure.

Deleting a Blob

Azure

var dblob = container.GetBlockBlobReference("testfile");
dblob.Delete();

S3

var delRequest = new DeleteObjectRequest(){ BucketName = "my-testing-bucket-123456", Key="testfile"};
client.DeleteObject(delRequest);

Neither code is particularly complicated here, but I prefer Azure's simplicity with the container and blob reference model, so point to Azure.

Delete a Container

Azure

var container = client.GetContainerReference("test-container");
container.Delete();

S3

var delBucket = new DeleteBucketRequest(){ BucketName = "my-testing-bucket-123456"};
client.DeleteBucket(delBucket);

Again, pretty equal. To micro-analyse the lines, you could say that for Azure, you've got one potentially reusable line and one throw-away line. With S3, they're both throw-away. But in reality, unless you're doing thousands of consecutive operations, it doesn't really matter.

Conclusion

In terms of complexity, Azure’s and S3’s APIs are pretty much equal, but it’s easy to see where they each have their uses. Azure’s API is a much thicker abstraction over REST, whereas the S3 API is such a thin-veneer that you could imagine a home-grown API not turning out that differently (but most likely not as reliable).

In my mind, if you’re doing lots of operations against lots of different blobs and containers then S3’s API is more suitable as each operation is self-contained and there are no references to containers or blobs hanging around.

If you’re doing operations which share common elements, such as performing numerous operations on a blob or working with lots of blobs within a few containers, Azure’s API seems better suited as you create the references and then reuse them, reducing the amount of repeated code.

Bonus Section

If you could be bothered to read past my conclusion, congratulations on your determination! The comparative speed of Azure and AWS has been done to death, but I couldn’t resist getting my own stats.

These are ridiculously simple stats, essentially Stopwatch calls wrapped around the code in this post. The file I am uploading is only 6k. The simple reason for this is that everyone tests how these services handle lots of large objects, but no one seems to cover the probably more common scenario of users uploading very small files. The average size is probably higher than 6kb, but this is what I’ve got hanging around so this is what I’m using.
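For what it's worth, the harness is nothing more elaborate than this sort of thing wrapped around each block of code from earlier in the post (shown here around the Azure container creation):

var sw = System.Diagnostics.Stopwatch.StartNew();

var container = client.GetContainerReference("test-container");
container.CreateIfNotExists();

sw.Stop();
sw.ElapsedMilliseconds.Dump(); //Dump() being Linqpad's output method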

So here are my extremely simple and probably not at all reliable benchmarks.

Operation         S3 (ms)  Azure (ms)
Create Container  573      279
Upload 6Kb file   99       55
List Blobs (1)    41       103
Delete Blob       55       45
Delete Container  221      38

All times are in milliseconds. I've got to admit, I was expecting a more even spread here. Azure is significantly faster at creating and deleting containers and at uploading the file. It is also faster at deleting a blob, but the difference is insignificant. S3 wins significantly at listing blobs.

Not covered in this post: both APIs also have the Begin/End style of async operations, and Azure has the bonus of async operations based on the async/await pattern; I may do another post on that in the future.
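To give a flavour of the latter, an async upload with the Azure library looks something like this (a quick sketch, not part of the benchmarks above):

async Task UploadAsync(CloudBlobClient client)
{
  var container = client.GetContainerReference("test-container");
  await container.CreateIfNotExistsAsync();

  var blob = container.GetBlockBlobReference("testfile");
  using (var stream = File.OpenRead(@"M:\testfile1.txt"))
  {
    await blob.UploadFromStreamAsync(stream);
  }
}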

TL;DR; Azure's API is in my opinion a better abstraction and it's faster for most operations.

Friday, 18 July 2014

Upgraded to Azure Storage Emulator 3.2, where have all my tables gone?

In an attempt to solve a 400 error accessing tables on the Azure Storage Emulator 3.0 today, I upgraded to 3.2 using the Web Platform Installer. This resulted in a kind of good news, bad news situation.

Good - The error stopped happening.
Bad - Where the f**k have all my tables gone!

I'll be buggered if I'm recreating and repopulating them all, so I went hunting. I managed to find the emulator databases in C:\Users\<username>\. In that directory you'll find mdf files called WAStorageEmulatorDb**.mdf, where ** is the version number. I had ones ending in 22, 30 and 32. Each will be accompanied by a _log file.

I loaded them up in Linqpad and the schemas looked the same, so for a punt I just renamed the files ending in 32 to something else and then renamed the files ending in 30 to 32.

Start up the emulator and everything is present again. That saved me a few hours!

Wednesday, 19 March 2014

Copying records between tables in different Azure accounts

Today I had to quickly throw up a new instance of a customer's service in Hong Kong as they've got a big demo event coming up and want things to be as quick as possible. I haven't quite got things to a point where I can have multiple geographically distributed instances of the service all happily talking to each other and sharing data, so this instance is its own little island, a completely separate instance from the main one in the EU.

Deploying the new Cloud Service was easy.
Taking a backup of the EU database and deploying it to Hong Kong was also easy.

However, recently I've been making increasing use of Azure Table storage for trivial data storage scenarios where the data isn't relational and will eventually need to be shared amongst multiple instances without waiting for a database sync. It was at this point that I realised I have no way of copying data from one storage account to another.
Time to correct that!
public void Transfer<T>(Microsoft.WindowsAzure.Storage.CloudStorageAccount fromAcc, Microsoft.WindowsAzure.Storage.CloudStorageAccount toAcc, string table, Expression<Func<T,bool>> expr) where T: TableServiceEntity {
  
  var fromTC = fromAcc.CreateCloudTableClient();
  var fromT = fromTC.GetTableReference(table);
  
  var toTC = toAcc.CreateCloudTableClient();
  var toT = toTC.GetTableReference(table);
  toT.CreateIfNotExists();
  
  var fromContext = fromTC.GetTableServiceContext();
  var toContext = toTC.GetTableServiceContext();
    
  var fromData = fromContext.CreateQuery<T>(table).Where(expr);

  foreach (var item in fromData)
  {
    toContext.AttachTo(table,item);
    toContext.UpdateObject(item);
  }
  toContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
}


This Transfer method takes in a from account, a to account and the name of the table.
The last parameter is an expression for the where clause. This is for scenarios where the same table contains multiple types of objects and you just want to query out the ones of a particular type for transfer using whatever clause is appropriate.

T must derive from TableServiceEntity and be the type of the object from which the record originated, or one that is similarly shaped.

The method is quite straightforward: it just fires up two table clients, gets a reference to the table specified by the table parameter, creates it on the receiving end if it doesn't exist (I think it's safe to assume that it already exists at the source end), queries out the data, then attaches it to the destination context and saves changes.
This upserts all of the data into the destination table.

Usage is simple:
var fromAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=accountname;AccountKey=accesskey");
var toAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=accountname;AccountKey=accesskey");

Transfer<MyProject.MyType1>(fromAccount, toAccount, "sharedtable", p=>p.PartitionKey == "Type1");
Transfer<MyProject.MyType2>(fromAccount, toAccount, "sharedtable", p=>p.PartitionKey == "Type2");
Transfer<MyProject.SomeOtherType>(fromAccount, toAccount, "someothertype", p=>p.PartitionKey != "");
Put all this together in Linqpad and you've got a simple way to transfer records between accounts on an ad hoc basis. As expected, it works with the Storage Emulator so you can use it to clone the contents of a production account down to your local dev machine and vice versa.
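For the emulator end of that, the account is just the well-known development storage account, so either of these works in place of one of the Parse calls above:

var localAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.DevelopmentStorageAccount;
//or equivalently
var localAccount2 = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse("UseDevelopmentStorage=true");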

Saturday, 21 December 2013

JSON vs XML: Challenging my assumptions

I was recently (today actually) working on optimising a particular section of my project. This section is basically a Q & A that uses a piece of server-generated XML which is placed into a Razor view, where some Javascript works with it to generate the input fields for the user.

But XML is so 2010, right? If I swapped the XML for some JSON, the payload would be smaller, it would generate faster, and the Javascript would be able to work with it faster, right?

Let's test those assumptions one by one shall we?

To do that, I duplicated the method that populates an object and serializes it to XML, changed the duplicate to serialize to JSON instead, and ran them both at the same time so I could compare them side by side.

Payload size

I added a call to Encoding.ASCII.GetByteCount() around the serialized JSON and XML results and dumped this to the Trace window. The results are below.

Type  1     2     3     4     5     6     7
XML   3338  4133  3487  3255  3465  3194  1138
Json  2266  2717  2292  2110  2234  2061  621

This is in bytes, so the real-world difference between the XML and the JSON in this case is at most a single kilobyte. This isn't always the case; I've swapped out XML for JSON in the past where the JSON has been many times smaller.
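For anyone wanting to reproduce the measurement, it amounts to something like this (a sketch; BuildQuestionAndAnswerModel stands in for my real data access call, and I'm assuming Json.NET for the JSON side):

var model = BuildQuestionAndAnswerModel(); //stand-in for the real method that populates the object

//Serialize to XML
var xmlSerializer = new System.Xml.Serialization.XmlSerializer(model.GetType());
string xml;
using (var writer = new System.IO.StringWriter())
{
  xmlSerializer.Serialize(writer, model);
  xml = writer.ToString();
}

//Serialize to JSON (assuming Json.NET)
var json = Newtonsoft.Json.JsonConvert.SerializeObject(model);

System.Diagnostics.Trace.WriteLine("XML bytes: " + System.Text.Encoding.ASCII.GetByteCount(xml));
System.Diagnostics.Trace.WriteLine("JSON bytes: " + System.Text.Encoding.ASCII.GetByteCount(json));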

Generating the output

Next job is to test how quickly the output is generated. Keep in mind that the method I'm testing includes other data access logic so these timings are not purely serialization. But as both will be doing the same work, it won't hurt to have that included as well to see how each performs in the real world.

My initial results looked rather promising; see the table below:
Type  1     2     3     4     5     6     7     Avg
XML   87.4  31.9  33.6  29.6  31.6  29.6  45.9  41.37143
Json  38.9  20.7  27.8  21.2  23.8  24.4  23.5  25.75714

Across the whole Q & A session, JSON comes out not far off 50% faster. So I should definitely swap it, right?

But wait: the method that we're capturing also includes some data access. The JSON method runs second; could it be benefiting from the XML method's work?
To test this out, I swapped them round so the JSON method ran first. Below is the aggregated table from both runs.
Type         1     2     3     4     5     6     7     Avg
XML First    87.4  31.9  33.6  29.6  31.6  29.6  45.9  41.37143
XML Second   39.2  25.3  26.7  24.3  26.5  37.1  21.6  28.67143
JSON First   83.5  28.9  30.4  30.9  32.7  46.3  47.7  42.91429
JSON Second  38.9  20.7  27.8  21.2  23.8  24.4  23.5  25.75714
As you can see from the averages, I was right to be suspicious. The first method to run is always slower than the second, due to things like EF caching records. In reality, the difference between the two is so small that no one is ever going to notice.

These two simple benchmarks, which took me about 10 minutes to complete, show that substituting the current XML payload for JSON would be a micro-optimisation. I've got much bigger fish to fry, so it is not worth my time to do this work.
Will swapping make things faster? Absolutely. Will anyone notice? Not remotely.

You'll notice I didn't perform the third test of measuring how the Javascript handled JSON as opposed to XML. I suspect that would show a modest improvement as well, but based on these numbers, my best course of action is to not waste any more time on this path.

Tuesday, 3 December 2013

Windows 8.1 - The good, the bad, and the missing.

A little over a year after Windows 8, it gets an upgrade with the release of 8.1. Is this just a Service Pack by any other name, or does it provide real improvement over its predecessor that makes it worthy of being a numbered release?

 First, the good.

 Real effort has been made in 8.1 to reduce the jarring effect of moving between Windows' Classic and Modern UIs. Here are the main improvements:

  • Search has been altered so that when you search, the search window and initial results appear as a flyout on the right side of the screen rather than taking over the entire screen.
  • The desktop background is now used as the background of the start screen. I thought this sounded a lot like a gimmick when I first heard about it, but it genuinely does reduce the cognitive dissonance experienced when moving between the two UIs.
  • The Start button is back. I don't personally care as I've gotten used to it not being there, but hopefully its return will go some way towards appeasing the masses.
  • Better help on install. When you first installed Windows 8, it was very much a "go and figure it out" experience. Beyond the brief tutorial that appeared when starting for the first time, which I'm fairly sure no one ever paid any attention to, you were on your own when it came to learning the new UI. Windows 8.1 adds some nice big helper blocks that point out things like the hot corners and how you can use drag/swipe to perform certain actions. If you've been using Windows 8 already, these will probably be a bit annoying as they just tell you things you already know, but to new users they will go a long way in improving the transition to the Windows 8 Modern UI.
  • More freedom when using multiple applications. You can now pick the amount of space each app takes up on your screen and have more than 2 of them. This is really such a basic feature that it should have been there from the beginning; better late than never, I suppose.


I haven't played with it much yet, but I really like the new aggregated search view, which brings up results from lots of different sources in a single, very usable view.
The Windows Store was beyond poor in Windows 8. It was fine if you were looking for something specific, but it was absolutely useless for discovering apps. This has been remedied in Windows 8.1; I'm hoping this will allow existing developers to make more money from their apps and encourage more developers to get writing Windows 8 apps.

There are a load of other features added, such as 3D printing support, which isn't that big a deal currently, but I think Microsoft have made a sly move in baking in OS-level support for what is a growing technology.

The bad
The start button is back. Wait, didn't I just do this one? Yes I did, but in a prime example of "you can't please everyone all of the time", in the last year of using Windows 8 I've gotten used to having that extra slot on the taskbar and haven't remotely missed the venerable start button that used to occupy that number one slot. I completely agree with Microsoft's move of bringing it back, but they could've at least made it an option to not have it.

That's pretty much it for the bad; there really isn't much to moan about in this release. It genuinely seems like Microsoft have taken the time to listen to users and fix the problems that have really plagued them. Some would say they should have listened to users during the Beta and Preview periods, and they're probably right. Hopefully, Microsoft have been a little humbled by their grand UI plans not being embraced as they had hoped; the concessions made in this release certainly seem to suggest that is the case.

The missing
WEI is gone! I'd really love to know the argument behind getting rid of the Windows Experience Index. It was a great tool that allowed your average user to easily see where the bottleneck was in their system without having to understand the various benchmarks or install software on their computer to perform tests against those benchmarks. I genuinely don't understand why this has been removed, perhaps Microsoft will come out with an explanation in the coming months.

Experience on my Dell Mini 9
I have a little Dell Mini 9 which I decided to use as a tester for how Windows 8 ran on low-powered hardware. For reference, it has a 1.6GHz Atom, 1GB of RAM, and a 14GB hard disk. Windows 8 ran okay on this at first but suffered the inevitable slowdown as time wore on, to the point where it was pretty much unusable for anything serious. IE10 was not even worth starting. Also, I only had a couple of GB free on the 14GB hard disk.
After doing a fresh install of 8.1, I had 5.2GB of free space on the hard disk, which astonished me, and this little device suddenly seems a lot nippier than it did with a fresh Windows 8 install. IE11 is a lot faster than its predecessor and I now use the Mini for email and Campfire on a daily basis.
Time will tell if it keeps this speed boost or if it falls away with continued use, but you can see that some effort has gone into improving performance for this release.


So is it worthy of being a .1 release? Yes, I think it is. There are lots of small improvements here; if it only had half of them it would probably be SP1 in my eyes, but there are enough improvements and extra features to make this more than a Service Pack.