Run! The Distributed Systems are coming!

You can avoid the fate of those that fell before you

This is a letter to my fellow engineers, specifically those who operate in the front-end world. This letter is sent with love, but carries a message of deep foreboding. It is a warning and yet it is a message of hope, that there is a chance to maintain peace and happiness in your work life.

This letter is inspired by a question recently posed by the talented Andy Pimlett – with whom I had the pleasure of serving alongside the lovely people of Mando. See Andy Pimlett on LinkedIn: JavaScript Front-End Architecture. There Andy poses a question that has been keeping my mind in constant churn: ‘what does the line between the UI and the backend platform look like now?’. More important to me, however, is raising awareness of the quite frankly ridiculous levels of complexity that hide behind the innocent term ‘Microservices’, as I shall lay out here…

Lurking in your Jam

The devil doesn’t come dressed in a red cape and pointy horns. He comes as everything you’ve ever wished for.

Tucker Max – Apparently

Jamstack is exactly the sort of revolution the front-end world has desperately needed for too long. It promises to reduce the long-standing pain of complexity and bloat that has grown out of constant over-engineering around well-intentioned technologies such as npm, React and Angular.

I’ll not cover the ‘Client’ end definition of Jamstack as there are plenty of perfectly good resources for that already including the Jamstack website itself: https://jamstack.org/. To be absolutely clear on my position here, I have been blown away by the power of Jamstack. This really is the tech that the web needs now. You want to put your money into new web technology? I’m telling you it is Jamstack and its global community; Web3 can get in the bin (really, forget Web3, it’s another last-ditch Ponzi-scheme from the crypto bros to try and keep their awful blockchain investments on life-support for another year).

That said, the threat to your peace and happiness that I wish to warn you about lies, unbeknownst to many, innocently nestled away in a corner of a diagram featured in many of the Jamstack articles, videos and websites that talk about this otherwise wonderful new technology:

Buried in these tempting looking simple diagrams lies a terrible, terrible beast that wears a red coat and charges through your peaceful villages shooting anyone who fails to kneel before the might of the King.

(Let’s forget for a moment that all the diagram shows is that everything from the app server on down has been moved into a small, innocent, tidy-looking box alongside the CDN.)

A Microservice by any other name will be just as difficult

Scenario: You’re building an application… You are writing some logic that calls a method on a class in another area of the application.

Question: Would calling the method be much improved by sending it on a trip along a network?

The only sensible answer, surely, would be: “are you mad?”. And yet here we are, some 10 years on since the term ‘Microservice’ was first bandied about, with half the planet decomposing their applications into separate ‘Microservices’ and having them directly call each other synchronously across a network. This antipattern now even has its own name – ‘distributed monolith’ – and it is known to give you all the pain of a monolith plus the pain of a distributed system, with basically none of the benefits.

Why would anyone willingly triple the complexity of an application for no tangible benefit? Well… first of all… I guess… we’re social creatures: if our peers and those we look up to are doing something then we naturally take an interest. If everyone in your industry is talking about a thing then it becomes so well known that you may even be getting pressure from your boss or clients to do that thing.

Secondly… we’re techies, nerds, geeks, we like the challenging things. Hey, if Netflix do it and it gives them great results then the complexity must be worth it and it will feel great to wield complexity and win fortune for the business!

Finally, I think the name has a lot to do with it. How much harm can come from something called a ‘microservice’? Well, would you willingly embark on the path of a Service Oriented Architecture for your UI tier? Because that is what you are doing, whether you realise it or not. Microservices are a slant on the classic ‘Distributed System Architecture’. They have all, if not more, of the complexity of your classic Distributed System, but the name ‘microservice’ doesn’t really give you that feeling. Surely a microservice must be an easy service – easy to deploy, easy to build, easy to integrate? No. No. No. Read on…

Distributed Systems destroyed my technical career

If ever there was a clear way to demonstrate how complicated Distributed Systems really are, it lies in my career trajectory. During the first half of my career I always imagined I was destined to become a technical architect. That is until I encountered technical architecture in Distributed Systems design. Now I am a hands-off manager. I find that managing teams of people and managing projects with all of the pressures and complexities that come with those responsibilities is far easier and more rewarding than programming Distributed Systems. But of course some people love this stuff! I’m not saying it is bad or a terrible choice, just that I do not have the technical chops to survive a career in this area and I don’t consider myself to be very dim. I certainly wouldn’t want to wander into this quagmire by accident!

So, what, exactly, are you trying to protect me from?

I get it, by now you’re probably sick of hearing me describe how difficult I find Distributed Systems. So let me tell you what I know engineers must know to deliver effective Distributed Systems – or Microservices, or SOA, or Cloud Native Architecture, or MACH, or whatever you want to call it; they all share the same fundamental concerns.

  1. Loosely coupled architecture – You need it. It’s numero uno for any distributed system. Without it you have a distributed monolith, which isn’t a distributed system at all. Loosely coupled means embracing asynchrony, and with that comes eventual consistency, which your application must also embrace. Asynchrony is different from ‘async’ calls; your tools in this space include message queues, service buses and brokers – and, more specifically for cloud native, events and streams (a minimal sketch follows this list).
  2. SRE – Distributed Systems are so complex that the industry consensus is that you should assume your system is in a failure state and go from there. I’m not even kidding. See Site Reliability Engineering; you will likely need someone skilled in this on your engineering and operational teams. You need to make sure your system is consistent, available and durable. This really is a career path in its own right!
  3. Deployment Strategies – You need to be able to safely update your microservices, components and so on without disrupting service. See the Google DORA research in this area for the different strategies you will need to familiarise yourself with: https://cloud.google.com/architecture/application-deployment-and-testing-strategies
  4. Observability – Modern cloud native systems will often have hundreds of components interacting with each other to achieve common business goals. How do you know if the system is healthy or not? How do you trace the lifecycle of your customers’ data through your system for compliance with data protection standards? Enter Application Performance Monitoring and Distributed Tracing. See: https://opentracing.io/docs/overview/what-is-tracing/
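To make the first point concrete, here is a minimal sketch of the difference in shape: instead of one component calling another synchronously, it publishes an event for the other to consume in its own time. An in-process Channel stands in for a real broker or queue (RabbitMQ, Azure Service Bus, Kafka and friends), so treat this purely as an illustration, not production code.

using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public record OrderPlaced(Guid OrderId, decimal Total);

public static class LooselyCoupledSketch
{
    // Stand-in for a message broker; in a real system this would be a queue or topic.
    private static readonly Channel<OrderPlaced> Bus = Channel.CreateUnbounded<OrderPlaced>();

    // "Service A": publish the event and move on - no waiting on the consumer.
    public static async Task PlaceOrderAsync(decimal total) =>
        await Bus.Writer.WriteAsync(new OrderPlaced(Guid.NewGuid(), total));

    // "Service B": processes events in its own time; the rest of the system is
    // eventually consistent with whatever happens here.
    public static async Task RunBillingConsumerAsync()
    {
        await foreach (var evt in Bus.Reader.ReadAllAsync())
            Console.WriteLine($"Billing order {evt.OrderId} for {evt.Total:C}");
    }
}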

Hey, I would always default to a single application until such time as the technical demands necessitate decoupling. ‘Monolith’ is NOT a dirty word; monoliths have many benefits in terms of maintainability. A helpful rule here: if your components need to call each other directly to achieve their goals then keep them together in a monolith and save yourself a world of pain. Most UIs that call down into a distributed architecture will probably be fine with a single UI aggregation tier or, at most, BFFs (see Sam Newman: https://samnewman.io/patterns/architectural/bff/). Introducing loosely coupled architectures into your UI strikes me as a one-way trip to pain.
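For illustration, a BFF really can be this small: one endpoint shaped for one UI, aggregating whatever downstream calls that UI needs. This is a hedged sketch only – the routes, named client and downstream types are invented for the example, not taken from any real system.

using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("bff/dashboard")]
public class DashboardBffController : ControllerBase
{
    private readonly HttpClient _platform;

    public DashboardBffController(IHttpClientFactory factory) =>
        _platform = factory.CreateClient("platform");

    [HttpGet("{customerId}")]
    public async Task<IActionResult> Get(string customerId)
    {
        // Two downstream calls, one response shaped for this particular UI.
        var profile = await _platform.GetFromJsonAsync<CustomerProfile>($"/customers/{customerId}");
        var orders = await _platform.GetFromJsonAsync<OrderSummary[]>($"/orders?customerId={customerId}");
        return Ok(new { profile, orders });
    }
}

public record CustomerProfile(string Id, string Name);
public record OrderSummary(string OrderId, decimal Total);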

Your options have been laid before you…

It is too late for many of us in backend/platform engineering but I am worried for those of you responsible for client tier architecture. The concept of ‘Microservices’ is a creeping menace and I hope that these lessons serve to protect your sanity and stress levels.

So don’t feel pressured into a Distributed Architecture just because it’s a name you’ve heard (Microservices, MACH) or part of a wider architecture that you already get some benefits from (i.e. Jamstack). Just be aware that those little boxes on those diagrams belie decades of computer science that will likely never become as simple as the tempting diagrams want you to believe.

Understand whether you truly need microservices and avoid them if at all possible! But! Just because I struggle to imagine many scenarios where a UI tier would warrant a distributed architecture does not mean you don’t need it. Netflix and Amazon need it, but so do some smaller businesses. Hell, my current employer has carefully considered its technical strategy, weighed the pros and cons and decided that the increase in complexity of a distributed system is worth it for its commercial strategy. Know that it is not a technical decision, it is a business decision and often closely tied to your business architecture.

If you wish to stand and fight then be prepared to wield your weapons of Loosely Coupled Architecture, SRE, Deployment Strategy and Observability. But be sure, no one will blame you should you choose to consolidate your efforts in something far simpler; something that protects your daily life, bringing joy to you and your customers.


Improvement Sprints

or Shuhari Sprints

Nothing worth doing is easy and anything worth doing is worth doing well 

… or so the sayings go. My personal experience has certainly proved to me that these are more than simple soundbites, and I am fairly confident they ring true for most IT professionals. So why do we so often see innovation and professional development limited to the occasional day or Friday afternoon? Well, read on…

What’s in a name? 

Let’s start with some background to the subject. In his ground-breaking book ‘Drive’ (Amazon Link), Dan Pink recently popularised the idea that there are three key pillars to employee motivation, namely autonomy, mastery and purpose. This follows on closely from the RAMP model defined in Self-determination Theory (Deci and Ryan 1985). If you’ve not read either book then you should! I assume it is from here that we get the name I have been using to describe this subject: ‘Mastery Sprints’. For what it’s worth, if I recall correctly I first heard this term when speaking to some of the engineers from ING (of the famous Agile transformation story: https://www.mckinsey.com/industries/financial-services/our-insights/ings-agile-transformation).

Having been using ‘Mastery Sprint’ for some time now, I have been considering the implications of the term ‘mastery’, as I have grown increasingly aware of its connotations with white supremacy and slavery. For some reason I had been incorrectly thinking that mastery was of a different root and meaning than ‘master’ – famously now excluded from most areas of technology terminology. I will be avoiding the term mastery from now on: its roots are beside the point – the term carries severely negative connotations, and that is enough. I’m instead opting for ‘Improvement Sprints’ as that does just as effective a job of revealing the intention of the activity.

I would love to hear of your thoughts on the naming of this. It’s a conversation worth having! 

Update 1st February, 2021

I have already had some feedback on the naming with the term Shuhari Sprints suggested by one of my GBG colleagues:

One of the issues I faced when dropping ‘Mastery’ from my vocabulary is that there didn’t seem to be an equivalent term in English for the specific use case. Perhaps unsurprisingly there is a perfect term in Japanese and I will be proposing the use of Shuhari Sprints from here on. I dare say that this term is even more suitable than ‘Mastery’ for describing what we are trying to achieve and seems to have many further related uses. Google Shuhari, I’m sure you’ll agree!

Nothing worth doing is easy 

When I first heard the idea my initial thought was along the lines of ‘that’s nice and all but I can’t see it being that much benefit to justify the disruption’. Then I started to hear more about the idea from people I trust and my curiosity grew. From there I found myself frequently identifying common organisational issues that would be addressed by Improvement Sprints, so I started making a list from conversations discussing related issues:

“We allow engineers 1/2 a day every week of training time, and this can be rolled up if needs be but they never seem to take it” 

“There’s always too much planned or reactive work to take training time” 

“I’ve tried using training time but it always takes too much effort to switch my environment and my focus then someone calls me and the whole thing goes out the window” 

“I’ve done this research but then I don’t have the time to present it” 

“I’ve done this research and the results are incredible but now there’s no time to implement it in our main codebase” 

“Damnit this part of the codebase is always such a pain to work with we really need to find time to rewrite this part of it using this new tool that wasn’t available!” 

“These innovation days are great but we always need so much longer to really make a dent in this problem” 

Ask any developer about their experience of trying to be productive or effective in fitting training and innovation into the odd day or Friday afternoon and you’ll hear the same story of struggling to utilise the time effectively.

Dedicated Improvement Sprints present an antidote to these problems. By taking an entire week or two and spreading the overheads across a number of people – ideally bringing independent squads and teams together for collaboration – you completely shift the overhead-to-productivity ratio as demonstrated in this very sciencey diagram: 

Diagram showing ratio of overheads in an activity

If an organisation allows or even encourages the use of ‘company time’ and budgets for training, innovation and improvements then it stands to reason that they would wish to see the most effective use of that time. 

Benefits 

The benefits beyond the drastic increase in efficient use of training/improvements time are many. 

Engagement 

I fully subscribe to the idea that the most effective teams are made of individuals whose personal and professional objectives align closely with those of the business at a given time. It’s classic synergy. And if GeePaw Hill is on the same side then it’s fact, in my opinion:

This is reinforced by Dan Pink’s work in Drive, which makes clear how much more effective and engaged people are when they are progressing meaningfully in their careers; businesses that can leverage this by providing real opportunities for ‘mastery’ (sic, Dan Pink) will benefit from more engaged and productive team members.  

Improvements 

As the name suggests, Improvement Sprints are how you make things better. You will be aware of the many issues your development teams complain about on a regular basis, and this is a chance to address those using the latest and greatest.

Innovation 

Speaking to ING’s engineers, I learned that as many as 1 in 5 developer suggestions make it into production. This isn’t just playing around – well, it is really – but it has been proven to result in actual usable value for your business.

A scientifically proven competitive advantage 

The DORA DevOps research program (https://www.devops-research.com/research.html) has highlighted a number of key cultural behaviours of elite performing technology organisations that are clear differentiators for their competitive advantage. These include encouraging learning, experimentation, team collaboration and job satisfaction – all of which can be improved by leveraging dedicated improvement time.

Step 1: Plan, Step 2: Collaborate, Step 3: Profit! 

In terms of how you schedule Improvement Sprints, I would suggest starting by taking the time already agreed for training and rolling it all up. So if you have, say, ½ a day each week per engineer, roll this up into a whole week, giving you 1 week in 10, or 1 week after every 5 two-week sprints – you get the idea, use the maths that works for you. You’ll be starting with a net benefit. Then the only thing you need to sell to the business is the planned disruption, as the costs should level out and, as discussed, the benefits make it a solid sell.

You have two options for organising your teams here. If you are fortunate enough to have no reactive or operational demands on your engineers then just book the time in and tell the business you are on a training break and unreachable. If you must maintain operational support availability then look at splitting your team in half and running the Improvement Sprints in sequence, one half at a time. Just make sure to mix up the split next time around.

Then plan the work. Team members should be allowed to use this time how they see fit (autonomy), but one way of helping guide their efforts would be to have a brainstorming session on current burning issues or popular ideas and take a vote on which to include in the Improvement Sprint.

Measure and report the outcomes. It may be that some work will end up being fed into further production sprints but for the most part getting a team together to focus on a common goal should yield plenty of great outputs so be sure to shout about them! 

Some final notes

Overrun still happens 

Sure, you can do a lot more in a week or two with a whole team than a few people can in one day, but sometimes things worth doing are even harder than expected.

Technical Debt need not apply 

Technical debt is a project decision and every time it is taken on it must be accompanied by a paydown plan and included in formal project planning. This is nothing to do with innovation or improvement and should be kept out of these discussions. 

Thank you and good luck!

I am poised to embark on this journey myself and the above captures where I am up to with my thinking and planning. I’ve found scant details on this on the Internet so would love to hear of your stories and experiences. I will report back here as my story unfolds.


Concepts of Compliant Data Encryption

Introduction

This is a somewhat lengthy article intended to help anyone taking their first steps into learning about encrypting sensitive data in a compliant environment, such as one that must meet PCI DSS requirements. The hope is that this is an effective stepping stone into the dry, dry world of encryption standards and compliance.

Background

As part of some recent work on a proposal for a PCI DSS compliant solution I found myself having to become intimately acquainted with the concepts and standards for protecting data. My initial foray into this world was met with what felt like an impenetrable wall of esoteric information. I had a few terms to get me started on my research. I knew that the solution had been designed to integrate with a ‘Key Management System’ and to use tiered encryption keys known as a ‘Data Encryption Key’ – for encrypting data – and a ‘Key Encryption Key’ – for encrypting the Data Encryption Key – and that these are used in symmetric ciphers such as AES-256. Now I’m usually pretty good at research (ahem! Googling *cough*) but I struggled to find any clear, easily digestible information on how these concepts all hung together. Wikipedia wasn’t much help and there was a lot of ambiguity between the various articles provided by vendors that only served to hinder a fledgling student of this subject. At any rate, I ploughed on and after a few late nights of reading through extensive, lengthy, dry product briefs and standards documents I managed to wrap my head around the problem space. This whole experience drove me to promise myself that I would record this knowledge in a simple form for posterity. So here we go…

What Problem Domain?

Alright, so where to begin?! Let’s start with the basics… The first problem I encountered here was in trying to understand what *this* is even called! Surely once I knew what the problem domain is commonly called then research would be so much easier. If only! Starting with the good ol’ Wikipedia material on ‘Key Management‘ didn’t turn up anything particularly useful. I knew we were looking at using an external Key Management Service (KMS) such as AWS KMS, so looking at the documentation there I found this problem space referred to as ‘Envelope Encryption‘. Interestingly this terminology is also used by Google Cloud. Oddly however, more ‘classic’ non-vendor sources such as Wikipedia don’t have any reference to this as established terminology; is ‘Envelope Encryption’ a vendor-specific term? It wouldn’t surprise me if it was, especially given the confusion it raises with PKCS envelopes in the PKI space. Searching for Envelope Encryption does however turn up a Wikipedia article on ‘Key Encapsulation‘, which refers us back to the concepts of asymmetric PKI – GAH! 😩. Even worse than that, some OWASP info I found on the subject referred to this as ‘Tiered Encryption’. Makes sense, but nowhere else seems to use that term. Finally, further digging in Wikipedia turned up ‘Key Wrap‘ as a concept that seems to describe the problem quite well, even referring to the NIST standard SP 800-38F (AES Key Wrap) covering ‘Key Wrapping’ and the use of ‘Key Encryption Keys’. Turns out this also aligns with PCI, ISO and IETF. Phew!

So, we’re dealing with Key Wrapping. Good, let’s go.

Gimme the freaking concepts already!

Symmetric Encryption

I’ll set the scene with the most fundamental tool we need to use: symmetric encryption. Protecting data at rest is typically achieved using ‘symmetric encryption‘, i.e. one single secret key for encryption and the same key for decryption. It is more than likely that we’re talking about the NIST approved AES (Rijndael) block cipher to perform the cryptographic operations on our sensitive data. For my fellow Microsoft stack developers you’ll probably be using one of the following APIs:

Native

  • CryptoAPI – Also known as CAPI, now obsolete in favour of CNG
  • CryptoAPI Next Gen – Also known as CNG, available since Windows Vista. The AES API here is accessed via BCryptEncrypt with the GCM chaining mode set

Managed/.NET

I hope to cover off the differences in the Microsoft Cryptographic APIs in a future post. For now if you are not sure what to use then read up on the various sources above but you’ll probably want to just stick with CNG in your preferred programming model and you should be fine.
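As a point of reference, here is a minimal sketch of encrypting a value with the managed API (System.Security.Cryptography.AesGcm, available from .NET Core 3.0 onwards). It deliberately ignores where the key comes from – that is the key management problem we are about to get into.

using System;
using System.Security.Cryptography;
using System.Text;

public static class SymmetricEncryptionSketch
{
    // Encrypt a sensitive value with AES-GCM under a given data encryption key (DEK).
    public static (byte[] Nonce, byte[] Ciphertext, byte[] Tag) Encrypt(byte[] dek, string plaintext)
    {
        var nonce = new byte[12];                 // 96-bit nonce; never reuse with the same key
        RandomNumberGenerator.Fill(nonce);

        var plainBytes = Encoding.UTF8.GetBytes(plaintext);
        var ciphertext = new byte[plainBytes.Length];
        var tag = new byte[16];                   // 128-bit authentication tag

        using var aes = new AesGcm(dek);
        aes.Encrypt(nonce, plainBytes, ciphertext, tag);

        return (nonce, ciphertext, tag);          // store all three alongside the record
    }
}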

Key Management

Using most crypto APIs is a fairly well documented and relatively simple process so we’ll assume you’re not doing anything too crazy and get straight onto key management.

The saying goes that encryption is easy and key management is very, very hard. As I’m sure you are aware, if we only have one secret key for encrypting and decrypting our data then we’d better make jolly well certain that we’re handling that key carefully.

The Wrap

The problem at the root of Key Wrapping is how an information system should store its sensitive data at rest (i.e. on disk, in a filesystem or in a database, etc.) while ensuring Confidentiality, Integrity and Availability (the CIA triad). So this is different from other common problem domains of encryption such as transmission and identity (PKI, PKCS, signing, etc.) and as such different concepts apply here.

The ‘wrapping’ part refers to the fact that we want to use two types of keys to protect our data. Specifically when talking about symmetric data encryption we’ll want a data encryption key to protect the data and we’ll also want a key encryption key to protect the data encryption key. For this document I’ll use the terminology DEK (data encryption key) and KEK (key encryption key) as per the terminology accepted by NIST.

KEK, MEK, DEK? What the feck?

It’s worth treading carefully in this space and ensuring that wires are not getting crossed when talking about the different keys. For instance, Microsoft frequently uses the DEK terminology to refer to the data encryption key while at the same time using the term Master Key in its DPAPI and SQL TDE models to refer to the KEK, and AWS KMS uses the term Customer Master Key for the KEK. Where this gets confusing is going back to standards such as NIST that use the term Master Key for something quite different, so it is worth always being aware of your frame of reference when researching in this space. Notably, Google’s Cloud KMS also uses the NIST-style DEK/KEK terminology.

Why bother wrapping?

We need a DEK to encrypt our data, that is inescapable. Furthermore, application design best practices dictate that it is worth keeping the DEK close to our data so that a) we can encrypt and decrypt our data without sending the sensitive data outside of our sovereignty (ideally without sending it beyond our application scope), and b) we can encrypt and decrypt our data without external dependencies and without the cost of network overheads (resilience, performance). But if we simply keep the unprotected DEK next to the data it protects then anyone who gets the data will be able to decrypt it.

This is where key wrapping comes in. By encrypting the DEK at rest we can keep it close to the data it protects while keeping it secure, and so we use a KEK to protect the DEK. To ensure we don’t then have the same issue with an unprotected KEK, we turn to a tamper-proof, standards-compliant key management tool such as a Hardware Security Module or a Key Management Service such as AWS KMS.

Your application should never see the KEK and so all of that key management and all of the complexity that comes with it is outsourced to standards compliant (PCI, FIPS, ISO) suppliers. Instead, our application requests a DEK from the KMS or HSM, which returns the DEK in both encrypted and unencrypted form. We store the encrypted form and use the unencrypted form in a transient process (I’ll cover in-memory DEK protection in a future post), disposing of it when we’re done encrypting. We then call the KMS to decrypt the DEK again at a later time when we need to decrypt the data. In short, key wrapping enables us to decouple key management responsibilities from our application’s data encryption requirements.
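As a rough sketch of that flow – assuming the AWS SDK for .NET (AWSSDK.KeyManagementService) and a KMS key playing the role of the KEK – the wrap/unwrap round trip looks something like this:

using System;
using System.IO;
using System.Threading.Tasks;
using Amazon.KeyManagementService;
using Amazon.KeyManagementService.Model;

public class DekProvider
{
    private readonly IAmazonKeyManagementService _kms = new AmazonKeyManagementServiceClient();

    // Ask KMS for a fresh DEK: it comes back both in plaintext (use, then discard)
    // and encrypted under the KEK (safe to store next to the data).
    public async Task<(byte[] PlaintextDek, byte[] WrappedDek)> CreateDekAsync(string kekId)
    {
        var response = await _kms.GenerateDataKeyAsync(new GenerateDataKeyRequest
        {
            KeyId = kekId,                 // ARN or alias of the KMS key (the KEK); it never leaves KMS
            KeySpec = DataKeySpec.AES_256
        });
        return (response.Plaintext.ToArray(), response.CiphertextBlob.ToArray());
    }

    // Later: send the stored wrapped DEK back to KMS to recover the plaintext DEK for decryption.
    public async Task<byte[]> UnwrapDekAsync(byte[] wrappedDek)
    {
        var response = await _kms.DecryptAsync(new DecryptRequest
        {
            CiphertextBlob = new MemoryStream(wrappedDek)
        });
        return response.Plaintext.ToArray();
    }
}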

For further reading on this I’ll point you to the documentation for AWS KMS as this explains the concepts perfectly clearly. And don’t forget, AWS uses the term Customer Master Key – or CMK – to refer to the KEK!

Key Rotation

The final concept that your solution will need to consider is key rotation. ‘Key Rotation’ refers to the process of continually changing your encryption keys. This is a process that should be factored into the design of your solution and for the most part this should be completely automated and securely out of reach of human eyes. There should however also be provisions for manual intervention in response to security incidents.

Cryptographic Periods

Before we complete the discussion on key rotation we must first cover the inescapably esoteric concept of Cryptographic Periods (or Cryptoperiods). A cryptoperiod is the amount of time that an encryption key should ‘live’. It is not enough to have an encryption key and keep it safe. A key won’t last forever. At some point it will become too weak or compromised to serve its purpose. This could be due to anything from the risk of someone discovering the key to the fact that computers will eventually become powerful enough to break the key’s protection. Cryptoperiods are there to manage the risk of compromised encryption. There are a number of key points to be aware of when dealing with cryptoperiods.

First of all, the timespan is usually calculated not in days or hours but in terms of cryptographic operations. So if you want to know how long a key should live in terms of elapsed time, you should calculate how many encryptions it can be used for (i.e. how many rows of data the key can be used to encrypt) and extrapolate from there.

The calculation for a cryptoperiod must account for a number of factors including the key type, the sensitivity of the data, the amount of time that the data originator requires access to the data, the amount of time that the data recipient requires access to the data as well as environmental factors from the operating environment (how secure is the server, operating system, application?) right up to staff turnover. As a rough guide, for a symmetric data encryption key protecting hundreds of records you could theoretically keep the data encryption key for as long as 3 years. At higher volumes of data you could be getting down to weeks.

I wish I could give even just an example calculation here but as far as I can tell this is an intentionally arbitrary concept used by standards such as PCI, FIPS and NIST to force a thought process and internal discussions. There are rough guidelines – such as the aforementioned weeks-to-years for data encryption keys – and as long as you adhere to these and show your working then you should be OK.

On avoiding re-encryption, I have come across a number of instances where it has been suggested that you may be able to negate the need to re-encrypt historical data with new DEKs by reducing the amount of data covered by a DEK to as low as 1:1. In theory this does make sense, but having discussed it with a PCI QSA it is a non-starter if PCI DSS compliance is your goal. You either re-encrypt your data every 5 years as an absolute maximum – preferably within 1-3 years – or you delete it within that time.

One thing is absolutely clear, however: at the end of a cryptoperiod the key should be securely destroyed, and any data protected with that key should either be re-encrypted with a new key or itself be securely destroyed – the latter being preferable if at all possible (Datensparsamkeit).

Finally, as an FYI, there is some mention of the concept on Wikipedia but it is not very helpful. If you want in-depth detail on the subject then you are best turning to NIST and the indispensable 800-57 publication. That is a very dry and prolonged read but necessary in this matter; it is even directly referenced by PCI DSS 3.2.

And so we return to key rotation…

Key Rotation reprise

Once you know how long you are going to keep your keys you can implement your key rotation policies. Generally speaking these policies will be different for your DEK and your KEK. Your KEK may only require rotation every year, while you will likely require a new DEK every ‘X’ number of encryptions performed (as per your cryptoperiod calculation), with any long-term records requiring re-encryption with a new DEK every few weeks to years. For both the KEK and the DEK the process is similar: you first create a new key, use that new key to encrypt your protected data, then dispose of the old key. Where the processes differ, of course, is how and when they are triggered. For your DEK you will likely have to count the number of encryptions it is involved in and renew it when it exceeds a threshold, while also scanning for historical records that are in need of re-encryption. Your KEK, on the other hand, will/should be held in an HSM or KMS service, which may or may not automatically cycle it. It may be that you need to count your DEKs and request a new KEK on a threshold, or you may need to handle an event message from the HSM/KMS that notifies you when a KEK is being cycled and then update your stored (encrypted) DEK material.

One useful pattern to aid your future self is to store metadata about the data encryption context alongside your DEKs. Every row of data encrypted by a DEK will of course need to have a reference to that DEK so that your application knows which DEK to use for decryption. Over time the size and type of the DEK used by your application will likely change to accommodate enhancements in encryption APIs, and along with this you would expect the ciphers used to change as computing power grows. Consider what will happen if you keep your protected data for long periods of time. The longer you keep your data, the more likely you will be to have to update ciphers, such as moving from AES-128 to AES-256 or to a new algorithm altogether. To help deal with this, your application will benefit from having a record of exactly how each piece of data was encrypted. This can be stored alongside your DEK material as metadata and used by the application to make decisions about how to use the encrypted data and when to update it.
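Purely as an illustration of the sort of metadata worth keeping, something like the following record could sit alongside each wrapped DEK (the field names are mine, not taken from any standard):

using System;

public record DekRecord(
    Guid DekId,                     // referenced by every data row encrypted under this DEK
    byte[] WrappedDek,              // the DEK, encrypted under the current KEK
    string KekId,                   // which KEK/CMK wrapped it
    string Algorithm,               // e.g. "AES-256-GCM" - how the data was encrypted
    DateTimeOffset CreatedUtc,
    long EncryptionCount,           // compared against the cryptoperiod threshold
    DateTimeOffset RetireAfterUtc); // when this DEK must be rotated regardless of usage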

Crypto means ‘Cryptography’

Just needed to take this chance to get this point in: ‘Crypto’ means ‘Cryptography’. Anyone who tries to tell you otherwise is a shyster (I think they call them influencers these days) and they’re trying to sell you something I promise you don’t need or want.


Babeling in defence of JavaScript

And so it goes, the eternal question “What is wrong with JavaScript?” and the inevitable, inescapably droll, reply:

Oh, ho ho ha ha haaaaaaaaaaah… The gag never gets less funny. I need to be clear that Scott Hanselman is one of my favourite people in the public eye. I hold him to be an industry treasure and I’m fully aware that he is just poking fun here, but we’ve all seen this dialogue before and we all know it is not always so lighthearted.

At the end of the day, these scenarios showing how ‘broken’ JavaScript is are almost always bizarrely contrived examples that can be easily solved with the immortal words of the great Tommy Cooper:

Patient: “Doctor, it hurts when I do this”
Doctor: “Well, don’t do it”

Powerful Facts

Let’s be absolutely clear that JavaScript is an incredibly powerful language. It is the ubiquitous web programming language. Of course it currently has a monopoly that ensures this status. That does not change the fact that JavaScript runs on the fastest, most powerful and most secure websites. So clearly it does exactly what is needed when in the right hands.

JavaScript is free with a very low barrier to entry – all you need is a web browser.

JavaScript, through its Node.js guise, powers Netflix, LinkedIn, NASA, PayPal… the list goes on and on.

Furthermore it is easy enough to learn and use that it is a firm favourite for beginners learning programming. It is in this last point that we observe some particularly harmful industry attitudes towards JavaScript.

What’s The Damage?

So now that we can all agree that Tommy Cooper has fixed JavaScript from the grave, and now that we’re clear about just how seriously capable JavaScript is as a language, we can get onto the central point: industry attitudes to JavaScript are damaging. While many languages such as SQL and PHP are common targets of derision – and each case has its own unique characteristics and nuances – there is something notably insidious about the way JavaScript is targeted.

One of the more painful examples of JavaScript’s negative press can be observed in the regular reports from those learning programming that they feel mocked for learning JavaScript. This is, quite frankly, appalling. We work in an industry that is suffering from a massive global undersupply of talent and we’re making potential recruits feel like crap. Well done team! Even globally established personalities such as Miguel de Icaza of Xamarin fame can’t help but fan these flames. What chance do new recruits have?

The JavaScript Apocalypse?

Moving on to the issue that prompted me to start writing this article: WebAssembly is here. It has a great website explaining all about it: webassembly.org. It even has a logo! It also has a bunch of shiny new features that promise to improve the experience of end users browsing the web.

WebAssembly logo
Of course WebAssembly has a logo!

From distribution, threading and performance improvements to a new common language with expanded data types, WebAssembly offers a bunch of improvements to the web development toolkit. I’m all for these changes! JavaScript and the web programming environment are far from perfect and these are another great step in the right direction.

Of course WebAssembly’s common language also promises to open up the web client for other programming languages. “Hurrah!” I hear many cheer. I’m seeing countless messages of support for the death of JavaScript at the hands of the obviously infinitely superior quality languages of C#, Rust and Java 🙄 Yeah… I’m not so sure…

Nah!

Like most programming languages, JavaScript is a product of its environment: namely, the web browser. It did have competition in the early days with VBScript back in IE4/5… I think… It was a long time ago. But otherwise it has developed on its own in response to demand from the web developer community and in response to the changing web landscape. The modern incarnations of JavaScript (ECMAScript 6/7/8) are incredibly powerful, including modern language features such as an async programming model, functional capabilities and so on. In many ways modern JavaScript resembles the languages to which it is so frequently compared, but it also lacks many language features that are less relevant to web client programming, such as generics and C#’s LINQ. Its loose typing system makes it well suited to working with the HTML DOM. Overall it would appear, as you might expect, that JavaScript is made for web client programming and is in fact the best choice for this task.

Even the WebAssembly project agrees, confirming on the project website that JavaScript will continue to be the ‘special’ focus of attention. And you know what? This is a good thing!

Babel

Look, we already have other languages that compile for the web client but I don’t see any existential threat from the (albeit beautiful) CoffeeScript or from the (misguided) TypeScript. Sure, WebAssembly will make this more effective, but the reasons that TypeScript hasn’t already taken over the web development world will still apply to C# and WebAssembly. We have seen a similar battle play out in the database world where NoSQL was lauded as the slayer of the decrepit 1970s technology we all know as SQL. That was until NoSQL databases started to implement SQL. Turns out that SQL is hard to beat when it comes to querying data, which is unsurprising when you consider its 50-odd years of evolution in that environment, and the same rule will apply to any JavaScript challengers. Personally, I suspect a large part of why JavaScript’s alternatives have failed to take hold is that web client programming doesn’t need the added static typing, etc.; in my experience all these challengers do is introduce compiler warnings and complexity that waste time. Ultimately I don’t have all the answers here but it is fair to say that it would take a serious effort to out-web the language that has evolved for the web environment.

The Tower of Babel (from Wikipedia)

Where my real concern lies is in the well known problems that are brought about by having too much choice when it comes to communicating. We use human readable programming languages so that we can communicate our programs to each other. With that in mind it is clearly more effective in the long run if we all learn to talk the same language. The story of The Tower of Babel shows us that for a long time we have considered too much choice to be a very bad thing when it comes to communication.

It would be a frustrating situation indeed if we were to end up having to consider and manage the overhead of multiple languages for a single task all because of some daft attitudes towards JavaScript. Furthermore, businesses that are already struggling to find web developers shouldn’t now also have to worry about whether those developers are Rust, Java or C# web developers. JavaScript is the right tool for the job, so let’s stop wasting time with all the JavaScript bashing and get on board with an incredibly powerful language we can all understand!


A functional solution to interfacitis?

/ˈɪntəfeɪsʌɪtəs/
noun
noun: interfacitis
inflammation of a software codebase, most commonly from overuse of interfaces and other abstractions but also from… well… actually it’s mostly just interfaces.

An illness of tedium

Over the years my experience has come to show me that unnecessary abstractions cause some of the most significant overheads and inertia in software projects. Specifically, I want to talk about one of the more tedious and time consuming areas of maintaining abstracted code; that which lies in the overzealous use of interfaces (C#/Java).

Neither C# nor Java is a particularly terse language. When compared to F# with its Hindley-Milner type inference, working in these high-level OO languages often feels like filling out forms in triplicate. All too often I have experienced the already verbose syntax of these languages amplified by dozens of lengthy interfaces, each only there to repeat the exact signature of its singular implementation. I’m sure you’ve all been there. In my experience this is one of the more painful areas of maintenance, causing slowdowns, distraction and lack of focus. And I’ve been thinking for some time now that we’d probably be better off using interfaces (or even thin abstract classes) only when absolutely necessary.

What is necessary?

I like to apply a simple yardstick here: if you have a piece of application functionality that necessitates the ability to call multiple different implementations of a component then you probably require an interface. This means situations such as plugins or provider-based architectures would use an interface (of course!) but your CustomerRegistrationService that is called only by your CustomerRegistrationController will not. The message is simple: don’t start introducing unnecessary bureaucracy for the sake of it.

There are, I admit, cases where you might feel abstraction is required. What about a component that calls out to a third party system on the network? Surely you want to be able to isolate this behind an interface? And so I put it to you: why do you need an interface here? Why not use a function? After all, C# is now very well equipped with numerous, elegant functional features and many popular DI frameworks support delegate injection. Furthermore, if you are following the SOLID practice of interface segregation then chances are your interface will contain only one or two method definitions anyway.

An example

So, for those times when you absolutely must abstract a single implementation, here is a simple example of an MVC controller using ‘functional’ IoC:

public class RegistrationController : Controller
{
    // The query is injected as a bare delegate rather than a single-method interface.
    private readonly Func<string, RegistrationDetails> _registrationDetailsQuery;

    public RegistrationController(Func<string, RegistrationDetails> registrationDetailsQuery)
    {
        _registrationDetailsQuery = registrationDetailsQuery;
    }

    public ActionResult Index()
    {
        // Invoke the injected function exactly as you would have called the interface method.
        var currentRegistration = _registrationDetailsQuery(User.Identity.Name);

        var viewModel = ViewModelMapper.Instance
            .Map<RegistrationDetails, RegistrationDetailsViewModel>(currentRegistration);

        return View(viewModel);
    }
}
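For completeness, here is a hedged sketch of how that delegate might be registered, in this case with Microsoft.Extensions.DependencyInjection (other containers such as Autofac offer similar delegate support). RegistrationRepository is a hypothetical stand-in for whatever actually runs the query.

using System;
using Microsoft.Extensions.DependencyInjection;

public static class RegistrationModule
{
    public static IServiceCollection AddRegistrationQueries(this IServiceCollection services)
    {
        services.AddScoped<RegistrationRepository>();   // hypothetical data-access class

        // Register the bare function rather than an IRegistrationDetailsQuery interface.
        services.AddScoped<Func<string, RegistrationDetails>>(provider =>
        {
            var repository = provider.GetRequiredService<RegistrationRepository>();
            return userName => repository.GetRegistrationDetails(userName);
        });

        return services;
    }
}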


Addendum

13-March-2018: It has been pointed out to me that a further benefit of this approach is that static providers may also supply IoC dependencies whereas instances are required for interface-based IoC. What are your thoughts on this approach?


Demystifying AI – The AI explosion

This is an article I had originally written as part of a stream of work that has now been put on hold indefinitely. I thought it a shame for it to languish in OneNote.

What’s with all this attention to Artificial Intelligence then?

Well that is a very good question. To be perfectly frank, not that much has changed of late in the world of Artificial Intelligence (AI) as a whole that should justify all the current excitement. That’s not to say that there isn’t cool stuff going on; there really is great progress being made… in the world of Machine Learning. And if we are to begin the process of ‘Demystifying AI’ then this is a very good place to start.


AI is a very broad area of technology encompassing research from robotics to computing emotions (affective computing) and everything in between, including Machine Learning or ‘ML’. As alluded to just a moment ago it is within ML specifically that we are seeing the greatest progress. Think of a modern ‘AI’ technology that is gaining a lot of attention and you can place a safe bet on it specifically using ML techniques: Natural Language Processing? That’s ML. Image classification? That’s ML. Sentiment Analysis? Also ML. The recent news of Go players being defeated by a computer? You guessed it… ML.


What is ML?

ML is an approach to analysing data that is based on training statistical models to predict outcomes. You may well have come across Statistical (or Linear) Regression back in your school days; well this is possibly the best known example from a range of techniques that make up the world of ML. To put it simply, an ML model learns from past data to make better decisions in the future.
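If you fancy seeing ‘learning from past data’ in its most basic form, here is a toy least-squares regression: fit a line to past observations, then use it to predict a new one. The numbers are invented, and real ML frameworks do something conceptually similar at vastly greater scale.

using System;
using System.Linq;

public static class TinyRegression
{
    // Fit y = slope * x + intercept by ordinary least squares.
    public static (double Slope, double Intercept) Fit(double[] x, double[] y)
    {
        double meanX = x.Average(), meanY = y.Average();
        double slope = x.Zip(y, (xi, yi) => (xi - meanX) * (yi - meanY)).Sum()
                     / x.Sum(xi => (xi - meanX) * (xi - meanX));
        return (slope, meanY - slope * meanX);
    }

    public static void Main()
    {
        // "Training data": past observations of advertising spend vs sales.
        double[] spend = { 1, 2, 3, 4, 5 };
        double[] sales = { 2.1, 3.9, 6.2, 8.1, 9.8 };

        var (slope, intercept) = Fit(spend, sales);
        Console.WriteLine($"Predicted sales for a spend of 6: {slope * 6 + intercept:F1}");
    }
}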

Diagram: Deep Learning sits within Machine Learning, which in turn sits within AI

Now it’s time to introduce what is arguably the beating heart of the AI frenzy: Deep Learning. While there are no trendy acronyms for Deep Learning it is fair to say that Deep Learning has become a bit of a buzz-word itself. Deep Learning takes its name from the concept of Deep Neural Networks (or DNNs, there’s your acronym!). The useful details of what DNNs are and how they function cannot be easily summarised; suffice it to say that DNNs are an ML technology that borrows heavily from the structure of the brain, hence the ‘neural’ part of the name. [N.B. These details are already planned for a follow-up piece.] To recap: Deep Learning is a subset of ML, which is in turn a subset of AI, and it is Deep Learning that drives the current hype.


What is Deep Learning?

Say you wanted to build some software to identify objects in an image; your usual non-Deep-Learning approach to this would include manually writing rules into the software to recognise the details you’re looking for. If you wanted to identify whether a picture was of a bird or a cat, you would manually write rules to identify features such as whiskers or ears or wings and so on. This is complicated, time-consuming and error prone. Deep Learning takes a different approach. Instead you would create a Deep Learning model then supply it with a bunch of pictures. For each picture you supply, you would tell the model whether it was of a bird or a cat. As you supply each image and its corresponding label, the model learns. Once enough data has been supplied you can then supply an image without a label and the model will give an accurate indication of whether it is a bird or a cat.


So what is the ‘explosion’ all about?

Continuing the bird/cat model example, the more example labelled pictures you supply to the model, the better the results will be. This seems simple and even somewhat obvious but it strikes at the heart of the current ‘AI boom’. Deep Learning has been around for a while now, evolving over a period of 30 years more or less, and one of the key reasons that it has never been so commercially successful as now is that there just hasn’t been enough readily available data to make it so accessible. To give you some idea of why this has been an issue, if you want to get to a high level of accuracy for classifying complex pictures then you’re going to need thousands or even millions of examples depending on the complexity. Well, we now have data, lots and lots and lots of it, and it has never been easier to get our hands on it. Do a quick Google image search for ‘Cat’; there is a rough cut of half your ‘training’ set (*ahem* copyright issues aside) and I’m sure you can figure out how to get the other half.


So we have data, but that isn’t all we need. The other side of the current explosion is raw computing power. Building a statistical model that can accurately identify cats and birds in pictures is very heavy work for a computer but thankfully with the advent of cloud-scale computing resources, available computing power is now big enough and cheap enough to make running this sort of model both practical and cost-effective. It’s cheap enough that Google can even give this stuff away as an educational toy (https://teachablemachine.withgoogle.com/).


So it’s all about pictures of cats and birds?

Beyond the abundance of data and computing power, probably the most significant factor in the commercial success of Deep Learning is its versatility. This is especially true when considering the success of Deep Learning against other ML techniques which have not gained the same level of attention. If you have enough data, regardless of its form, Deep Learning can be trained to extract knowledge from it. This has sent businesses, scientists and engineers into a global flurry of R&D to find all the amazing ways in which this technology can enhance our lives.

For years now the financial services industry has been at the forefront of applying ML techniques to everything from fraud prevention and risk management to investments and savings predictions; there are few – if any – areas of the industry that have yet to see the benefits of AI.

Manufacturing is seeing growing uptake in the application of ML to improve efficiency through waste reduction and better predictive analysis of production demands and infrastructure maintenance.

More recently utilities are beginning to get into the ML game with the UK National Grid striking up discussions with Google to investigate applying the infamous DeepMind AI to maximise National Grid’s use of renewables and to more efficiently balance supply and demand across its nationwide infrastructure.

Across all sectors, businesses now find themselves in a position to use ML to better understand and engage with their customers. From utilities gaining greater knowledge of their customers’ consumption habits through to retailers and service providers more effectively capturing sales conversion opportunities, the possibilities are as varied as your data.


Would you like some knowledge with that?

So that concludes this effort to clear away some of the fog and hyperbole from the current AI phenomenon (ahem! It’s all ML, remember!?). In a nutshell, if you have a ton of data and you need to get knowledge from it then Deep Learning could well be your go-to tool.


FileFormatException: Format error in package

OK, so we’re all completely clear on what this error means and what must be done to resolve it, right? I mean, with a meaningful error like that how can anyone be mistaken? Oh? What’s that? You still don’t know? Let’s be a bit more specific: System.IO.FileFormatException: Format error in package. Better? Didn’t think so. It’s not an error message, that’s why. I’ll tell you what it is though: it’s stupid, and even more stupid when you find out what causes it.

I came across this delightfully wishy-washy error when configuring an Umbraco 7 deployment pipeline in TeamCity and Octopus Deploy. The Umbraco .csproj MSBuild file referenced a bunch of files as you might expect, but I also needed to add a .nuspec file which referenced a bunch of other files. Long story short, the error came about because the files specified by the .csproj overlapped with the files specified by the .nuspec file. There were 1,000-odd generated files that the NuGet packaging components, in their infinite wisdom, added to the .nupkg archive as many times as they were referenced. NuGet was able to do this silly thing without any complaints, and inspecting the confused package in NuGet Package Explorer or 7Zip or Windows Zip gave no indication of any issues whatsoever. It was not until Octopus called on NuGet to unpack the archive for deployment that we got the above error.

Stupid, right? Stupid!
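If you suspect you’ve been bitten by the same thing, remember that a .nupkg is just a zip archive, so a few lines of C# will reveal any duplicate entries. A sketch only – the package path is a placeholder.

using System;
using System.IO.Compression;
using System.Linq;

public static class NupkgDuplicateChecker
{
    public static void Main()
    {
        using var archive = ZipFile.OpenRead(@"C:\path\to\your-package.nupkg");

        // Group the archive entries by name; anything appearing more than once is a problem.
        var duplicates = archive.Entries
            .GroupBy(entry => entry.FullName, StringComparer.OrdinalIgnoreCase)
            .Where(group => group.Count() > 1);

        foreach (var group in duplicates)
            Console.WriteLine($"{group.Key} appears {group.Count()} times");
    }
}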

FYI: I was able to get to the bottom of this issue after 2 freaking days of pain when I eventually used JetBrains dotPeek to debug step-through the NuGet.Core and System.IO.Packaging components to see what on earth was going on. In the end it was this piece of code in System.IO.Packaging.Package that was causing the issue:

public PackagePartCollection GetParts()
{
...
	PackagePart[] partsCore = this.GetPartsCore();
	Dictionary<PackUriHelper.ValidatedPartUri, PackagePart> dictionary = new Dictionary<PackUriHelper.ValidatedPartUri, PackagePart>(partsCore.Length);
	for (int index = 0; index < partsCore.Length; ++index)
	{
	  PackUriHelper.ValidatedPartUri uri = (PackUriHelper.ValidatedPartUri) partsCore[index].Uri;
	  if (dictionary.ContainsKey(uri))
		throw new FileFormatException(MS.Internal.WindowsBase.SR.Get("BadPackageFormat"));
	  dictionary.Add(uri, partsCore[index]);
	  ...
	}
...
}

I mean, why would anyone consuming such a core piece of functionality as this API ever want to know anything about the conditions that led to the corruption of a 30MB package containing thousands of files? I mean it’s not like System.IO.Packaging was ever intended to be re-used all across the globe, right?

Anyways, here’s the error log for helping others with searching for this error and stuff.

[14:21:27]Step 1/1: Create Octopus Release
[14:21:27][Step 1/1] Step 1/1: Create Octopus release (OctopusDeploy: Create release)
[14:21:27][Step 1/1] Octopus Deploy
[14:21:27][Octopus Deploy] Running command:   octo.exe create-release --server https://octopus.url --apikey SECRET --project client-co-uk --enableservicemessages --channel Client Release --deployto Client CI --progress --packagesFolder=packagesFolder
[14:21:27][Octopus Deploy] Creating Octopus Deploy release
[14:21:27][Octopus Deploy] Octopus Deploy Command Line Tool, version 3.3.8+Branch.master.Sha.f8a34fc6097785d7d382ddfaa9a7f009f29bc5fb
[14:21:27][Octopus Deploy] 
[14:21:27][Octopus Deploy] Build environment is NoneOrUnknown
[14:21:27][Octopus Deploy] Using package versions from folder: packagesFolder
[14:21:27][Octopus Deploy] Package file: packagesFolder\Client.0.1.0-unstable0047.nupkg
[14:21:28][Octopus Deploy] System.IO.FileFormatException: Format error in package.
[14:21:28][Octopus Deploy]    at System.IO.Packaging.Package.GetParts()
[14:21:28][Octopus Deploy]    at System.IO.Packaging.Package.Open(Stream stream, FileMode packageMode, FileAccess packageAccess, Boolean streaming)
[14:21:28][Octopus Deploy]    at System.IO.Packaging.Package.Open(Stream stream)
[14:21:28][Octopus Deploy]    at NuGet.ZipPackage.GetManifestStreamFromPackage(Stream packageStream)
[14:21:28][Octopus Deploy]    at NuGet.ZipPackage.c__DisplayClassa.b__5()
[14:21:28][Octopus Deploy]    at NuGet.ZipPackage.EnsureManifest(Func`1 manifestStreamFactory)
[14:21:28][Octopus Deploy]    at NuGet.ZipPackage..ctor(String filePath, Boolean enableCaching)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.PackageVersionResolver.AddFolder(String folderPath)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.CreateReleaseCommand.c__DisplayClass1_0.b__5(String v)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.c__DisplayClass15_0.b__0(OptionValueCollection v)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.ActionOption.OnParseComplete(OptionContext c)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.Option.Invoke(OptionContext c)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.ParseValue(String option, OptionContext c)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.Parse(String argument, OptionContext c)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.c__DisplayClass26_0.b__0(String argument)
[14:21:28][Octopus Deploy]    at System.Linq.Enumerable.WhereArrayIterator`1.MoveNext()
[14:21:28][Octopus Deploy]    at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
[14:21:28][Octopus Deploy]    at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.OptionSet.Parse(IEnumerable`1 arguments)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.Options.Parse(IEnumerable`1 arguments)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Commands.ApiCommand.Execute(String[] commandLineArguments)
[14:21:28][Octopus Deploy]    at Octopus.Cli.Program.Main(String[] args)
[14:21:28][Octopus Deploy] Exit code: -3
[14:21:28][Octopus Deploy] Octo.exe exit code: -3
[14:21:28][Step 1/1] Unable to create or deploy release. Please check the build log for details on the error.
[14:21:28][Step 1/1] Step Create Octopus release (OctopusDeploy: Create release) failed

Update – 19th September 2018

I have just this morning helped a colleague through a permutation of this issue. We have recently upgraded TeamCity and it seems this has pushed the problem further down the pipeline: where the above error would previously appear in TeamCity during packaging, the updated components no longer throw the obscure error. Instead, my colleague found the issue now manifests when attempting to deploy through Octopus, which throws up the error: “Unable to download package: Item has already been added. Key in dictionary: …”

As before, here is a slightly redacted log to help with searching:

Acquiring packages
Making a list of packages to download
Downloading package CLIENT_NAME.Web version 6.0.0-beta0000 from feed: 'https://teamcity.CLIENT_NAME.com/httpAuth/app/nuget/v1/FeedService.svc/'
Unable to download package: 
Item has already been added. Key in dictionary: 'assets/fonts/fsmatthew-light-webfont.svg'  Key being added: 'assets/fonts/fsmatthew-light-webfont.svg'
System.ArgumentException: Item has already been added. Key in dictionary: 'assets/fonts/fsmatthew-light-webfont.svg'  Key being added: 'assets/fonts/fsmatthew-light-webfont.svg'
FileFormatException: Format error in package
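Both of these errors come down to the same underlying problem visible in the decompiled GetParts code above: the package contains two entries that resolve to the same key. If you want to confirm which entry is the culprit in your own .nupkg, here is a minimal diagnostic sketch, assuming .NET with System.IO.Compression; this is not what NuGet or Octopus run internally, just a quick check of the zip contents:

// Quick diagnostic (not part of NuGet/Octopus): list entries in a .nupkg that share the same path.
// On .NET Framework, reference System.IO.Compression and System.IO.Compression.FileSystem.
using System;
using System.IO.Compression;
using System.Linq;

class FindDuplicatePackageEntries
{
    static void Main(string[] args)
    {
        // args[0] is the path to the offending package, e.g. the nupkg from the log above.
        using (var archive = ZipFile.OpenRead(args[0]))
        {
            // Group case-insensitively so entries differing only by casing are caught too.
            var duplicates = archive.Entries
                .GroupBy(e => e.FullName, StringComparer.OrdinalIgnoreCase)
                .Where(g => g.Count() > 1);

            foreach (var group in duplicates)
                Console.WriteLine("{0} appears {1} times", group.Key, group.Count());
        }
    }
}

If this prints anything, then whatever builds the nupkg is packing the same path more than once, and that is where the fix belongs.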

Crash debugging Windows 10 Mobile UWP apps

So your app is crashing

This post explains how to get the details of the root managed .NET exception of a crash in a Windows 10 UWP app, specifically on the Windows 10 Mobile ARM platform. Hopefully this post will save you from some of the pain that I endured and aid you in getting to the bottom of your crashing app. Also note that, with some minor differences that I shall include as best I can in this article, this should in theory also work for debugging UWP store apps on the x86 and x64 Windows 10 platforms, although I have not tested this.

I’ll not detail the complete end-to-end process here as it is varied and lengthy and the core of the process is covered in excellent detail in a two-part series of posts by Andrew Richards of the Microsoft NTDebugging blog: ‘Debugging a Windows 8.1 Store App Crash Dump’ Part 1 and Part 2.

The issue I found with following that series of posts alone is that they are missing some key information if you are working on the Windows 10 UWP platform. No surprise when you consider that they were intended for the Windows 8.1 Store platform. But they are full of essential details for the similar parts of the process on Windows 10 UWP and they got me so close!

In this post I will detail the information that is not already available in the above posts and how it fits into the overall process of debugging crash dumps from UWP apps running on the Windows 10 Mobile platform.

Enable and collect crash dumps

First off, make sure that Windows 10 Mobile will collect crash dumps by heading to Settings -> Update & Security -> For developers and ensuring that the value of the setting labelled ‘Save this many crash dumps’ is greater than 0. I’d recommend at least 3.

Now reproduce the crash a couple of times to generate the crash dumps. The dump files should then be available under \Documents\Debug on the device storage. Note that it can take a few minutes to completely save the dump files; if you see any files here named ‘SOMETHING.part’ then the dumps are still being saved, so come back in a minute or two. Move the dump files onto the machine where the debugging will take place.

Now on to the experts

Now I’ll pass you over to the aforementioned articles, which explain how to fire up the dump files in WinDbg. Just as a heads up, at the time of writing the latest Windows 10 Debugging Tools (WinDbg) are available from here.

If your crashes are indeed caused by managed code that you have written, generated or otherwise included then you will inevitably end up being directed to use SOS to elicit the details of the exception that is being thrown. This is where things got tricky for me and if you do get to this point then return here and read on…

Filling in the gaps

By now you may have tried invoking, loading and even locating SOS and the CLR or DAC modules, so I can tell you that these components are not where, or even what, the article describes. I first spent some time trying to confirm that the CLR or DAC was loaded as it should be according to most sources on this subject. Eventually, after much trial and error, I tried issuing a reload command to ensure the correct core framework was loaded. This is done with the following command (see documentation here); note that this step might not be necessary for you.

.cordll -ve -u -l

Which, for me, results in the following output:

CLRDLL: Unable to find 'mrt100dac.dll' on the path
Automatically loaded SOS Extension
CLRDLL: Loaded DLL c:\symbols\mrt100dac_winarm_x86.dll\561408BF43000\mrt100dac_winarm_x86.dll
CLR DLL status: Loaded DLL c:\symbols\mrt100dac_winarm_x86.dll\561408BF43000\mrt100dac_winarm_x86.dll

This is all fine but is not quite what I expected to see, and it leads on to the issue with using SOS. As you can see, SOS is supposedly loaded by the above command, but normal SOS commands/invocations still will not work. The DLLs above gave me some clue to what was going on here, and when I looked to see where this mrt100dac_winarm_x86.dll was located it led me to find the SOS DLL. In my environment everything can be found under C:\Program Files (x86)\MSBuild\Microsoft\.NetNative\arm, where I can see a DLL named mrt100sos.dll and a few variants thereof. So it looks as if there is a special distribution of SOS for the Universal platform, which makes sense.

NOTE: This is where the differences between the platforms (ARM, x86, x64) come into play. I suspect that this should be the same process for debugging UWP apps on all platforms but I cannot say for certain. At the very least you will see different modules/DLLs listed above, and the different platform modules can all be found under %Program Files%\MSBuild\Microsoft\.NetNative.
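I will also note, purely as an untested fallback, that if .cordll does not pick up the right SOS variant automatically it should be possible to load the extension by hand with WinDbg’s .load command, pointing it at the DLL for your target platform. For example (path from my ARM setup, adjust to suit):

.load "C:\Program Files (x86)\MSBuild\Microsoft\.NetNative\arm\mrt100sos.dll"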

Armed with this knowledge I then headed back to Google and thankfully (luckily!) found one mention of using mrt100sos on a blurb for a non-existent Channel 9 show:

…This is very similar to how CLR Exceptions are discovered. Instead of using SOS, MRT uses mrt100sos.dll or mrt100sos_x86.dll (depending on the target). The command is !mrt100sos.pe -ccw <nested exception> . The same command(s) for CLR Exceptions is !sos.dumpccw <addr> –> !sos.pe <managed object address>.

And sure enough if you follow on from Andrew’s Windows 8.1 Store App articles with the above commands you will be able to see your managed exception in all its detailed beauty. In the following example <Exception Address> would be the value of ExceptionAddress or NestedException in your WinDbg output:

!mrt100sos.pe -ccw <Exception Address>

As an example, I had the following WinDbg output:

0:005> dt -a1 031df3c8 combase!_STOWED_EXCEPTION_INFORMATION_V2*
[0] @ 031df3c8
---------------------------------------------
0x008c2f04
+0x000 Header           : _STOWED_EXCEPTION_INFORMATION_HEADER
+0x008 ResultCode       : 80131509
+0x00c ExceptionForm    : 0y01
+0x00c ThreadId         : 0y000000000000000000001111001100 (0x3cc)
+0x010 ExceptionAddress : 0x7778afbb Void
+0x014 StackTraceWordSize : 4
+0x018 StackTraceWords  : 0x19
+0x01c StackTrace       : 0x008c67e8 Void
+0x010 ErrorText        : 0x7778afbb  "¨滰???"
+0x020 NestedExceptionType : 0x314f454c
+0x024 NestedException  : 0x008d09a0 Void

Taking the above NestedException address, I end up with the following command and resulting output. And this was all I needed to locate the bug.

0:005> !mrt100sos.pe -ccw 0x008d09a0
Exception object: 00f132ac
Exception type:   System.InvalidOperationException
Message:          NoMatch
InnerException:   <none>
StackTrace (generated):
IP       Function
65c279d5 ProblemApp_65810000!$51_System::Linq::Enumerable.First<System.__Canon>+0x99
65c27729 ProblemApp_65810000!$2_ProblemApp::Utilities::AssetsCache::<loadImage>d__4.MoveNext+0xa5
00000001
65539115 SharedLibrary!System::Runtime::ExceptionServices::ExceptionDispatchInfo.Throw+0x19
65539317 SharedLibrary!$13_System::Runtime::CompilerServices::TaskAwaiter.ThrowForNonSuccess+0x4b
655392c5 SharedLibrary!$13_System::Runtime::CompilerServices::TaskAwaiter.HandleNonSuccessAndDebuggerNotification+0x41
6553927d SharedLibrary!$13_System::Runtime::CompilerServices::TaskAwaiter.ValidateEnd+0x19
654ccea1 SharedLibrary!$13_System::Runtime::CompilerServices::TaskAwaiter$1<System::__Canon>.GetResult+0x11
65cf0285 ProblemApp_65810000!$2_ProblemApp::Utilities::AssetsCache::<Initialize>d__2.MoveNext+0x175
…
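So, pulling it all together, the whole journey from dump file to managed exception came down to three commands, all shown above (the stowed exception address comes out of the steps in Andrew’s articles; the NestedException address comes from the dt output):

.cordll -ve -u -l
dt -a1 <stowed exception address> combase!_STOWED_EXCEPTION_INFORMATION_V2*
!mrt100sos.pe -ccw <NestedException address>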

Would love to RTFM!

So hopefully this will help some poor souls who like me have to debug crashing Windows 10 Mobile UWP apps. If anyone knows of some proper documentation for the mrt100sos commands I would be eternally grateful!


LinkedIn Error “There was a problem sharing your update. Please try again”.

Obscure Error

I was trying to reply to a comment on an article I posted to LinkedIn the other day and kept hitting the error “There was a problem sharing your update. Please try again”. Just a note to help anyone who might come across this error when attempting to post an update to LinkedIn: there is an unadvertised comment character limit of 800 characters.

A little help?

It would be great if this were made obvious, either in the error itself or at least somewhere on the site, but even searching the Internet for “There was a problem sharing your update. Please try again” didn’t turn up much for me. It wasn’t until I opened a support ticket that I was given this info.

I hope posting this here will at some point in the future save someone from wasting the time I did.


Things I wish I knew 10 years ago: Abstractions

We need to talk about abstractions

The main reason I decided to start this blog is that I have begun working for a company that has genuinely challenged many of my assumptions about how software should be developed. I have spent much of my career learning from the more prominent voices in software development about how to write software effectively. I have learned, practiced and preached the tenets of clean code, TDD, layered design, SOLID, to name a few of the better known programming practices and had always believed that I was on a true path to robust, maintainable software. Now I find myself in a position where over the space of just one year I have already questioned many of the practices I had learned and taught in the preceding decade.

I hope to share on this blog much of what I have discovered of late but for my first entry discussing programming practices I want to talk about abstractions. In particular I want to call into question what I have come to understand as overuse of abstractions – hiding implementations away in layers/packages, behind interfaces, using IoC and dependency inversion – as often encountered in the C#/.NET and Java world.

Abstractions?

I have been wondering lately if I have simply spent years misunderstanding and misapplying abstractions, but I have seen enough code written by others in books, tutorials, blogs, sample code and more diagrams than I can bear to know that I have not been alone in my practices. Furthermore, I have found myself on a few occasions of late in discussions with developers of similar experience who have come to share a similar feeling towards abstractions.

The all too familiar layer diagram. © Microsoft. https://msdn.microsoft.com/en-us/library/ff648105.aspx

A typical layering structure

So what do I mean by abstractions and what is the point of them, really? The old premise, and the one that I would always reiterate, is that abstractions help enforce separation of concerns (SoC) by isolating implementation details from calling code. The reasoning is that code of one concern should be able to change without affecting code dealing with other concerns, supposedly because it will change for different reasons and at different times. Of course we mustn’t forget that one of the more natural causes of abstractions is the isolation of logic to enable Unit Testing. Ultimately the result is software written in such a way that code dealing with different concerns is kept separate by abstractions such as interfaces and layers, with IoC and Dependency Injection used to wire the abstractions together. It is also worth stating that the separate ‘concerns’ touted by such advocacy frequently include Presentation/UI, Service/Application Logic, Business Logic, Data Access Logic, Security, Logging, etc.

[Authorize]
public class StudentController : Controller
{

    private readonly IStudentRepository _repository;
    private readonly IStudentService _service;
    private readonly IUnitOfWork _unitOfWork;

    public StudentController
    (
        IStudentRepository repository, 
        IStudentService service, 
        IUnitOfWork unitOfWork
    )
    {
        _repository = repository;
        _service = service;
        _unitOfWork = unitOfWork;
    }

    public ActionResult UpdateStudentDetails(StudentDetailsViewModel model)
    {
        if (ModelState.IsValid)
        {
            var student = _repository.Get(model.StudentId);

            student.Forename = model.Forename;
            student.Surname = model.Surname;
            student.Urn = model.Urn;

            _service.UpdateStudentDetails(student);

            _unitOfWork.Commit();
        }

        return View(model);
    }
}

Abstracted code, obscurity through indirection.

YAGNI!

I am not about to start claiming that everything should just be thrown together in one Big Ball of Mud. I still feel that SoC certainly is worth following but it can be effectively achieved by applying simple encapsulation, such as putting more repetitive and complex logic of one concern within its own class so that it may be repeatedly invoked by code dealing with other concerns. An example of this would be the code to take an entity key, fetch and materialize the correlating entity from a data store and return it to the caller. This would be well served in a method of a repository class that can be called by code that simply needs the entity. Of course packages/libraries also have their place, in sharing logic across multiple applications or solutions.
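To make that concrete, here is a minimal sketch of the kind of encapsulation I mean: a plain repository class that callers use directly, no interface, no container (StudentsContext and Student being the same illustrative names used in the controller examples on this page):

// Simple encapsulation: the repetitive fetch/materialise logic lives in one place
// and is invoked directly by whatever code needs the entity. No interface, no IoC.
public class StudentRepository
{
    private readonly StudentsContext _context;

    public StudentRepository(StudentsContext context)
    {
        _context = context;
    }

    // Take an entity key, fetch and materialise the correlating entity, return it to the caller.
    public Student Get(int studentId)
    {
        return _context.Students.Single(s => s.Id == studentId);
    }
}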

Where I see problems starting to arise is when, for example, the aforementioned repository is hidden behind an interface, likely in a separate layer/package/library, and dynamically loaded by an IoC infrastructure at runtime. Let’s not pull any punches here: this practice hides significant swathes of software behind a dynamic infrastructure that is only resolved at runtime. With the exception of some very specific cases, I see this practice as overused, unnecessarily complex and lacking the obvious transparency that code must have to be truly maintainable. The problem is further compounded by the common definition of the separate concerns and layers themselves. Experience has shown me that when you come to maintain an application that makes use of all of these practices you end up with a voice screaming in your head “Get the hell out of my way!”. The abstractions don’t seem to help like they promise, and all of their complexity creates so much overhead that it slows down debugging and impedes changes of any significant proportion.
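For contrast, here is roughly what that runtime wiring looks like in a composition root, using Autofac purely as an example (the concrete class names here are hypothetical; other containers differ only in syntax):

// Registration lives far away from the controller that depends on IStudentRepository et al.
// Which concrete class actually runs is only decided here, at runtime.
var builder = new ContainerBuilder();
builder.RegisterType<StudentRepository>().As<IStudentRepository>();
builder.RegisterType<StudentService>().As<IStudentService>();
builder.RegisterType<EfUnitOfWork>().As<IUnitOfWork>();
var container = builder.Build();

The StudentController shown above never sees any of this; to find out what its dependencies actually do you first have to go hunting through registrations like these (or through convention-based scanning, which is even more opaque).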

With one exception I have never spoken to anyone who has ever had to swap out an entire layer (i.e. UI, Services, Logic, Data Access, etc.) of their software. I’ve personally been involved in one project where it was required but it was a likely eventuality right from the start and so we were prepared for it. I have rarely seen an example of an implementation of an abstraction being swapped or otherwise significantly altered that did not affect its dependents, regardless of the abstraction. Whenever I have seen large changes made to software it very rarely involves ripping out an entire horizontal layer, tier or storage mechanism. Rather it will frequently involve ripping out or refactoring right across all layers affecting in one change the storage tables, the objects and logic that rely on those tables and the UI or API that relies on those objects and logic. More often than not large changes are made to a single business feature across the entire vertical stack, not a single conceptual technical layer and so it stands to reason that should anything need separating to minimise the impact of changes it should be the features not the technical concerns.

Invest in reality

So my main lesson here is that: The reality of enforcing abstractions through layering and IoC is very different from the theory and usually is not worth it, certainly when used to separate the typical software layers. With the exception of cases such as a component/plug-in design I am now completely convinced that the likelihood of layered abstractions and IoC ever paying off is so small it just isn’t worth the effect that these abstractions have on the immediate maintainability of code. It makes sense in my experience not to focus on abstracting code into horizontal layers and wiring it all up with IoC but to put that focus into building features in vertical slices, with each slice organised into namespaces/folders within the same project (think MVC areas and to a lesser extent the DDD Bounded Context). Spend the effort saved by this simplification keeping the code within the slices clear, cohesive and transparent so that it is easy for someone else to come along, understand and debug. I’d even go so far as to try to keep these slices loosely dependent on each other – but not to the point that you make the code less readable, i.e. don’t just switch hard abstractions of layers into hard abstractions of slices. I don’t want to offend anyone, I’m just putting my experience out there… why not give this a try… I promise you probably won’t die.

Vertical slices with MVC Areas
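As a rough illustration (the feature names are invented for the sake of the example), a slice-per-feature layout in an ASP.NET MVC project ends up looking something like this, with each Area owning its own controllers, models and views:

MyApp.Web/
  Areas/
    Students/
      Controllers/StudentsController.cs
      Models/StudentDetailsViewModel.cs
      Views/Students/UpdateStudentDetails.cshtml
    Admissions/
      Controllers/AdmissionsController.cs
      Models/...
      Views/Admissions/...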

Take a look at the following updated controller action. You know almost exactly what it is doing just by looking at this one method. This contains ALL of the logic that is executed by the action, and anyone first approaching this code can be confident in their understanding of the logic without having to dig through class libraries and IoC configuration. Any changes that are made to the action would simply be made here and in the DB project, so much more maintainable! Being completely honest, even recently, seeing code written like this would rub me up the wrong way, so I understand if this gets some others on edge, but I’ve come full circle now and am pretty convinced of the simplified approach. And it’s this dichotomy I’d like to discuss.

[Authorize]
public class StudentsController : Controller
{
    public ActionResult UpdateStudentDetails(StudentDetailsViewModel model)
    {
        if (ModelState.IsValid)
        {
            using (var context = new StudentsContext())
            {
                var student = context.Students.Single(s => s.Id == model.StudentId);

                student.Forename = model.Forename;
                student.Surname = model.Surname;
                student.Urn = model.Urn;

                SendStudentDetailsConfirmationEmail(student);

                context.SaveChanges();
            }
        }

        return View(model);
    }

    private void SendStudentDetailsConfirmationEmail(Student student)
    {
        ...
    }
}

Transparent, maintainable, intention-revealing code and no need for IoC!

This is just an opening

So this has been my first attempt to open up some conversation around the use of abstractions in software. I’ve tried to keep it brief and in doing so I’ve only just scratched the surface of what I have learned and what I have to share. There is still so much more for me to cover regarding what I and others I know in the community have been experiencing in recent years: Should we abstract anything at all? What is maintainable if not SoC via IoC? How do we handle external systems integration? What about handling different clients sharing logic and data (UI, API, etc.)? How does this impact self/unit-testing code? When should we go the whole hog and abstract into physical tiers? I could go on… So I intend to write further on this subject in the coming weeks and in the meantime it would be great to hear if anyone has any thoughts on this, good or bad! So drop me a line and keep checking back for further posts.
