
A data breach investigation blow-by-blow




Someone has just sent me a data breach. I could go and process the whole thing, attribute it to a source, load it into Have I been pwned (HIBP) then communicate the end result, but I thought it would be more interesting to readers if I took you through the whole process of verifying the legitimacy of the data and pinpointing the source. This is exactly the process I go through, unedited and at the time of writing, with a completely unknown outcome.

Warning: This one is allegedly an adult website and you're going to see terms and concepts related to exactly the sort of thing you'd expect from a site like that. I'm not going to censor words or links other than for privacy purposes; this is exactly what I go through during the verification process so you're going to get the whole thing, blow-by-blow.

The file I've been sent is a 120MB zip called "Eroticy.com_June_2015.sql.zip". The sender of the file has said it's from eroticy.com and that's all I have to go on. I extract the file and find an 841MB MySQL script, which I open up in Sublime Text because it's pretty good at reading massive files:

[Image]

Ok, so a MySQL script file, fairly typical. I take a quick spin through and it's the usual trove of insert statements. I want a quick count of just how many email addresses are in the thing though so I point a little app I wrote at it to extract them all (it does some basic parsing and other checks):

[Image]

That's a sizeable result, nearly 1.4M unique email addresses (a bunch of them appear multiple times based on the output above). If it was small (say, under 100k records), I may not have bothered and would have moved on to something more sizeable. It's pretty much the same effort for me regardless of size and bigger breaches impact more people so this helps me prioritise.
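The little app itself isn't shown here, so the following is only a minimal Python sketch of the same idea, not the actual tool: pull anything that looks like an email address out of each line and compare unique against total hits. The regular expression, the `count_emails` name and the sample rows are all assumptions for illustration.

```python
import re
from collections import Counter

# Deliberately simple pattern (an assumption); real address parsing needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def count_emails(lines):
    """Count lower-cased email addresses found in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        for address in EMAIL_RE.findall(line):
            counts[address.lower()] += 1
    return counts


# Hypothetical insert statements standing in for the real dump.
sample = [
    "INSERT INTO emails VALUES (1,'alice@example.com','pw1');",
    "INSERT INTO emails VALUES (2,'Alice@Example.com','pw2');",
    "INSERT INTO emails VALUES (3,'bob@mailinator.com','pw3');",
]
counts = count_emails(sample)
print(len(counts), "unique,", sum(counts.values()), "total")
```

Because the function takes any iterable of lines, an open file handle can be passed straight in, which keeps memory use flat even on a dump of several hundred megabytes.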

Time to check out the site, VPN and Incognito browser first thank you very much (probably make sure the kids aren't around too...):

[Image]

Alright, so this is a bit volatile because it's not just a porn site, it's dealing with fantasies too which is really personal stuff. In fact, the site has redirected to dating.eroticy.com so it looks like it's designed to facilitate physical encounters. Let's see if I can easily tie the data to this site.

I go to the exported email addresses and grab a random Mailinator email address. I've written before about why these are useful and it's pretty much the first thing I do these days. Then it's off to the password reset form, but I begin by just fat-fingering the keyboard:

[Image]

I want to see if the site has an enumeration risk which will confirm the address doesn't exist. Instead, the site responds as it should respond:

[Image]

But by using a Mailinator address which the creator definitely knows is a public mailbox, I can then see if the mail is actually delivered. I submit it then check Mailinator:

[Image]

Huh, no email, but there's a definite pattern to the types of email the address has been receiving (although we all get plenty of porn spam). I try a few others and still no email. Normally about now I'd be seeing password reset emails and that would be enough to be pretty confident it's going to be a legit breach.

Let's try another enumeration vector:

[Image]

Most systems will tell you if the address already exists during registration. But here's what Eroticy does:

[Image]

So either the account doesn't exist in the system or they're actively avoiding enumeration via registration. Checking Mailinator should tell me:

[Image]

This is getting interesting. Looks like the account created just fine which is telling me that this address doesn't exist in their system already. I need to take a much closer look at the raw data because as it stands, as real as the data itself looks, no practical evidence is suggesting it came from Eroticy.

I do a bit of searching around too, including on vigilante.pw which has a pretty comprehensive list of alleged breaches. Eroticy is there:

[Image]

I find other references in various other shady corners of the web too so at the very least, it's data that's been redistributed and branded as a breach. But none of it gives me any confidence it's actually legitimate in terms of the data actually having been sourced from Eroticy which is pretty much the point of going through this exercise here.

I want to be able to examine the data more thoroughly so I fire up a VM running MySQL and import the entire thing. This will make it easier to examine the schema as well as query the data.

While I'm waiting for the data to import, I look more closely at the raw statements within Sublime. I'm seeing a lot of URLs represented in there such as http://cartooncopulations.com/index.php/ft_15666_A_8a2b3c0c52d284c0ad8b604ec820cfac/ccoconsole1.html and http://monstrousmelons.com/index.php/ps_16568_A_6219246bcd03eda402719b3a68f3389b/main.html and a bunch of other far more explicitly titled domain names. The same domains appear many times over in a column called "ref" next to a date in a field called "day". Thing is, the dates are frequently in the 2004 era and a bunch of the domains are either dead or link to the same site with different branding:

[Image]

[Image]

The sites aren't explicit either - there's nothing you won't see on most beaches there - and they seem to be there primarily to drive traffic to other places.

The MySQL data is still loading so I look around the raw statements a little more. I find a table called EpochTransStats:

  ets_transaction_id int(11) NOT NULL DEFAULT '0' ,
  ets_member_idx int(11) NOT NULL DEFAULT '0' ,
  ets_transaction_date datetime ,
  ets_transaction_type char(1) NOT NULL DEFAULT '' ,
  ets_co_code varchar(6) NOT NULL DEFAULT '' ,
  ets_pi_code varchar(32) NOT NULL DEFAULT '' ,
  ets_reseller_code varchar(64) DEFAULT 'a' ,
  ets_transaction_amount decimal(10,2) NOT NULL DEFAULT '0.00' ,
  ets_payment_type char(1) DEFAULT 'A' ,
  ets_pst_type char(3) NOT NULL DEFAULT '' ,
  ets_username varchar(32) ,
  ets_password varchar(32) ,
  ets_email varchar(64) ,
  ets_ref_trans_ids int(11) ,
  ets_password_expire varchar(20) ,
  ets_country char(2) NOT NULL DEFAULT '' ,
  ets_state char(2) NOT NULL DEFAULT '' ,
  ets_postalcode varchar(32) NOT NULL DEFAULT '' ,
  ets_city varchar(64) NOT NULL DEFAULT '' ,
  ets_street varchar(80) NOT NULL DEFAULT '' ,
  ets_ipaddr varchar(16) NOT NULL DEFAULT '' ,
  ets_firstname varchar(32) NOT NULL DEFAULT '' ,
  ets_lastname varchar(32) NOT NULL DEFAULT '' ,
  ets_user1 varchar(32) NOT NULL DEFAULT '' ,
  PRIMARY KEY (ets_transaction_id),
   KEY idx_reseller (ets_reseller_code),
   KEY idx_product (ets_pi_code),
   KEY idx_transdate (ets_transaction_date),
   KEY idx_type (ets_transaction_type)

It piques my interest because epoch seems like an unusual name in this context so I give it a Google. That leads me to a page on a GitHub repository called Elite-Adult-Affiliate-Program. This is interesting because stuff is starting to line up: adult website, bunch of links to other sites and the schema containing a name that's represented in a project that seems to support adult site affiliate links.

There's another table in the dump called "console_links" so I search the GitHub repository for that but come up empty. But then I search for "EpochTransStats" and find a file called affiliateprogram.sql. There's a create statement in there for the EpochTransStats table and it has a heap of common columns:

CREATE TABLE `EpochTransStats` (
  `ets_transaction_id` int(11) NOT NULL default '0',
  `ets_member_idx` int(11) NOT NULL default '0',
  `ets_transaction_date` datetime default NULL,
  `ets_transaction_type` char(1) NOT NULL default '',
  `ets_co_code` varchar(6) NOT NULL default '',
  `ets_pi_code` varchar(32) NOT NULL default '',
  `ets_reseller_code` varchar(64) default 'a',
  `ets_transaction_amount` decimal(10,2) NOT NULL default '0.00',
  `ets_payment_type` char(1) default 'A',
  `ets_username` varchar(32) default NULL,
  `ets_password` varchar(32) default NULL,
  `ets_email` varchar(64) default NULL,
  `ets_ref_trans_ids` int(11) default NULL,
  `ets_password_expire` varchar(20) default NULL,
  PRIMARY KEY  (`ets_transaction_id`),
  KEY `idx_reseller` (`ets_reseller_code`),
  KEY `idx_product` (`ets_pi_code`),
  KEY `idx_transdate` (`ets_transaction_date`),
  KEY `idx_type` (`ets_transaction_type`)

Main difference with the breach data is that it doesn't have ets_country, ets_state and a few other columns. But still, there's way too much to be coincidental, there's a common origin there somewhere.
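To make that comparison less of an eyeball exercise, here's a rough Python sketch that lists which columns the two definitions share and which appear only in the dump. The `column_names` helper and its line-based parsing are assumptions that only suit simple definitions like these, and the two snippets are trimmed to a handful of lines from the statements above.

```python
import re


def column_names(ddl):
    """Collect column names from the body of a simple CREATE TABLE statement."""
    names = []
    for line in ddl.splitlines():
        line = line.strip()
        # Skip blanks, the CREATE line itself and the index definitions.
        if not line or line.upper().startswith(("CREATE ", "PRIMARY KEY", "KEY ", ")")):
            continue
        match = re.match(r"`?(\w+)`?\s", line)
        if match:
            names.append(match.group(1))
    return names


# Trimmed excerpts of the two definitions shown above.
dump_ddl = """
  ets_username varchar(32) ,
  ets_email varchar(64) ,
  ets_country char(2) NOT NULL DEFAULT '' ,
  ets_state char(2) NOT NULL DEFAULT '' ,
  PRIMARY KEY (ets_transaction_id),
   KEY idx_reseller (ets_reseller_code)
"""
github_ddl = """
CREATE TABLE `EpochTransStats` (
  `ets_username` varchar(32) default NULL,
  `ets_email` varchar(64) default NULL,
  PRIMARY KEY  (`ets_transaction_id`),
  KEY `idx_reseller` (`ets_reseller_code`)
"""

dump_cols = column_names(dump_ddl)
github_cols = column_names(github_ddl)
shared = [c for c in dump_cols if c in github_cols]
only_in_dump = [c for c in dump_cols if c not in github_cols]
print("shared:", shared)
print("only in dump:", only_in_dump)
```

A long shared list with a short "only in dump" list is what you'd expect from one schema that has been extended over time, rather than from two unrelated systems.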

The data is finally up in MySQL, let's check out the schema inspector:

[Image]

My eye is immediately drawn to the big one being the "emails" table with over 1.5M records so I check that out first:

[Image]

There's a few interesting things here:

  1. Lot of redundancy with the same email appearing multiple times over
  2. Password is different for the same email (i.e. rows 380, 381, 382)...
  3. ...but these almost certainly aren't all user passwords; there's too many good ones!
  4. The good passwords look system generated, but there are a heap of others which are clearly user generated
  5. The dates are very old - does this seriously go back as far as 14+ years?!
  6. The "ref" column is interesting, referrer from other sites, perhaps?

Some of these questions can be answered pretty emphatically, so I start querying the data:

  1. The earliest records are from 9 May, 2002
  2. The latest ones are from 31 Dec 2014 (that's a very wide range for a data breach)
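The exact query isn't shown, but a date range like this usually comes from a single MIN/MAX aggregate. Here's a small Python sketch using an in-memory SQLite database as a stand-in for the MySQL VM; it assumes the dates live in the "day" column of the "emails" table and are stored in an ISO-style format that sorts correctly, and the three rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL instance (an assumption).
date_db = sqlite3.connect(":memory:")
date_db.execute("CREATE TABLE emails (email TEXT, day TEXT)")
date_db.executemany(
    "INSERT INTO emails VALUES (?, ?)",
    [
        ("a@example.com", "2002-05-09"),
        ("b@example.com", "2009-11-23"),
        ("c@example.com", "2014-12-31"),
    ],
)

# One pass over the table gives both ends of the range.
earliest, latest = date_db.execute(
    "SELECT MIN(day), MAX(day) FROM emails"
).fetchone()
print(earliest, latest)
```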

I want to start actually reaching out to some of the people in this incident (which is always fun given the nature of the data...) so I start copying the email addresses I extracted earlier up into HIBP (this won't make them searchable on the site, it will merely give me the ability to query them). While that's running, I decide the "members" table is another particularly interesting one because it may actually start to point to individuals in the incident. Here's what I find in there:

[Image]

This gets more interesting because it looks like these folks have paid money for a service. We're seeing transaction numbers and payment processors here including iBill which Wikipedia describes as follows:

a top credit card transaction aggregator for adult entertainment websites

That article goes on to say that iBill was incorporated into another company and then changed names more than a decade ago, but a little further across in the set of columns beyond what I screen-capped above was another "day" column with dates as new as the last day of 2014. Some of these were processed by iBill. Turns out that iBill is a smaller player in all this though as I discover once I aggregate the processor column:

  1. epoch 199,414
  2. ccbill 40,684
  3. wts 26,250
  4. jettis 3,114
  5. ibill 2,536
  6. itrans 1,088
  7. electracash 356
  8. 2014charge 51
  9. psw 19
  10. wsb 18

Epoch has the lion's share of transactions and that's a potential avenue I could now go down to trace the source of the payments (I could always reach out to them). But at present, I'm still no closer to working out where the data actually came from, so I try a few queries:

select * from members where email like '%+%'

People sometimes use the "+" syntax within email address aliases to identify the site they're using, for example test+eroticy@example.com which would imply the source of the data. There's zero results though which could be because some websites block the "+" symbol (don't do this guys!) but also because in my analysis of other breaches, I find there's usually only about 0.03% of accounts using this syntax.
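When the query does return rows, the useful part is the tag after the "+", because that's where the site name tends to show up. Here's a hedged Python sketch of pulling those tags out and tallying them; the `plus_tag` helper and the sample addresses are hypothetical.

```python
from collections import Counter


def plus_tag(email):
    """Return the text after '+' in the local part of an address, or None."""
    local, _, _domain = email.rpartition("@")
    _name, sep, tag = local.partition("+")
    return tag if sep else None


# Made-up addresses for illustration only.
addresses = [
    "test+eroticy@example.com",
    "jane+eroticy@example.com",
    "bob@example.com",
    "sam+shopping@example.com",
]
tags = Counter(tag for tag in map(plus_tag, addresses) if tag)
print(tags.most_common())
```

A single tag dominating the tally would be a strong hint about the origin, although at around 0.03% of accounts there may only be a few hundred such addresses even in a large breach.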

I try another query:

select password, count(*) from members group by password order by count(*) desc

What I'm trying to do here is see if there's a commonly used password that might indicate the source (other than the usual generically crap ones). Here's the 50 most popular and their respective counts:

  1. 123456 1,525
  2. password 815
  3. pussy 301
  4. dragon 296
  5. 12345678 282
  6. football 236
  7. fuckme 234
  8. 696969 222
  9. qwerty 202
  10. baseball 200
  11. 12345 199
  12. 1234 196
  13. shadow 159
  14. 111111 156
  15. master 156
  16. letmein 156
  17. superman 153
  18. abc123 152
  19. monkey 145
  20. mustang 141
  21. jordan 131
  22. jessica 119
  23. 1234567 119
  24. fuckyou 117
  25. Harley 116
  26. michael 116
  27. hunter 113
  28. buster 111
  29. thomas 110
  30. ranger 102
  31. killer 102
  32. FUCK 101
  33. jennifer 99
  34. junior 97
  35. andrew 94
  36. asshole 94
  37. 666666 94
  38. tigger 94
  39. joshua 93
  40. batman 92
  41. ashley 91
  42. freedom 91
  43. 123456789 91
  44. amanda 86
  45. soccer 86
  46. bigdog 85
  47. h00ters 84
  48. ginger 82
  49. sunshine 82
  50. bandit 82

These are just the usual crap ones though with a bias towards what you'd expect on a porn site. I'm not seeing anything here that jumps out and indicates a potential source. Seeing a password such as "1234" though indicates really weak password criteria on wherever it came from, which is often a bit of an indicator of age (we had more tolerance for bad passwords years ago).
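The same tally can be done outside the database if the passwords have already been exported. This is a small Python sketch, assuming they're available as a plain list; the sample values are invented, and the shortest length is used as a rough proxy for how lax the original password rules were.

```python
from collections import Counter

# Made-up sample; in practice this would be the exported password column.
passwords = ["123456", "password", "123456", "1234", "dragon", "123456", "password"]

frequency = Counter(passwords)
top_two = frequency.most_common(2)

# The shortest password seen hints at the minimum length the site enforced.
shortest = min(len(p) for p in passwords)

print(top_two)
print("shortest length:", shortest)
```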

The email addresses are now in HIBP (not searchable in there, rather just sitting somewhere I can privately query them), and I find that of my 934k verified subscribers, 512 of them are in the addresses I extracted. I email the 30 most recent subscribers, the oldest having signed up 6 weeks earlier:

Hi, I’m emailing you as someone who has recently subscribed to the service I run, “Have I been pwned?”

Your email address has appeared in a new data breach I’ve been handed and I’m after your support to help verify whether it is legitimate or not. I’d like to be confident it’s not a fake before I load the data and people such as yourself receive notifications.

If you’re willing to assist, I’ll send you further information on the incident and include a small snippet of your (allegedly) breached record, enough for you to verify if it’s accurate. Is this something you’re willing to help with?

For verification, I’m on the about page of the site: https://haveibeenpwned.com/About

It'll take a while for people to start replying so I keep digging. The "payouts" table seems like an interesting name for an adult website:

[Image]

Seeing a column called "revshare" is making me think about the affiliate situation again as is the table name itself. Is this one of those deals where someone gets cash for enticing others to signup to a service? A "payout" based on "revenue share", perhaps?

A few responses from HIBP subscribers come in and I fire them back different variations of this:

This relates to a data breach which has allegedly come from the Adult website known as “Eroticy”. However, I don’t believe that’s the actual source of the incident due to various indicators in the data itself. I’m hoping HIBP subscribers can help me work out the actual source. Your email address is in the data, here’s what I can tell you about it, I’d appreciate your feedback on the accuracy of the data:

1.    Your record says it was created on [redacted], however the data may be a decade older
2.    There’s a password next to it of “[redacted]” (obviously I’ve obfuscated some characters here)
3.    There is a “ref” column which I believe is for “referrer” and it has this value in it: [redacted] (the URL no longer has an active site on it)
4.    There is an IP address of [redacted] which puts it in New York City: https://db-ip.com/[redacted]
5.    There’s a username field of “[redacted]”
6.    There’s a name field with “[redacted]” in it

I know this is possibly more than a decade old, but does any of this look familiar? My suspicion is that a lot of the data could relate to affiliate programs within the adult entertainment space, if there’s anything you can share that might help me track down the source of this, it would be most appreciated.

One of the early responses is from a female which is pretty unusual as far as adult websites go. Her "ref" column has the site gangbangedgirls.com in it and I hesitate - momentarily - before sending her the info. What if she didn't actually sign up to it herself and I'm sending her what then appears to be an unsolicited link to a hardcore porn site? I figure I can always send her a screen cap of her record later on to clear myself if need be but admittedly, I did worry about how she'd react.

The responses from subscribers start coming in:

I think I signed up to that when I was researching how to develop an adult website. All that info is correct.

That's useful, he then continues:

I would call this 100% valid. I'd say it was maybe 7-9 years ago.

But when I pushed him on whether he recalled the name "Eroticy", he had no recollection of it although he did say that the "page design looks familiar".

Another subscriber chimed in:

That is a long time ago. Password sounds like something I would have used way back.

But then went on to say that he doesn't know what it would have been associated with.

The woman I mentioned earlier also responded, fortunately without any signs of me having offended her but echoing the previous response in terms of being unsure where the data would have come from:

This is not something I would have subscribed to at that time. the password, however is familiar.

And then more reassurance came through:

Ya that seems like something I probably made an account for back in the day when I was a kid. The password was definitely a BS one I used back then too.

Yet more confirmation of the legitimacy of the data itself came through:

That is - indeed - a non-secure password I used to use, and a username I've used - mostly on dating sites and other anon-ish environments where I don't want my more usual username to be immediately google-able. Some of the dating sites I was researching (honestly: I was paid to do it!) were almost certainly related to the adult entertainment biz. Also, I had an account on Xbiz a long time ago that almost certainly had that combination.

This is all great, but there's nothing in there that's helping me work out what's actually happened. And then this reply came in:

Yeah used to be an adult webmaster so there would be some of my data out there

So this isn't someone who's necessarily a customer, rather someone who's been involved in running these sites before. Further supporting that, I can't see him in either the "members" or "emails" tables like everyone else; he's actually in one called "webmasters" which had failed to import into MySQL. There's quite a bit of data in there about him and in fact it turns out he lives not too far from me. I send him all his data which he checks out and makes an observation very similar to my own:

That looks to be webmaster affiliate program table

Frankly, by now I'm starting to find myself at a bit of a loose end so I send him a link to this blog post in draft too. He provides some interesting feedback:

Epoch was THE processor. They had a couple of linux boxes sitting under a desk which were running the majority of transactions in the industry. Where you end up with multiple card processes is when the VISA rules start fining for chargeback ratios and processors start scrubbing cards aggressively. It was discovered that you could increase sales 25% by having multiple redundant processors in a chain, if one fails to accept the card you fall back to the next.

This is all very interesting in terms of the mechanics of how these sites work, but it's not getting me any closer to the source. (Interestingly, I've just finished reading Brian Krebs' Spam Nation and the whole premise of chargebacks is one the underground pharma industry had big issues with too.)

I decide the next step is to simply approach Eroticy about it. No, I'm not confident the data is from them but it's got their name on it and they're represented as having been hacked on other breach info websites so they've got a vested interest in investigating it. I plug this into their contact form:

Hi, my name is Troy Hunt, I'm an independent security researcher and you can find me at https://www.troyhunt.com

Recently someone sent me data which was allegedly hacked from the Eroticy website. I'd like to draw this to your attention and provide you with information to help verify if indeed you've had a security breach. Could someone in a security or technical capacity please get in touch with me and I'll share as much as I can.

And that was successfully sent off:

[Image]

And successfully received:

[Image]

So now it's just a matter of waiting because short of their response, there's really nothing more I can do at this stage.

Except they didn't respond. I sent that off on the 2nd of Jan and 8 days later as I write this, there's nothing. Nothing in the inbox, nothing in the junk mail, nothing at all. In a way, this is kind of the end of the road in that there's not much more I can do. Yet on the other hand, what everything up until this point has demonstrated is that almost 1.4M people have their data floating around out there - their legit data - albeit with uncertainty as to the source.

Short of investing copious amounts of further time trawling through the data, there's one more avenue available here and it's a last resort, but it's worked in the past. I'm going to publish the data to HIBP and flag it as both "sensitive" (so it can't be publicly searched) and "unverified". I'm going to put Eroticy's name on it but clearly explain in the description of the breach that I couldn't verify them as the source, albeit that the data itself is accurate. Then I'll link through to this blog post so that people can get the whole story.

Here's what I'm hoping to achieve from this:

  1. People who find themselves in the data will be aware their info is circulating. Even without being able to confidently identify the source, this still gets those people thinking about where they're reusing passwords, how much info they're sharing publicly and other general measures they should be taking to protect themselves (e.g. not using their real email address on adult websites).
  2. This will be brought to Eroticy's attention and I'm more likely to get a response. At this stage, I can neither prove nor disprove that the data came from them. However, it's their name on the breach and you've got various data breach websites reporting that they've had an incident so it's in their best interests to take a position on this.
  3. By virtue of sharing as much information as I have here (yet obviously protecting the identities of those involved), I'm hoping that someone pops up and properly identifies the source of this. Maybe it'll be someone who worked on the system, ran an affiliate or even the person who originally bundled it all together and called it the Eroticy data breach.

I took a similar approach with Regpack a few months ago. The publicity the post garnered resulted in it taking all of a day and a half before they properly investigated and admitted that they'd been the source of the breach. I'm not sure if we'll see that in this case or not, but it will certainly get more eyeballs on the issue.

As of now, the data is searchable in HIBP but again, it's flagged as "sensitive". You cannot search for an email address via the public interface and get a hit on Eroticy, instead you need to use the (free) notification service which will send a verification email to the address and ensure the information is only visible to the owner of the address.

This is a long post that if nothing else, I hope demonstrates how abstract many of these data breaches can be and how much effort can go into properly verifying them. This was done over a period of more than 3 months in total, in part due to waiting on responses from people and in part due to having to fit a fairly arduous process into a busy schedule. If you've got ideas on the source, please leave your comments below and I'll add any noteworthy updates to the bottom of the post if and when they occur.


I'm With Her

6 Comments and 23 Shares
We can do this.
Do not feed the trolls
2812 days ago
5 public comments
2814 days ago
I unabashedly love this. I love it. Thank you, Randall.
Greater Bostonia
2814 days ago
New York, NY
2814 days ago
2815 days ago
I didn't know Randall is pro-War. Saddening, actually.
2815 days ago
No, it's saddening that both parties put pro-war candidates on the ballot. I'm voting Johnson, but if you insist on propping up the two-party duopoly, then Clinton is the least bad choice.
2814 days ago
Politics aside, I'm surprised to see him do politics. These are extraordinary times.
2814 days ago
It was cool when she said, "I would bomb the shit out of 'em. I would just bomb those suckers. That's right. I'd blow up the pipes. ... I'd blow up every single inch. There would be nothing left."
2814 days ago
You surely don't actually believe that Hillary is pro-war and Trump is pro-peace, right? So are you disappointed that he's not supporting Stein or Johnson, or what? (And stevetursi: I have given up on people this election figuring out sarcasm, so for others in this thread, he's quoting Trump.)
2814 days ago
@wreichard Yes. This is an artist truly putting his money where his mouth is. Even though I disagree with him (and am not a fan of interrupting comedy with a serious political PSA) I still respect his courage in speaking out in a truly extraordinary political cycle.
2814 days ago
As a German, I couldn't care less which of the two fascist idiots will start the next war. I'm stunned about the weird American politics, suggesting that one of those two fascists "must" be the next President. That's all. But yes, if I had the right to vote in the U.S., I'd surely vote for Stein.
2814 days ago
That's admirable, and since I'm not a swing state voter I can do something like that. I'm not a fan of Clinton, but my priority right now is to send Trump back to the shit-infested hellhole he came from, and if that means making her president for the next four years, so be it.
2814 days ago
It's not about being pro-War, it's about being anti-Trump. An actual, literal fascist augmented by dangerous libertarian rhetoric.
2814 days ago
Being "anti-Trump" is a bad excuse to vote for the war.
2814 days ago
No, it isn't. It's not an excuse, it's a logical choice. Trump is demonstrably unstable and his election will rally all those who listen to his dog-whistles as being acceptable. There is literally no reason to vote for Trump unless you are a misinformed, (un)willingly bigoted, culturally- and economically-protected white person. And the latter goes for Johnson, as your votes will do nothing but pat your own back at the expense of those who would be killed under a Trump regime.
2814 days ago
At least Trump was not an alleged (?!) part of a certain child sex ring. -- What makes you think President Trump would be responsible for deaths and President Clinton would not?
2814 days ago
You are German. If you do not understand the difference between Trump, who is stoking xenophobia and racism as a scapegoat for a working class that feels disenfranchised and is moved to violence, and Hillary, all I can tell you is your knowledge of your own post-WWI history is sadly lacking. (And Trump *is* actually accused of raping a 13-year-old, so you appear to have your child sex stories backwards.)
2814 days ago
Point of correction: Trump is not libertarian, and neither is his rhetoric. He more closely resembles an authoritarian, which is the opposite of libertarian.
2814 days ago
Trump is literally going to pretrial in December for child rape. There is nothing substantial to the Clinton version. Trump is loudly and enthusiastically fanning the violent flames of racism. Disenfranchised Americans will die.
2814 days ago
I think at this point it's clear that our German friend is demonstrably ill-informed and would be advised to have him do his homework before continuing to engage.
2814 days ago
@stevetursi Trump's libertarian enough to get Peter Thiel's endorsement. And I was just pondering if he was actually a Troll For Hire.
2814 days ago
Clinton already showed her will to lie in order to justify a war when she was a Minister. Trump never was a Minister. Where and when exactly did Trump suggest to send military anywhere to solve a problem?
2814 days ago
I get it, you're Clinton fans. Because she's a woman or something. Child rape, lies and lust for military intervention are only bad when you're Trump. Yes, I fail to see the logic here - and that's not because I'm German.
2814 days ago
Nordic: Having Thiel's backing does not a libertarian make. Cthu: Again, suggest you do your homework. Answers to all your questions are easy to find.
2814 days ago
Steve: My point is that he has overlap. :)
2814 days ago
I wouldn't dispute that. It's called nuance. Everybody has a little nuance. (:
2814 days ago
And here goes my trust in the newsblur comment sections.
2814 days ago
Just came here to say this is all crap. Thanks, Randall, for ruining one of the few last bastions of apolitics.
2814 days ago
This is saddening. I too consider Him to be so contemptible that even Her is the lesser evil. But this is not a place for electioneering.
2814 days ago
Yes, how very dare Randall express his opinion in his comic, that he draws, and you read for free. What a monster he is. Allow me to clutch my pearls in shock.
2814 days ago
"suggesting that one of those two fascists "must" be the next President. That's all. But yes, if I had the right to vote in the U.S., I'd surely vote for Stein." You'd do well to study the electoral college a bit. The US system pretty much enforces a two-party race for President because any prospective candidate _must_ get more than half the electoral college votes. There are 538 of them, so the winner must get 270. Third parties, when they get enough support to be meaningful, siphon off votes from the party they most closely resemble. Not everyone who votes for Clinton this year will be a "fan" but we recognize the mathematics of the electoral process. One of those two people WILL be president next January. Jill Stein will not. Given that choice, I'm With Her, just like Randall
2815 days ago
We can do this.

Dilbert by Scott Adams - 27 May 2015


Classic: Just How Many People Does it Take to Stop a Rogue Concrete Buffer?


Submitted by: (via Spencer Laboda)



13 Comments and 28 Shares
This image stays roughly in sync with the day (assuming the Earth continues spinning). Shortcut: xkcd.com/now
11 public comments
3795 days ago
Pretty cool.
3797 days ago
I want a wall-sized real-life version of this where the time circle rotates and I can put a dot on my current location. Also, a puppy.
Saint Paul, MN, USA
3796 days ago
I have a clock like that.
3800 days ago
all these comments and no criticism of the projection used?
Earth, Sol system, Western spiral arm
3800 days ago
I also thought about that...
3800 days ago
Why would there be? The Azimuthal Equidistant Projection is perfect for displaying approximate timezones. I guess some people might have a problem with it being centered on the South Pole, but they are just showing a Northern Hemisphere bias.
3800 days ago
On the contrary, being centered on the South Pole is a Northern Hemisphere bias. It shows Europe bigger than South America and as big as Africa... Interesting insight about projections in The West Wing: http://www.youtube.com/watch?v=vVX-PrBRtTY
3800 days ago
It also shows them ridiculously distorted in a manner unusual to those familiar with the typical cylindrical and pseudo-cylindrical projections. Size isn't everything. What is easier to pick out? Uruguay or Germany?
3798 days ago
This is the TIME CUBE
3798 days ago
Bottom-up is always good....
3800 days ago
3800 days ago
Atlanta, GA
3800 days ago
Alt text: " This image stays roughly in sync with the day (assuming the Earth continues spinning). Shortcut: xkcd.com/now"
3800 days ago
There should be an award for great ideas like this!
3800 days ago
:) Print it out
3800 days ago
Can I have your ePaper printer ;-) ...or I'll stick it on the hour hand. Cool, now I get an award!
3800 days ago
Yes, powerful.
Barcelona, Catalonia, Spain
3800 days ago
3800 days ago
This post appears right when I'm still at work after midnight. x_x
Colorado Plateau
3800 days ago
3800 days ago
Why is there not an app for this that looks like this?
Denver, CO
3800 days ago
It's called a "globe."

With great Azure VM comes great responsibility (which is why you really want an Azure Web Site)


I’ve had a recurring discussion with a number of well-meaning people (WMPs) recently which has gone kind of like this:

WMP: We’re going to build you a web site and we’re going to use Azure.

Me: Awesome! So you’d use an Azure Web Site service then?

WMP: No, even better, we’re going to use an Azure VM!

Me: Ok, so why’s that?

WMP: Because it gives us more power and control. Power and control are good.

Me: Righto, so what are you going to do with that power and control that you can’t do with a web site?

WMP: Because it gives us more power and control!

Me: Sorry, maybe you misunderstood – what’s the benefit you gain in having an operating system at your disposal?

WMP: Because, uh, reasons…

In short, they want a VM because that’s just what you do when you stand up a web site – you get the biggest damn thing you have the most control over as possible and crank it up to 11 from day one. You can then RDP in and move files around and look at event logs and do a whole bunch of stuff that you think you want to do, but that you really don’t.

In near on every single case I see, a virtual machine is exactly what you don’t want. It’s not just as I imply from the title insofar as great power giving you great responsibility (although I will come back to that), in fact there are many, many reasons why a web site should be your default preference and you lose many, many things by having a VM. Let me walk you through it.

How to: Deploying a web site on a VM

Let’s start with the first problem – see if you can spot the virtual machine running IIS in the wizard for adding an Azure resource:

Azure wizard

See it? No, of course you don’t because it doesn’t exist. You’re on your own here which means that you’re going to be building up an instance of IIS all by yourself. You’d better know what you’re doing on that front too; Microsoft’s web server has plenty of nuances in how it should be optimally configured and if you’re the guy building the web site, chances are you aren’t the guy who deeply understands IIS.

But let’s persevere with this anyway so I’ll create a small VM called, oh I don’t know, let’s say “YouDontNeedAVM” and I’ll use a Windows Server 2012 R2 image. And now we wait. And wait. And wait. In the scheme of creating an entire operating system, somewhere in the order of 10 minutes is actually very good but an Azure Web Site is near instantaneous. Of course this isn’t really an issue in the scheme of a resource that will be around for years to come, but it’s reflective of the scale of the resources you’re creating here – you’re getting a whole freaking operating system for that one little web site!

So machine created, now we can remote in:

RDP'ing into the VM

Ah, feel the power! I’m revelling in the control I have over this machine that is there purely to do my bidding, let’s start exercising that control and creating all the things you need to make the web site work. The first and most crucial thing is IIS and you are 100% responsible for installing, configuring and managing it so you’d better know what you’re doing and be competent in an infrastructure operations role.

Let’s proceed anyway and I’ll begin by adding a server role which will be IIS:

Adding an IIS role

Moving on all we need to do now is select from a simple list of options:

The options list for installing IIS

Of course what you see here is wrong, at least for your needs. You’re probably going to want ASP.NET support but then you really don’t want directory browsing and dynamic content compression is always nice but then you’re missing the management service for remote admin (because you’re hardcore and really just want to do everything over RDP) and so on and so forth. The point I’m trying to make is that there are a lot of moving parts in IIS and you want to be very confident that you know how to manage them all.

Let’s just run with the defaults for now and that gets IIS up and running in a basic configuration. The next thing you’ll need is an endpoint so that people can actually access it over port 80. You see by default, a VM can only be accessed via PowerShell or RDP which is why there are only two endpoints configured in the management portal:

Default endpoints for a VM

Let’s add an endpoint:

Add an endpoint wizard

And of course we want to enable HTTP over port 80:

Specifying port 80

And eventually…

Default IIS website running on the VM

It’s alive! Now for all the things that are already wrong with this installation: You can’t FTP to it because we have to install that service (although we could have selected it from above) but then you still can’t FTP to it because there’s no endpoint configured for port 21 so you need to configure that but then you shouldn’t be FTP’ing in the clear so you really need port 22 for SFTP but of course you really shouldn’t be deploying your web site by FTP anyway. Ok, so you can configure Web Deploy which is the correct approach but then you also need to add an endpoint for port 8172. Oh – if you want SSL you’ll need an endpoint for 443 as well plus you’ll be doing all the certificate installation and configuration by hand because there’s no portal to automate it for you. Then you want to think about whether you really want everything just dumped on the C drive as space is limited and you typically want to think about dropping it on a data drive which you can do, but it means heading back to the Azure portal and creating a new disk after which you’ll need to head back into the VM (because you love RDP) and mount it appropriately then reconfigure IIS to point to a path on that disk. Don’t forget to get all those permissions spot on as well – too strict and nothing works, too loose and you’re going to have bigger problems than just Web Sites versus VMs.
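To put that laundry list in one place, here’s a rough Python sketch that compares the endpoints a fresh VM starts with against the ones a working web site ends up needing. It isn’t anything Azure gives you; the RDP and PowerShell port numbers (3389 and 5986), the labels and the function name are all assumptions for illustration, while the remaining ports are the ones mentioned above.

```python
# Endpoints a new VM has by default. The port numbers here are assumed
# private ports for RDP and remote PowerShell, used for illustration only.
DEFAULT_ENDPOINTS = {"RDP": 3389, "PowerShell": 5986}

# Endpoints the walkthrough above ends up needing for a real web site.
REQUIRED_FOR_WEB_SITE = {
    "HTTP": 80,
    "HTTPS": 443,
    "Web Deploy": 8172,
}


def missing_endpoints(configured, required):
    """Return the names of required endpoints not yet configured, ordered by port."""
    configured_ports = set(configured.values())
    missing = [name for name, port in required.items() if port not in configured_ports]
    return sorted(missing, key=lambda name: required[name])


if __name__ == "__main__":
    for name in missing_endpoints(DEFAULT_ENDPOINTS, REQUIRED_FOR_WEB_SITE):
        print(f"{name}: port {REQUIRED_FOR_WEB_SITE[name]} still needs an endpoint")
```

Every entry that comes back is another thing you have to add, secure and keep track of by hand.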

To some extent, when you stand up a web site on an Azure VM, you become the hosting provider. What I mean by that is that it’s you that needs to take responsibility for the virtualised infrastructure. You need to be a proficient server administrator to get all of this right and you really do have to ask the question: Is this my expertise? Is it the best use of my time to add this responsibility? Almost always, no, it’s not but we’re only just getting warmed up, let’s talk about patches.

Because everybody loves managing server patches, right?

When you take a new VM from the wizard I showed earlier, you get a fully patched and up to date image which is nice. One month from now it is no longer nice – it’s unpatched and vulnerable as another Patch Tuesday has passed. Of course it may be that the second Tuesday of the month has passed without much to talk about, but it may also be that immediately after you have Exploit Wednesday.

Having a VM is like having a kid – you can’t just stand one up and forget about it. Actually it’s not like having a kid because kids grow up and become responsible for themselves but you’ll always be patching and maintaining your VM and that’s an activity you’ll have to actively perform at least once every month. This is not one of those “yeah but it’s academic and not that important” issues, this is one of those “do it or expect serious problems” issues.

Let me give you two cases in point: End of the year 2011 and Hash DoS hits ASP.NET apps. This was very nasty, easily exploitable and Microsoft quickly released critical patch MS11-100 out of band of the usual Patch Tuesday schedule. In fact they released it on December 29 which means you, dear server administrator (because that’s what you are now), needed to take time out of your Xmas holidays and patch vulnerable machines which would include any Azure VMs.
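If you want to see just how far outside the normal schedule that was, the second Tuesday of a month is easy to work out. This is a small sketch using Python’s standard `calendar` module; the function names are made up for illustration.

```python
import calendar
from datetime import date


def patch_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    # monthcalendar() returns weeks as lists of day numbers (Monday first),
    # with 0 standing in for days that fall outside the month.
    tuesdays = [
        week[calendar.TUESDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.TUESDAY] != 0
    ]
    return date(year, month, tuesdays[1])


def is_out_of_band(release_date):
    """True when a release did not land on that month's Patch Tuesday."""
    return release_date != patch_tuesday(release_date.year, release_date.month)


if __name__ == "__main__":
    print(patch_tuesday(2011, 12))
    print(is_out_of_band(date(2011, 12, 29)))
```

For December 2011 the regular slot had already passed a couple of weeks before the 29th, which is exactly why the fix couldn’t wait for the next one.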

Another similar case was the Padding Oracle exploit in 2010. This was a very similar story insofar as it was a critical risk within the ASP.NET framework that was readily exploitable and quickly patched – out of band again. This is what all that power some people lust after in their own VM gets you – a huge amount of responsibility. And these are only a couple of ASP.NET examples, there’s all the various RDP patches and others that plugged the sort of serious holes you expect to find in a large scale operating system over time.

Of course the other problem this gets you is how are you going to test these patches? Remember, when you have your own bespoke server configuration that inevitably begins to deviate from the off-the-shelf installation that Microsoft gave you, you’re increasing the risk of future updates going wrong. In all of the cases I know of where you have professional teams managing infrastructure, they’re installing patches in dedicated testing environments, making sure things play nice then pushing them out. In other words, are you happy to maintain a mirror copy of the VM just to make sure patches aren’t breaking your things? Or is this your testing strategy:

I don't always test my code, but when I do I do it in production

I don’t want to berate Azure VMs per se because they’re excellent for certain purposes. For example, I made great use of one for some very short term resource intensive data processing recently. I also use one for the import of data breaches on my “Have I been pwned?” project (HIBP) and it makes enormous sense in both these cases. They’re also both things I simply can’t do any other way and in both cases I start the VM, get in there and do what I need to do then get out again. I certainly don’t want the responsibility of maintaining a VM environment the public hits 24x7.

Now you may wonder – “Don’t I still need to test my web apps deployed to an Azure Web Site after server patches anyway?” – and that’s a fair question. Let me draw on an old cloud paradigm analogy to try and explain this: we all have access to electricity which we readily consume by plugging our things into the wall. The electricity is the service and we get that from the likes of Energex in Australia or other providers in other locations. How they deliver that service to my wall socket is not my concern so long as I get a constant flow of the stuff (tree-hugging hippies may disagree but you can see my point). Their power stations will undergo maintenance and their wires will be replaced and all sorts of bits required to deliver the service will change, but the service itself remains the same.

When you have an Azure Web Site, you have a service guarantee from Microsoft that your little slice of their infrastructure will remain stable and indeed they have SLAs in place to very clearly set expectations. You don’t need to proverbially test that your kettle still works when Energex updates facets of their power station operations any more than you need to test that your web sites still work when Microsoft applies updates to the underlying infrastructure. New features will be added to Azure as the platform evolves and inevitably we’ll see support for things like new language versions, but they’re not about to, say, yoink PHP support or break your auth implementation and if that did happen on the Azure Web Site platform, there’s going to be a whole heap of other people jumping up and down very quickly because they’re all sitting on exactly the same service. When your mind is back in the old paradigm of managing your own server and being responsible for everything as you are with an Azure VM, this can be difficult to get your head around but it shouldn’t be – it’s actually all very simple now in that it just works!

Incidentally, there are facilities within the Azure scale out model (which I’ll talk about shortly) which provide high-availability options by deploying assets across multiple “Update Domains” such that when the underlying service is updated it happens domain by domain and load can be shuffled from one to the other without the service needing to go down during updates. Have a read of this piece by Mark Russinovich (emphasis mine):

A key attribute of Windows Azure is its PaaS scale-out compute model. When you use one of the stateless virtual machine types in your Cloud Service, whether Web or Worker, you can easily scale-up and scale-down the role just by updating the instance count of the role in your Cloud Service’s configuration. The FC does all the work automatically to create new virtual machines when you scale out and to shut down virtual machines and remove when you scale down.

What makes Windows Azure’s scale-out model unique, though, is the fact that it makes high-availability a core part of the model. The FC defines a concept called Update Domains (UDs) that it uses to ensure a role is available throughout planned updates that cause instances to restart, whether they are updates to the role applied by the owner of the Cloud Service, like a role code update, or updates to the host that involve a server reboot, like a host OS update. The FC’s guarantee is that no planned update will cause instances from different UDs to be offline at the same time.

Mark talks about PaaS – Platform as a Service (such as Azure Web Sites) – which is fundamentally different to IaaS – Infrastructure as a Service (such as a dedicated Azure VM). When you’re getting a platform such as an IIS web site, there are all sorts of tricks that can be done underneath such as the one quoted above. When all you have is a single virtual machine, many of these options are completely off the cards.
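The Update Domain guarantee in that quote can be sketched in a few lines of Python. This is only an illustration of the idea and not how the Fabric Controller actually works: the round-robin placement, the function names and the instance names are all assumptions.

```python
from collections import defaultdict


def assign_update_domains(instances, ud_count):
    """Spread instances round-robin across update domains (an assumed placement policy)."""
    domains = defaultdict(list)
    for index, instance in enumerate(instances):
        domains[index % ud_count].append(instance)
    return dict(domains)


def rolling_update(domains):
    """Yield (offline, online) instance lists, taking down one update domain at a time."""
    for ud, offline in sorted(domains.items()):
        online = [
            instance
            for other, members in sorted(domains.items())
            if other != ud
            for instance in members
        ]
        yield offline, online


if __name__ == "__main__":
    domains = assign_update_domains(["web0", "web1", "web2", "web3"], ud_count=2)
    for offline, online in rolling_update(domains):
        print(f"updating {offline}, still serving from {online}")
```

With a single standalone VM there is only ever one "domain", so the online list is empty while it reboots and the site goes down with it.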

A VM is not just for Xmas

When you create a new standalone VM as we did earlier on, you have a state of the art, entirely modern operating system. When you create a new Azure Web Site, you have a state of the art, entirely modern web hosting platform.

Fast forward three years and what do you have? If it’s an Azure Web Site, well, you still have a state of the art, modern web hosting platform that has had three years’ worth of new features and automatic upgrades from Microsoft. The platform – the service – has evolved over time and you’ve benefitted from all the new shiny bits.

However, if it’s an Azure VM you manage yourself then you still have exactly what you created three years ago except now you’re on a 5 year old OS (this is the 2012 image, of course, albeit the R2 edition). Any new features or fancy bits will only be the things that you have added yourself as part of a general server maintenance and upgrade regime (which, of course, you’ve carefully tested in a non-production environment). It will also be at least one generation out of date, maybe two given the rate at which operating systems are revving these days.

Perhaps three years is too short – maybe think along the lines of 5 and in this era with the rate of change of modern software, that’s a lot. At some point you’re going to have to move off that platform which, if you really are wedded to the idea of managing your own OS, means installing an all new one and setting it up from scratch. Again. Trust me, I know how little people think about events years in advance but I also know how debilitating it can be when you’re on an old OS on which you’ve created all these dependencies but just can’t find the time or justify the effort to move forward onto something modern.

The point I’m making is that using a dedicated VM and managing your own OS is a long-term affair, certainly compared to an Azure Web Site. The friction of change is high (again, that’s a relative term) and that VM is going to be your very own needy child until you get to the point where you’re ready to retire it.

You risk creating machine dependencies

Let’s move onto the web app itself and talk about creating dependencies on the machine. I see this one happen all the time and it’s by no means specific to Azure’s offerings. Let me give you the most classic, common and scary scenario – Microsoft Office. Yes, it’s true, some people really do create web apps that simply won’t work unless Microsoft Office is installed. I’ve seen all sorts of nastiness related to COM dependencies and Office prompts appearing on unintended server screens and it’s just a whole world of pain. People create web apps with dependencies on Office “because they can”.

One of the big problems this creates is that the app is no longer transportable; you can’t just pick it up and whack it on any old IIS web site. Now you may say “but I don’t want to just whack it on any old IIS web site” and that may be true – today. The problem, however, is when you have an entire machine at your disposal the temptation is often to do things that by any measure – VM or no VM – are very poor coding practices. Managing your own VM gives you the ability to do this.

Of course what you really want is a bin-deployable solution like, say OpenXML if what you really wanted to do with Office on the server is just create Excel documents. Or maybe iTextSharp if you were installing that PDF app locally just to generate some docs. There are usually free tools that are publishable with the app and run anywhere. If you’re not using libraries like this, take a good hard look at your app because there’s a good chance you’re doing it wrong.

Of course if you’re not creating these dependencies, then you don’t need a VM for installing things at will now, do you?

There’s no management portal for web sites on a self-managed VM

There are many, many things you get with the Azure Web Site portal that otherwise make life hard when you just run IIS in a VM. For example, how are you going to manage your connection strings? Please don’t tell me you’re just going to whack your plain text credentials into source control! Sensitive data doesn’t belong in source control, it’s up to the build and release process to manage things like connection strings or API keys or anything else you want to keep private.

Now of course you can still do this with a VM, just use a build server like TeamCity and parameterise the builds to apply the correct settings. Or you can just do this in an Azure Web Site:

Managing connection strings in an Azure Web Site

On deployment, Azure will automatically transform each of the connection strings in my HIBP web site and apply the correct server name, username and password. I don’t need to store them in source control and indeed the ones in my web.config point to local instances of each of these and they’re just automatically transformed on release. You get that for free in an Azure Web Site.
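The same principle, keeping production credentials out of source control and letting the platform supply them, can be sketched for an app that reads its settings from environment variables. The prefixes below are the ones Azure is documented to use when it exposes portal connection strings as environment variables, but treat that detail, the function name and the connection string name as assumptions rather than a description of how my site is wired up.

```python
import os

# Prefixes Azure is documented to put in front of portal-defined connection
# strings when exposing them as environment variables (assumed here).
AZURE_PREFIXES = ("SQLAZURECONNSTR_", "SQLCONNSTR_", "MYSQLCONNSTR_", "CUSTOMCONNSTR_")


def get_connection_string(name, local_default, environ=None):
    """Prefer a platform-supplied connection string, else fall back to the local one."""
    environ = os.environ if environ is None else environ
    for prefix in AZURE_PREFIXES:
        value = environ.get(prefix + name)
        if value:
            return value
    # Nothing supplied by the platform: use the harmless local development value,
    # which is the only one that ever needs to live in source control.
    return local_default


if __name__ == "__main__":
    # "ExampleDb" is a hypothetical connection string name.
    print(get_connection_string("ExampleDb", "Server=localhost;Database=Example"))
```

On a dev machine this hands back the local value; in the hosted site the value entered in the portal wins without the code or config ever containing it.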

Actually, the whole connection string management thing becomes even easier when you have a linked resource such as a database:

Database linked resource

This creates an association between the two and automatically creates the connection string you see above. Neat.

We’re only just getting started, what about things like managing all your domain names for the site:

Managing domain names

Sure, you can do that inside IIS and manage your own bindings manually, but nothing is easier than just doing it via the browser.

Or how about managing your site and app diagnostics all via the browser:

Site diagnostic settings in the portal

Application diagnostic settings in the portal

Again, it’s doable by RDP’ing into the machine but it’s a whole lot easier via the Management Portal where everything is in one place.

That’s just scratching the surface though, let’s move into the powerful stuff.

You’re probably not going to be deploying from GitHub

In the beginning, we deployed web sites by FTP and all was good. Except when it wasn’t. For example, parts of the web site were deployed but other parts that the deployed parts needed weren’t there. FTP was a great enabler of fragmented deploys and broken sites because let’s face it, it just doesn’t make sense to copy up every single file from an entire project on every release.

So we got Web Deploy and if you weren’t using this to publish your ASP.NET web sites, you were deploying it wrong. This gave us an assurance that every time a web site was deployed, the whole thing was neatly bundled up, compressed, transported then published into the site. Extraneous files on the target could also be removed to ensure everything actually synced up. You could then deploy the whole lot from Visual Studio which was fine, but as per the article in the link above, what you really want is automated deployments from source control. In Azure Web Sites, you get this for free.

Here’s what you get on your dashboard (the dashboard is the thing you don’t get for web sites running on an Azure VM you manage yourself):

Dashboard link to deploy from source control

So what does deploying from source control mean? It means any one of these things:

Different sources you can deploy from

Off the screen are Bitbucket, CodePlex and “External repository”. Hey, is that “Dropbox” in the screen above?! Yep, you drop files into your synced Dropbox folder on your PC, magic happens then they are live on your web site moments later. But the one that will be most relevant to the most people is GitHub and when you select that option you’ll be asked to authenticate to their service after which your repositories will be neatly listed for you to choose from:

List of GitHub repositories

You can also enter the branch you’d like to deploy from so if you want to keep pushing stuff to master but don’t want it going live every time, you can, say, create a “deploy” branch and just merge into there each time you want to release.

Now, why is this important and why is it awesome? Because it means that we can go into the deployments section of the web site dashboard at any time and see this:

Deployment history for the website

If I click on the last deployment (the one at the top), I can see the steps taken in order to execute that deployment:

Steps in the deployment process

I can even view the log for the deployment command which will show me the entire output for the process:

Deployment output

But let’s stop for a moment and talk about why this is awesome before looking at more nuts and bolts. When an Azure Web Site deploys from a source control system, the process is pulling down just the files it needs from source control – it’s an update. The files it needs will be those that have changed since the process last ran because it has a copy of the last version locally. It’s no different to working in a team and pulling changes from a central repository. What all this means is that it’s extremely fast; commit to source control, Azure sees the change, pulls down just what it needs to then builds and deploys it to the web site running in the same data centre. The long and the short of it is that I can push a change from my local machine and have it live on the web site in less than 60 seconds. But wait, there’s more…
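To make the "only what changed" idea concrete, here’s a minimal Python sketch that works out which files differ between what’s already deployed and what’s just been committed. It’s an illustration of the principle only; source control systems and the Azure deployment engine have their own mechanisms, and the function names and file paths here are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """Fingerprint a file's content so identical files compare equal cheaply."""
    return hashlib.sha256(content).hexdigest()


def files_to_update(deployed, incoming):
    """Compare two {path: bytes} snapshots and return (changed_or_new, removed) paths."""
    changed = sorted(
        path
        for path, content in incoming.items()
        if path not in deployed or digest(deployed[path]) != digest(content)
    )
    removed = sorted(path for path in deployed if path not in incoming)
    return changed, removed


if __name__ == "__main__":
    deployed = {"index.html": b"<h1>v1</h1>", "site.css": b"body{}", "old.js": b"//"}
    incoming = {"index.html": b"<h1>v2</h1>", "site.css": b"body{}", "new.js": b"//"}
    print(files_to_update(deployed, incoming))
```

Unchanged files never move, which is a big part of why a push can be live in under a minute.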

The advantages fast deployments give you are numerous: you can release features at a much faster rate, you can release fixes at a much faster rate and what’s more, you have the assurance that the release process is automatable and repeatable plus you can be confident that you have all the code required for the release securely under source control. But the other major thing you can do is roll back a change.

A few images up you can see three deployments and the middle one has a green tick and the words “Active deployment” next to it. Chronologically, this deployment originally came before the one at the top yet it’s presently the active one. Why? Because I screwed up. When I released the change titled “Implemented caching on list of breaches” I fairly comprehensively broke the site. Now I could have gone off and tried to figure out what the hell I’d done, implemented a fix, tested it then pushed the change but what I did instead was this:

Redeploying an old build

I simply redeployed the last deployment. This is enormously powerful as it enables you to very quickly save the site from yourself. Because it’s just sitting there as an action in the management portal I can be out and about, realise I’ve screwed up then simply whip out the iPhone and push the old version.
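Conceptually, the deployment history is just an ordered list with a pointer to whichever entry is active, and a rollback moves the pointer rather than rebuilding anything. Here’s a small Python sketch of that model; the class and the first two commit messages are hypothetical and this is not how the portal is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentHistory:
    """An ordered list of deployments (oldest first) plus an 'active' pointer."""

    deployments: list = field(default_factory=list)
    active: int = -1

    def deploy(self, commit_message):
        """A new deployment goes to the end of the history and becomes active."""
        self.deployments.append(commit_message)
        self.active = len(self.deployments) - 1

    def redeploy(self, index):
        """Roll back (or forward) by making an existing deployment active again."""
        if not 0 <= index < len(self.deployments):
            raise IndexError("no such deployment")
        self.active = index

    def active_deployment(self):
        return self.deployments[self.active]


if __name__ == "__main__":
    history = DeploymentHistory()
    history.deploy("Earlier change")             # hypothetical message
    history.deploy("Last known good change")     # hypothetical message
    history.deploy("Implemented caching on list of breaches")
    history.redeploy(1)                          # the broken release stays in the history
    print(history.active_deployment())
```

The broken release is still there at the top of the list, it just isn’t the one serving traffic any more.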

You do not get this in a VM. Of course you can always go and set up your own build server and configure everything manually but it’s no small task (there’s a reason I broke my TeamCity guide into 5 separate parts). You can also go and grab all the Kudu bits and try to recreate this yourself if you really want to but again, it’s not simple, at least not compared to the process above.

If you don’t use the automated deployment process within Azure Web Sites, ask yourself this: How easy is it to deploy? How long does it take? Do you have absolute confidence all the code is in source control? Can anyone with the appropriate rights easily and reliably reproduce the release process? What’s your rollback strategy?

You can’t scale out

You know how you get “infinite scale” in the cloud? Well that’s the promise anyway even though it doesn’t always work that way. However, some mechanisms of scale work better than others and there are two primary means of scaling that you need to understand:

Scaling up: When you scale up, you add more resources to the instance you already have. In the olden days, this meant adding more RAM or dropping in another CPU – you still have one machine, but it’s got a lot more grunt. We can do that very easily with a VM and it looks just like this in the Management Portal:

Scaling up a VM

Choose a larger size, commit to the additional cost, let the machine reboot and take everyone offline then hey presto – more power! You can do exactly the same thing with a Standard Web Site too because underneath the mechanics of it, you’ve still just got a VM that Microsoft fully maintains for you. But there’s another trick you can do with Azure Web Sites…

Scaling out: When you scale out, you add more instances of the underlying architecture rather than just pumping up the power of what you already have. In a web world, this means requests then get load-balanced between instances thus lightening the load on each individual node (Azure Web Sites then use affinity cookies to make the sessions “sticky”).

So why is scaling out so awesome? Because you can autoscale:

Autoscale settings

You’re looking at a bunch of stuff here so let me walk you through the key points:

  1. This web site is running on a “Small” VM which is the entry level for a Standard Azure Web Site (there’s no “Tiny” yet like you can get on a standalone VM but word is it may be coming…)
  2. My scale settings do not have a scheduled time which means the next bits I’m about to describe can happen whenever they need to. I could say that I only want to scale out at certain times of the day, week or even within a specific start and end date such as during end of month processing.
  3. My “Scale by metric” is CPU which means that autoscale is going to occur based on conditions related to the processor in the VM. I could set that to “None” and just manually specify the number of instances I’d like.
  4. The instances graph shows I’m currently using just one instance and have been for the last week.
  5. The instance count specifies a range of between one and three which means I’ll never have less than one VM (obviously) and importantly to my bottom line, I’ll never have more than 3.
  6. The target CPU is between 60 and 80 per cent which means that when the CPU goes over 80 across all the existing instances it gets another instance added then when it drops below 60 it gets one removed.
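
The rules in that list boil down to a very small decision function. This Python sketch uses the numbers from the settings above (one to three instances, 60 to 80 per cent target CPU); the function and parameter names are made up, and the real autoscale engine will have its own sampling and cool-down behaviour that isn’t modelled here.

```python
def next_instance_count(current, avg_cpu, min_instances=1, max_instances=3,
                        scale_in_below=60, scale_out_above=80):
    """Decide how many instances to run given the average CPU across instances."""
    if avg_cpu > scale_out_above:
        desired = current + 1      # too hot: add an instance
    elif avg_cpu < scale_in_below:
        desired = current - 1      # mostly idle: remove an instance
    else:
        desired = current          # inside the target band: leave it alone
    # Never go below the floor or above the cost ceiling.
    return max(min_instances, min(max_instances, desired))


if __name__ == "__main__":
    print(next_instance_count(current=1, avg_cpu=95))
    print(next_instance_count(current=3, avg_cpu=95))
    print(next_instance_count(current=2, avg_cpu=40))
```

The clamp on the last line is the bit that protects the bottom line: no matter how hot the CPU runs, the count never exceeds the maximum.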

This, is awesome. I can be asleep in the middle of the night and the site ends up on the front page of every newspaper and my scale will triple without lifting a finger. I’ll pay for it, obviously, but my cost ceiling is limited as I’m not allowing more than 3 instances to be autoscaled. I can take that all the way up to 10 if I like and I can also scale the VM up all the way to a “Large” instance with 4 cores and 7GB of memory. Between scaling up and scaling out I can get significantly more power than a single VM which can only scale up and when those instances are added (or subtracted), nothing goes offline, the load just gets distributed across a different number of nodes. Think back to Mark’s comment earlier on about how scaling out can be used to keep services up during scheduled maintenance – make sense now?

This is a really critical point on the value proposition of the Azure Web Site model as it absolutely maximises your dollar to give you the most perf possible but not more than what you actually need. It’s an absolutely seamless implementation too and you just can’t get that in a standalone VM you manage yourself.

You don’t get staged publishing

This isn’t a feature I’m using myself yet, but just last week Scott Gu dropped staged publishing on us. Now this is pretty neat because what it means is that rather than just pushing all your things direct to production and hoping it works, you push it to a staging site and test it all works first. There’s nothing too new about that paradigm, but where it differs is that you can now just “flick a switch” and staging becomes live. This is just another one of those little things that helps you with the business of what you’re at Azure to do in the first place – manage your web site.

The bigger picture is that features like these keep flooding into Azure Web Sites on a very frequent release cycle. These are features designed solely around the purpose of making web sites awesome. Yes, there are new features being released all the time to make VMs awesome too, but that’s not necessarily going to help you out in managing your web site.

You can monitor the VM, but not the stuff that’s actually important to running a web site

One of the really great things about all the services in Azure is the monitoring. For example, I can pull data like this on the VM I use for importing breaches into HIBP:

VM monitoring data

That’s pretty neat and it’s all quite useful info about what’s happening on the machine itself (albeit basic compared to what you’d get from performance counters). But it doesn’t tell me anything about the web sites running on the machine because that’s a very specific use case about a very specific platform. But, of course, Azure caters to the use case that is running a web site and it instruments it accordingly:

Per minute web site monitoring

The scale here is different (the last hour instead of the last day) as I want to make a very poignant point about something in a moment, the main thing here though is that this is information specifically about the web site. Each of those metrics (and a bunch of others I haven’t added to the graph) are specific to the web site code running on this service. Things like the number of requests are significantly more valuable than, say, the network I/O across an entire machine that could be doing all sorts of other things. The fact that you get this per site is also important; whack multiple sites on one VM you manage directly and you’ve got no idea which one is doing what, at least not from the Management Portal.

The other thing you can do is what I wrote about earlier this week in measuring all the things, and that’s set alerts. For example, as I point out in that post, I really want to know when the traffic ramps up so I simply configured the monitoring to shoot me an email when the requests get above a certain threshold. Again, I can do this because the Azure Web Site service is designed specifically to make life easier for people managing web sites! That’s why the granularity of the monitoring above is so important too – now that you can monitor individual web site activity by the minute, it’s dead easy to push a change and immediately see the impact on the perf of the system. That’s an extremely useful feature.
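Conceptually, an alert like that is nothing more than a threshold check over the per-minute numbers. Here’s a minimal sketch of the idea in Python. To be clear, this is not an Azure API or how the portal implements it: the function name, the sample request counts, the threshold and the choice to average over a short window are all my own assumptions for illustration.

```python
from statistics import mean

def should_alert(requests_per_minute, threshold, window=5):
    """Return True when the average of the most recent samples exceeds the threshold.

    Averaging over a small window (an assumption, not Azure's documented behaviour)
    stops a single noisy minute from firing an email.
    """
    recent = requests_per_minute[-window:]
    return bool(recent) and mean(recent) > threshold

# Hypothetical per-minute request counts: quiet at first, then traffic ramps up.
samples = [120, 135, 128, 410, 980, 1500, 1720, 1650]

quiet = should_alert(samples[:3], threshold=1000)  # only the quiet minutes
busy = should_alert(samples, threshold=1000)       # includes the ramp-up

print("alert while quiet:", quiet)
print("alert after ramp-up:", busy)
```

The point is that the check only makes sense when the metric is per site and per minute; run the same logic over whole-machine network I/O and you couldn’t tell which site caused the spike.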

You pay significantly more for a VM than you do a web site

No really, the costs for hosting on a VM are through the roof, let me demonstrate.

Here’s the cost of a Standard Web Site on a small VM:

Standard Web Site costing $74.40 per month

And here’s the cost of a small standalone VM:

Small VM costing $66.96 per month

* Does not include IIS installation, setup of HTTP endpoints, provisioning of a web site, configuration of the site, service monitoring, automated alerts, auto-scale, build and deploy services, source control integration, portal based config transforms, automated staging site, scheduled jobs or staged publishing. You can go and set all that up manually on your own damn time.

The most expensive component of a software project is invariably us – the organic matter. People are expensive and it’s easy to lose sight of the value of our time and the value of the sort of features we’re talking about above. These things cost money and I’m not just talking about $7.44 a month – factor all this in, include the manual patching and testing plus the downtime from reboots, put a price on that and then consider the real cost.

If you’re simply comparing the dollar figures on the calculator above, you’re doing it wrong. The cost in terms of the effort required to achieve parity with the web site feature set, the risk involved in configuring it all yourself and the long term impact of the increased overhead of manually performing so many of the tasks that are baked into the web site paradigm doesn’t even begin to be reflected in the figures above. It’s massive.
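To put a rough number on that, here’s a back-of-envelope sketch in Python. The two monthly prices come from the calculator screenshots above; the hourly rate and the hours of admin time are purely assumed figures I’ve picked for illustration, so plug in your own.

```python
# Monthly prices from the calculator screenshots above (USD).
WEB_SITE_MONTHLY = 74.40
VM_MONTHLY = 66.96

# Assumptions for illustration only, not measured figures.
ASSUMED_HOURLY_RATE = 50.0       # what an hour of a person's time costs
ASSUMED_VM_ADMIN_HOURS = 2.0     # patching, testing, reboots per month on a VM
ASSUMED_SITE_ADMIN_HOURS = 0.0   # the web site service handles these for you

def total_monthly_cost(hosting, admin_hours, hourly_rate):
    """Hosting price plus the cost of the people-time spent looking after it."""
    return hosting + admin_hours * hourly_rate

# The sticker-price difference between the two options.
price_gap = round(WEB_SITE_MONTHLY - VM_MONTHLY, 2)

# How many minutes of admin time per month it takes to wipe out that gap.
break_even_minutes = price_gap / ASSUMED_HOURLY_RATE * 60

vm_total = total_monthly_cost(VM_MONTHLY, ASSUMED_VM_ADMIN_HOURS, ASSUMED_HOURLY_RATE)
site_total = total_monthly_cost(WEB_SITE_MONTHLY, ASSUMED_SITE_ADMIN_HOURS, ASSUMED_HOURLY_RATE)

print(f"Sticker price gap: ${price_gap:.2f} per month")
print(f"Break-even admin time: {break_even_minutes:.1f} minutes per month")
print(f"VM with people-time: ${vm_total:.2f}, web site: ${site_total:.2f}")
```

With those assumed numbers, the $7.44 saving is gone after roughly nine minutes of manual work a month, and that’s before counting the one-off effort of setting everything up in the first place.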

Let me make one other really key point here because if I don’t, I’ll inevitably get called on it: When you create a Standard Web Site per the calculator above, you’re actually getting a VM. In fact as the description explains, you can now load that one VM up with 500 sites if you like. The fundamental difference to the “Windows Virtual Machine” option beneath it is that when you create a web site service you don’t manage the VM, Microsoft does. What you get is the best of both worlds – a guarantee of your own logical machine on which only your stuff runs within a reserved slice of infrastructure but without all the responsibility of managing it. How good is that?!


Now I’m not saying that you never ever need a VM to stand up a web site, far from it, but those occasions are few and far between and you want to be very certain you’re reaching that conclusion based on facts and not assumption. If you’re building a web site, a VM should be a last resort based on all the reasons above (plus many others I’m sure I’ve missed) and you need to consider whether the trade-off is worth it. For a good overview of some of the activities you do need a VM for, check out Readify’s Developing and Deploying Web Applications in Windows Azure whitepaper.

Ultimately though, everything you’ve read above boils down to this: Microsoft created Azure Web Sites to host web sites on Azure. They tailored the features and the manageability to support the way we work with modern sites today and they did a damn good job of it. If you think you want a VM so that you get more control, you’ve got it the wrong way around because you actually get less control over the things that are important for managing web sites in the cloud today and you pay more for it. When you look holistically at what it really means to build and support a web app, you’ve never had it better than you do today with an Azure Web Site.
