Resources for Amazon Web Services (AWS) migrations from EC2-Classic to EC2-VPC

AWS EC2 (Elastic Compute Cloud) is one of the best-known cloud services delivered by AWS. It allows customers to purchase resizable cloud hosting resources. It is – in my opinion – the best current implementation of IaaS (Infrastructure as a Service) among all cloud service providers. It truly delivers on the promise that within minutes you can spin up a new server instance and proceed with meaningful work, without waiting days or weeks for a vendor (or IT department) to purchase and deliver properly configured hardware.

The original implementation of EC2 (which over time became known as EC2-Classic) had one interesting shortcoming that became more and more apparent as customers built increasingly complex solutions: EC2 instances for ALL customers in a given AWS geographic region shared private IP addresses in the 10.x.x.x space (technically some ranges from that Class A network were not used, but that’s not relevant for this discussion). For example, you could have in your account a web server with the IP address 10.150.22.220 and another web server with the IP address 10.20.100.70 (and you had no idea of, or control over, who would be using the private IP address 10.20.100.71).

AWS did provide security groups (basically software firewalls that wrap around instances) to let customers group together EC2 instances with similar functions (and keep intruders out), but as customers built more and more complex solutions it became harder and harder to manage the instances that one owned in AWS.

This model, where one’s servers were spread all over the 10.x.x.x range, was not the way that networking professionals were used to running networks in their own data centers.

In 2009 AWS introduced an improvement to the original EC2 approach – VPC (Virtual Private Cloud). In a VPC customers could use their own private IP range, divide the network as they saw fit and pretty much return to the networking models they were used to. There are many, many more features to AWS VPC that make it a clear winner over EC2-Classic but those are not the main focus of this article.
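To make the "own private IP range, divided as you see fit" idea a bit more concrete, here is a minimal sketch using Python's standard `ipaddress` module. The 10.0.0.0/16 range and the subnet names are assumptions picked purely for illustration; nothing here talks to AWS.

```python
import ipaddress

# Hypothetical VPC range chosen for illustration; any private block would do.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 subnets and take the first four,
# e.g. one per application tier or availability zone.
subnets = list(vpc.subnets(new_prefix=24))[:4]
for name, net in zip(["web-a", "web-b", "db-a", "db-b"], subnets):
    print(f"{name}: {net}")

# The two EC2-Classic addresses mentioned earlier live in the shared
# 10.0.0.0/8 space, scattered outside this customer-defined range.
classic_ips = [
    ipaddress.ip_address("10.150.22.220"),
    ipaddress.ip_address("10.20.100.70"),
]
print([ip in vpc for ip in classic_ips])
```

The point of the sketch is only the contrast: in a VPC the customer decides the range and the subnet layout up front, whereas the EC2-Classic addresses were handed out from a space shared with everyone else in the region.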

For a while we had the two technologies side by side – EC2-Classic and EC2-VPC. Customers could create EC2 instances using either model. It was becoming clear, though, that EC2-VPC was the superior technology, and AWS confirmed that on 2013-12-04: AWS accounts created after that date support only EC2-VPC. In those accounts AWS automatically creates a default VPC and by default places all EC2 instances in that context.

Many of the older AWS customers were slowly faced with a dilemma: what were they supposed to do with their aging EC2-Classic instances? AWS is constantly innovating, adding new instance types and new features, but many of those are now only available on the EC2-VPC side. If customers want to take advantage of the latest AWS features, they need to consider migration paths from EC2-Classic instances to EC2-VPC.

So what are the migration options available?


List of technology podcasts

I’m a full believer in the term “Automobile University” that was coined by the well-known motivational speaker Zig Ziglar – the idea being that time spent in traffic can and should be used to educate oneself on a variety of subjects. As such, I have quite a few technology / IT podcasts that I subscribe to and I make sure that my audio player always has plenty of interesting episodes available in the queue.

I already shared the list below with plenty of friends and co-workers who know that I listen to a variety of podcasts so I figured I should probably just create a blog post with this information for future reference.

At the time of this post all podcasts mentioned below appear to still be active – kudos and many thanks to all these authors who keep creating solid technical content for all of us to enjoy.

In the list below I link to the actual podcast sites. If you want to actually subscribe to them you should be able to find them in the iTunes store or wherever else you grab your podcasts from. If you know of any other ones in the various categories listed below please add them in the comments.

Cloud Computing

Big Data

Databases (SQL Server)

Development (.NET, JavaScript)

Infrastructure, Networking, Enterprise Tech

Security

I should also mention the amazing list of shows / podcasts from the TWiT network. Leo Laporte and his amazing crew create some awesome content – way faster than I can possibly consume it. You’ll probably find some interesting topics there as well.

How to delete duplicate rows from a table in SQL Server

Here’s a question that most SQL Server DBAs or developers have probably had to answer at some point in their career: how do you remove duplicate rows from a table? Maybe that duplicate data came in through an error in some ETL process or maybe it was an error in a front-end application that allowed end users to submit data multiple times. At any rate – the problem that caused duplicate data is now fixed and we have to clean up the duplicates from our table. (This is also a perfect scenario / question for a SQL Server job interview because it will allow you to quickly tell how a candidate approaches SQL problems that are not exactly trivial – we’ll see why in a little bit.)

Let’s set up a table and insert some sample data into it:
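As a rough, runnable stand-in (not the T-SQL from the full post), the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns and the sample rows are all made up for illustration, and the delete relies on SQLite's implicit `rowid`, which SQL Server does not have; on SQL Server a common approach is `ROW_NUMBER()` partitioned over the duplicate columns inside a CTE, deleting every row numbered above 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with no primary key, so exact duplicates are possible.
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ann", "Lee", "ann@example.com"),
        ("Ann", "Lee", "ann@example.com"),
        ("Bob", "Ray", "bob@example.com"),
        ("Bob", "Ray", "bob@example.com"),
        ("Bob", "Ray", "bob@example.com"),
        ("Cy", "Orr", "cy@example.com"),
    ],
)

# Keep the lowest rowid in each group of identical rows and delete the rest.
# (SQLite-specific: rowid is an implicit per-row identifier.)
conn.execute(
    """
    DELETE FROM customers
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM customers
        GROUP BY first_name, last_name, email
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT first_name, last_name, email FROM customers ORDER BY first_name"
).fetchall()
print(remaining)
```

The part that makes this less trivial than it looks is that the duplicate rows are identical in every column, so there is nothing in the data itself to tell the copy you keep from the copies you delete; some row-level identifier (here `rowid`) has to be brought in.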


Data caretaker and/or data interpreter – what exactly is the role of a DBA anyway?

There is a nagging thought that made its way to the back of my mind in the last few weeks. I’ve been watching or listening to a variety of technical presentations / podcasts – mostly on topics related to SQL Server – and after a while I started to see a pattern: the majority of topics and discussions were purely technical in nature and dealt directly with the SQL Server database engine itself … how to keep it running, how to make sure performance was good, how to take care of backups / high availability and so on. It seemed to me that not much thought was given to the actual data that was supposed to be passing through this database engine – the very reason why we’re doing all these things we’re doing with SQL Server.

I got the impression that many DBAs or consultants who were involved in these discussions were more than happy to spend hours arguing the finer points of the new cardinality estimator in SQL Server 2014 (for example) but when it came time to actually try to make sense of the data passing through their hands they’d rather not have anything to do with it. To them data is just a black box – they’ll take care of it but they’re not interested in trying to make sense of it … that’s somebody else’s job.

So I started to wonder – what exactly is the role of a DBA anyway? Are DBAs strictly caretakers of data (making sure that it’s always there for use) or are they supposed to be much more than that – interpreters of data, helping the business make sense of all these bits and bytes that are getting collected all over the place?

Sometimes it seems to me that DBAs feel safe speaking the geek language of database engine specifications and features but they’re afraid to venture outside of that world where they might actually run into users trying to make sense of all that data. We don’t think that’s our job – that’s for application developers and business/data analysts.

Is that really the case?

I came across the article below (from a few years ago) where the author argues pretty much the same thing – that the DBA role should probably evolve to where DBAs get more and more involved in understanding the relevance of the data they’re managing:

Does the Role of the DBA Need to Evolve?

That article generated quite a few comments on the SQLServerCentral.com site – http://www.sqlservercentral.com/Forums/Topic1219839-263-1.aspx

What do you think of all this? Are DBAs strictly data caretakers or should they be more than that? Does it depend maybe on the size of the business/company where these DBAs work? Do DBAs really get to spend all their time having fun with technical aspects of the engine itself and not be bothered by users who need help understanding and fixing their data?