Installation and configuration checklist for Microsoft SQL Server

Below is the checklist I’ve used over the years for installing and configuring Microsoft SQL Server. It is by no means complete, and there are certainly scenarios where additional steps will be required. It is intended as a starting point, enabling somebody not only to standardize SQL Server installations but also to make sure that no critical steps are missed.

Hopefully others will find it useful as well.

Before SQL Server Installation

  • install and patch the OS (minimum 100 GB for system drive)
  • BIOS, firmware & driver updates
    • check power management settings
      • BIOS power management should be set to OS control or disabled
      • Windows Power Plan set to ‘High Performance’
  • check processor speed using Task Manager and the CPU-Z utility (make sure the CPU runs at its maximum speed)
  • RAID 1 for OS, logs and backups (but each on a separate volume), RAID 10 for data files
  • format the drives for data & logs with a 64 KB allocation unit size (ideal for SQL Server) – full format, not quick
  • test drive performance with CrystalDiskMark & SQLIO; save the screenshots/results for future reference
  • configure anti-virus software (if any) to exclude database-related files such as MDF, NDF, LDF, BAK and TRN
  • create regular user/domain accounts for each SQL Server component – main SQL service, agent, reporting, analysis, integration (SQLServiceAccount, SQLAgentAccount and so on)
    • for a dev server: probably ok to use just one SQL user account for all SQL services
    • use gpedit.msc to grant the appropriate rights to the SQL Server service account (Computer Configuration -> Windows Settings -> Security Settings -> Local Policies -> User Rights Assignment)
      • Perform volume maintenance tasks
      • Lock pages in memory
        • if this is done then use sp_configure to set ‘max server memory’ so the OS is left with enough memory to function (see the T-SQL sketch after this list)
        • consider not enabling this on a dev server with shared duties (such as a web server) so we don’t end up in a scenario where SQL Server starves the other processes of memory
  • prepare directories (on separate drives): SQLData, SQLLogs, SQLTempDB, SQLBackups
    • folder permissions
      • the install process will properly assign folder permissions to custom folders if they’re all specified initially in the SQL data directories configuration
      • remove “Everyone” (if set) from non-C drives where SQL-related folders will exist
      • for the SQL Server main service account: give full permissions to the folder where SQL Server will be installed and to the SQLData, SQLLogs, SQLTempDB and SQLBackups folders
      • for the SQL Server Agent account: give full permissions to the folder where SQL Server will be installed and to SQLBackups
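
For the memory-related items above, here is a minimal T-SQL sketch – the 26624 MB value is only an example for a server with 32 GB of RAM, and the two verification queries at the end require SQL Server 2016 SP1 or later:

  -- Leave the OS enough headroom once 'Lock pages in memory' is granted;
  -- 26624 MB is only an example for a 32 GB server, so size it for the actual hardware.
  EXEC sp_configure 'show advanced options', 1;
  RECONFIGURE;
  EXEC sp_configure 'max server memory (MB)', 26624;
  RECONFIGURE;

  -- On SQL Server 2016 SP1 and later these DMVs show whether 'Lock pages in memory'
  -- (sql_memory_model_desc = LOCK_PAGES) and instant file initialization
  -- ('Perform volume maintenance tasks') actually took effect.
  SELECT sql_memory_model_desc FROM sys.dm_os_sys_info;
  SELECT servicename, instant_file_initialization_enabled FROM sys.dm_server_services;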

During SQL Server Installation

  • only install features that are really needed (especially on a production system) – don’t install something “just in case it might be needed later”
  • use the proper user accounts (created above) for each service
    • set SQL Server Agent to auto start
    • most likely disable the SQL Server Browser service
  • on the Database Engine Configuration tab be sure to set the Data Directories to the custom ones we plan to use (not the defaults, where everything is stored on the C drive) – a query to confirm this after setup follows this list
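
Once setup finishes, a quick way to confirm that the custom directories were actually picked up (a small sketch; SERVERPROPERTY exposes the default paths on SQL Server 2012 and later):

  -- Default data and log paths chosen during setup (SQL Server 2012 and later).
  SELECT
      SERVERPROPERTY('InstanceDefaultDataPath') AS default_data_path,
      SERVERPROPERTY('InstanceDefaultLogPath')  AS default_log_path;

  -- tempdb file placement can be checked directly.
  SELECT name, physical_name FROM sys.master_files WHERE database_id = DB_ID('tempdb');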

After SQL Server Installation

Time to say goodbye to usernames and passwords for website authentication

We have to admit that our current model of using usernames and passwords for website authentication is utterly broken. It may have been fine back in the days of mainframes, when users had one account to remember, but it just does not scale (safely, at least) today, when the average user probably has tens (if not hundreds) of usernames and passwords to remember.

What do most users do in the face of the ever-increasing number of websites that require authentication? They start reusing information – in particular passwords. We should all know that this is a bad idea. The best practice in this area is to have a separate, strong password for each website account … but how many people actually do that?

Before we put all the blame on users, let’s not forget the websites themselves, which are not doing a good job at all of protecting this authentication data. The best practice there would be to hash each user’s password with a salted, deliberately slow algorithm so that even brute-force attacks against the stolen data would not be very successful … but we know that’s not happening. It seems that not a month goes by without some news story informing us that large numbers of user accounts from some website just got dumped online … and the passwords were either poorly hashed or not hashed at all (simply stored in plaintext).

Other authentication models exist where some third party is used to consolidate user accounts and allow users to log on to websites using credentials from that third party (think OpenID, Facebook, Twitter, Google and so on). The problem with all of these is the third party: can it be trusted? Will it protect our information and privacy? Do we really want some corporate third party knowing all the sites we visit simply because those sites rely on its authentication methods?

What we need is a modern authentication system that’s cryptographically secure, easy to use and preferably two-party (involving just the user and the website the user wants to visit).

Enter SQRL – Secure Quick Reliable Login (also see the Wikipedia page). The protocol typically uses a QR code for authentication, with the user identifying through an anonymous, site-specific identity rather than a user ID and password. SQRL is the brainchild of Steve Gibson – the host of the long-running and well-respected Security Now podcast. Steve has a lot of information about it at https://www.grc.com/sqrl/sqrl.htm, and he recently gave a presentation on SQRL at the DigiCert Security Summit 2014.

We need to start moving in this direction for website authentication because what we have right now is simply not working.

Relationship between AWS EBS snapshots and EBS volume failure rates

I’ve been doing some research lately on best practices for using AWS EBS volumes from the point of view of data redundancy – in particular, the relationship between combining EBS volumes into a software RAID volume at the OS level (either RAID 1 or RAID 10) and the proper use of EBS snapshots.

On the AWS EBS details page (http://aws.amazon.com/ebs/details/) in the section on “Amazon EBS Availability and Durability” we find the following:

Amazon EBS volumes are designed to be highly available and reliable. At no additional charge to you, Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. For more details, see the Amazon EC2 and EBS Service Level Agreement.

The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS Snapshot can expect an annual failure rate (AFR) of between 0.1% – 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives.

The statement that really puzzled me for a while was the claim that EBS volumes are more durable (with a lower failure rate) when EBS snapshots for them are created more often. It just doesn’t appear that the two would be related … at least not when we think about drives and backups in a non-cloud, non-redundant way. I thought for a while that this was just some strange marketing statement made by AWS – almost like a reverse Murphy’s law: the more backups/snapshots you make, the lower the chance that your drive will fail. I understand that having more frequent EBS snapshots of a volume would let one restore more recent versions of that volume’s data, but how exactly could taking snapshots affect the rate at which a volume physically fails?
