Debugging Long-Running SQL Queries

The T-SQL PRINT statement lets you write to the Messages window in SQL Server Management Studio, which makes it a helpful aid when debugging SQL scripts and stored procedures.

Unfortunately there is one major issue with using PRINT: the output messages are only displayed when the query completes, so if you have a long-running query you won't be able to see any of the output until the end.

Here’s a quick example:
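Something along these lines (the loop and the delay are just for illustration):

```sql
DECLARE @i INT = 1;

WHILE @i <= 5
BEGIN
    PRINT 'Starting step ' + CAST(@i AS VARCHAR(10));
    WAITFOR DELAY '00:00:10';    -- simulate a slow operation
    SET @i += 1;
END;
```

Nothing shows up in the Messages window until the whole batch has finished.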

Rubbish huh?

An alternative, and much better, approach is to use RAISERROR with the WITH NOWAIT option. This time the messages are printed immediately so you no longer have to wait for execution to complete to view them. Just be sure to use a non-fatal severity level – anything 10 or under is a good choice as it's also unlikely to be picked up by any database monitoring systems.
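The same loop again, but logging via RAISERROR at severity 0 with WITH NOWAIT so each message is flushed to the client straight away (again, only a sketch):

```sql
DECLARE @i INT = 1;
DECLARE @msg VARCHAR(100);

WHILE @i <= 5
BEGIN
    SET @msg = 'Starting step ' + CAST(@i AS VARCHAR(10));
    RAISERROR(@msg, 0, 1) WITH NOWAIT;   -- severity 0, printed immediately
    WAITFOR DELAY '00:00:10';
    SET @i += 1;
END;
```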

As you can see, the logging syntax is slightly more verbose as RAISERROR takes a char/varchar but it’s a small price to pay for being able to see what’s going on as your queries run.

[Screenshot: RAISERROR messages appearing while the batch is still running]

Posted in SQL Server

Automated Configuration Management: Deploying SQL Server via PowerShell DSC

Desired State Configuration (DSC) is a new PowerShell feature that manages Windows infrastructure (servers and applications), both on premises and in the cloud. It's built into PowerShell 4.0 as part of Windows Management Framework 4.0. PowerShell DSC ships with Windows Server 2012 R2 and is available to download for Windows Server 2008 R2 and Windows Server 2012.

DSC is declarative. DSC scripts have no logic on how to perform a particular install or uninstall. Instead they define how a particular server or application should be set up; it is then the responsibility of the runtime to “make it so”.

Imagine getting a house built, with walls and doors and windows. You wouldn’t worry about the composition of the walls, the number of hinges required to mount a door, or even how to install a window. That is the job of your builder – after all he is the expert. All you would do is describe how the house should look and then rely on your builder to make it so.

DSC works in the same way. The specification is contained in a configuration and the DSC runtime acts as the builder, making sure that resources on the server are set up just as we want them to be.

DSC is a relative newcomer to the scene. Other configuration management tools like Chef and Puppet have been around for many years and manage both Windows and non-Windows systems. However using them requires learning another language. DSC scripts use new extensions to PowerShell – most Windows administrators understand PowerShell (and if not, why not!?) so will find getting started a breeze.

What’s so good about DSC?

  • Standardisation. Create scripts containing the definition of different service types in your organisation (e.g. IIS, database, file server) then use these for all new deployments. You can be sure that all servers will be set up in an identical fashion – no more regional configuration problems or development servers that are different to those in production.
  • Speed up Deployment. It is quick and easy to apply a DSC configuration to a server via the PowerShell runtime.
  • Detect configuration drift. DSC provides a way to determine whether a live server’s configuration matches that specified in the scripts and can report on or automatically correct the discrepancies.
  • Simplicity. As DSC is PowerShell based then anything you do in DSC can also be done using pure PowerShell. But with DSC you focus on what rather than how. Scripts have no logic or error handling and are easy to read.
  • Idempotency. You can apply the same DSC configuration over and over again and nothing will break. And if the configuration is updated, only the settings that differ will be changed.

DSC Components

  • WinRM (Windows Remote Management) – the Microsoft implementation of the industry-standard WS-Management protocol for managing servers using SOAP.
  • CIM (Common Information Model). A standard for describing the structure and behaviour of managed resources (such as storage, network, software components). WMI is the Windows implementation of CIM.
  • MOF files (Managed Object Format). These contain the configuration applied to the target node.
  • Resources. These are the building blocks of a DSC configuration. DSC comes with a number of built-in resources, such as File and WindowsFeature, and you can also create your own.

Anatomy of a Resource

Every DSC resource exposes three functions.

  1. Test-TargetResource. This is the first method called when a DSC configuration is applied. It returns true/false based on whether the resource is in the correct state. If true then there’s no need for the DSC runtime to do anything else.
  2. Set-TargetResource. Called when Test-TargetResource returns false. Responsible for ensuring that the resource is set up following the specification stored in the configuration.
  3. Get-TargetResource. Returns all the properties of the resource. Not used during the configuration step, but used for reporting.

A Basic DSC Configuration

The configuration below specifies that a folder named DSC_Demo should exist in c:\temp.
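A configuration along these lines does the job (the resource block name is arbitrary):

```powershell
Configuration CreateFolderDemo
{
    Node 'localhost'
    {
        File DSCDemoFolder
        {
            DestinationPath = 'C:\temp\DSC_Demo'
            Type            = 'Directory'
            Ensure          = 'Present'
        }
    }
}

# Calling the configuration compiles it into a MOF file
CreateFolderDemo
```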

When the above PowerShell script is executed a MOF file named localhost.mof is created by the DSC runtime in the CreateFolderDemo folder. If you open the file it’ll look like this:

[Screenshot: the generated localhost.mof file]

To apply the configuration – and “make it so” – use the Start-DscConfiguration cmdlet:
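For example:

```powershell
# -Wait keeps the console attached; -Verbose shows each resource being tested/set
Start-DscConfiguration -Path .\CreateFolderDemo -Wait -Verbose
```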

[Screenshot: Start-DscConfiguration creating the DSC_Demo folder]
The directory didn’t exist before the script was run, so the Test-TargetResource function would return false. Next the Set-TargetResource function would be called and the directory created.

To show idempotency in action, if the script is run a second time then Test-TargetResource will return true and no further configuration takes place. Note that there are no errors and no attempt is made to create the folder a second time:

[Screenshot: running the configuration a second time – no changes are made]

Something more meaty: Installing SQL Server using DSC

Using the xSqlPs PowerShell module you can perform a SQL Server installation via DSC.

First download the module and extract it to the $env:ProgramFiles\WindowsPowerShell\Modules directory. Open PowerShell and verify the resources are present by running the Get-DscResource command:
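For example:

```powershell
# The xSqlPs resources should be listed, e.g.
Get-DscResource -Name xSqlServerInstall
```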

[Screenshot: Get-DscResource listing the xSqlPs resources]

Next create the configuration. In my example I am installing a SQL 2014 instance called DSCInstance onto the local machine:
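A sketch of what the configuration looks like – the property names (InstanceName, SourcePath, Features) come from the version of xSqlPs current at the time and may differ in yours, the media path is an example, and the resource supports more settings (such as a SQL administrator credential) than shown here:

```powershell
Configuration InstallSQLDemo
{
    Import-DscResource -ModuleName xSqlPs

    Node 'localhost'
    {
        xSqlServerInstall InstallDSCInstance
        {
            InstanceName = 'DSCInstance'
            SourcePath   = 'D:\'          # root of the SQL Server 2014 installation media
            Features     = 'SQLEngine'
        }
    }
}

# Compile to .\InstallSQLDemo\localhost.mof
InstallSQLDemo
```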

Running the script will generate the localhost.mof in the InstallSQLDemo folder:

[Screenshot: localhost.mof generated in the InstallSQLDemo folder]

Deploy the configuration with the command:
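Something like:

```powershell
Start-DscConfiguration -Path .\InstallSQLDemo -Wait -Verbose -Force
```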

This will run SQL Server setup creating the new DSCInstance database engine instance.

[Screenshot: Start-DscConfiguration running SQL Server setup]

As with the folder demo above, if we try to run Start-DscConfiguration a second time it'll complete without errors:

[Screenshot: running Start-DscConfiguration a second time – nothing to do]

Note: the xSqlServerInstall resource is designed for SQL Server 2012, and other versions will report an installation failure even if setup was successful. This is because the resource looks for the SQL Setup log file in the '110' directory but other versions of SQL Server keep the log file elsewhere. As a quick fix, edit 'C:\Program Files\WindowsPowerShell\Modules\xSqlPs\DSCResources\MSFT_xSqlServerInstall\MSFT_xSqlServerInstall.psm1' and change the directory on line 154 to the relevant one for your SQL version (100 for 2008, 120 for 2014).

Conclusion

DSC is a great technology and I think it will make managing and deploying servers much easier in the future. However right now it is still in its infancy and support for anything other than basic actions is very limited. The SQL Server module doesn’t expose many of the command line switches available in SQL Setup and all of the large enterprise environments I have worked on have customisations not available in the current version. I’m sure things will improve and of course it’s possible to write your own resources to implement any custom features.

There are also several tasks which should be able to benefit from the power of DSC but right now there appears to be no solution. A good example is the deployment of Service Packs or Cumulative Updates to a SQL Server instance. A SQL Server DSC configuration should contain the version of SQL to deploy. Then when a new Service Pack is released all a DBA has to do is to update some configuration files and re-deploy, and the SQL Server resource will detect that only a patch needs to be installed rather than an entire deployment.

Posted in Automation, Deployment

Change photo file date based on EXIF data

I want to transfer some of my favourite photos from last year's bike tour to my Android tablet. All my pictures are stored in Aperture on my Mac and although it's a great piece of software it has one very annoying behaviour: when you export a version of a photo it sets the exported file's creation date and time to the current date/time rather than the EXIF date/time. This is annoying because the Android Gallery app uses the file date/time to order photos.

So I created a PowerShell script that reads the EXIF data then changes the file timestamps so they match when the photo was taken. And instead of having to write something from scratch I found a couple of useful functions online that I could combine to do exactly what I wanted.

The first two useful functions come from Juri Cincura. He’s written a script to rename files based on EXIF dates which is almost what I want, but not exactly. So I just used the bits that are useful.

The function to change the file properties comes from the Scripting Guys Blog. I modified it slightly to work on just a single file but it’s at least 80% their code.
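Boiled down, the combined script does something like this (the folder path is only an example; EXIF tag 36867 is DateTimeOriginal):

```powershell
Add-Type -AssemblyName System.Drawing

Get-ChildItem 'C:\Export\*.jpg' | ForEach-Object {
    # Read EXIF property 36867 (DateTimeOriginal), stored as 'yyyy:MM:dd HH:mm:ss'
    $image = [System.Drawing.Image]::FromFile($_.FullName)
    try {
        $bytes = $image.GetPropertyItem(36867).Value
        $taken = [datetime]::ParseExact(
            [System.Text.Encoding]::ASCII.GetString($bytes).Trim([char]0),
            'yyyy:MM:dd HH:mm:ss', $null)
    }
    finally {
        $image.Dispose()    # release the file lock before changing the timestamps
    }

    # Make the file system dates match the date the photo was taken
    $_.CreationTime  = $taken
    $_.LastWriteTime = $taken
}
```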

If you are just getting started with PowerShell then this is a great way to learn. Find some scripts that do something similar to what you want then modify them. It saves you a bunch of time and effort!

Posted in PowerShell, Scripting

Database Corruption Challenge Part 1

Last weekend I saw a tweet mentioning Steve Stedman’s DataBase Corruption Challenge (DBCC) where he publishes a database that’s been corrupted and offers a challenge to the community to recover the database (hopefully) without any data loss. What a great idea! I had to enter and after some digging around figured out a way to bring the database online with all data intact.

Of the 21 successful entries there were many different solutions. Steve wrote that one even involved modifying the database files on disk using a hex editor. A hex editor?! Which crazy person would do that? Well… umm…. actually… that was me. This post covers how I found the problem and what I did to fix the corruption.

(You'll note that my final solution doesn't open the database files in a hex editor; however, most of my working out and modifications used one. For simplicity I've switched to using DBCC commands as they're easier to repeat.)

Steve's original blog covers how to get the corrupt database restored and ready for troubleshooting so I won't repeat that here; let's start with the DBCC CHECKDB error message instead.

The first line of the DBCC error message is very helpful as it quickly identifies the page that was corrupt (file 1 page 280) and the record in that page (record 3).

The object with object ID 245575913 is the table Revenue, and we can use DBCC IND to find out that page 1:280 is part of the clustered index. Unfortunately this is a pain – because it's a clustered index, a simple rebuild won't fix the problem. Instead let's look into the header of record 3 and try to figure out what's wrong.
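For reference, the lookups are along these lines (use whatever name you restored the challenge database as – I'm using CorruptionChallenge1 here):

```sql
-- Which table does object ID 245575913 belong to?
SELECT OBJECT_NAME(245575913) AS TableName;

-- List the pages belonging to Revenue: page 1:280 appears with IndexID = 1,
-- i.e. it belongs to the clustered index
DBCC IND('CorruptionChallenge1', 'Revenue', -1);
```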

First we can view the page structure using DBCC PAGE:
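(Again assuming the database is called CorruptionChallenge1.)

```sql
DBCC TRACEON(3604);                           -- send DBCC PAGE output to the client
DBCC PAGE('CorruptionChallenge1', 1, 280, 3); -- file 1, page 280, print option 3
```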

At the end of the page is the record offset array which is used to locate individual records. From the offset array we can see that record 3 starts at offset 960 (0x3c0). (Note: row numbers actually start from 0. This is important as otherwise we will be wasting our time figuring out what's corrupt about a non-corrupt row. Not that that happened, of course!)

Before looking at the actual corrupt record, a quick recap of the SQL Server record structure. A record comprises the record header, followed by the fixed-length data, the NULL bitmap, and finally the data relating to the variable-length columns (see Paul Randal's blog for more information).

[Diagram: SQL Server record structure]

As we know the table where the corruption was found we can look at its schema to see how an individual record should look.

The four fixed-length columns are all 4-byte integers, so together with the 4-byte record header the fixed-length portion takes up the first 20 bytes of the record.

[Diagram: fixed-length portion of the record]

Note: I’ve deleted the last two columns for simplicity

Now let’s look at the corrupt record in detail. Below is the relevant part of the output of DBCC PAGE for record 3 in page 280.

[Screenshot: DBCC PAGE output for record 3]

We can transpose the red portion of the record onto our table. Note that you need to read multibyte values from right to left, so a 2-byte value shown as 1400 in the DBCC PAGE output is actually 0x0014. In my diagram and description below I've reversed the bytes so it's easier to understand, but we'll need to re-reverse them later when writing the fix back to the page.

[Diagram: the corrupt bytes transposed onto the record structure]

The fixed length portion of the record isn’t interesting so I’ll skip it.

Bytes 20-21 contain the total number of columns in the record. The value is 0x6161, which is 24929 in decimal. There's definitely something fishy as we know from looking at the table DDL that there are five columns, right?

Well…. not quite. Because the clustered index is non-unique there's an additional uniquifier column, which takes the total number of columns per record to six (to confirm this you could look at bytes 20-21 of row 2 and see the value 0x0006 (6 in decimal)).

So let’s change the value from 0x6161 to match the actual number of columns ( 0x0006 ).

[Diagram: record header with the corrected column count]

To make the change we can use DBCC WRITEPAGE. This has the benefit of recalculating the page checksum for us as Steve was clever enough to make sure the page checksum was valid after his modifications. WRITEPAGE takes the file and page ID and also the offset in that page. In our case this is 980 bytes – 960 bytes to the start of the 3rd record and then 20 bytes into that record.

IMPORTANT NOTE! Please be careful running DBCC WRITEPAGE. It blindly trusts the input you give and therefore can very easily corrupt a database. Do not run it in production unless you fancy a career change!
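With that warning out of the way, the first attempt writes just the two column-count bytes (remember they go back onto the page byte-reversed; database name as before):

```sql
-- 2 bytes at page offset 980 (record byte 20): the column count 0x0006,
-- stored on the page in reverse byte order as 06 00
DBCC WRITEPAGE('CorruptionChallenge1', 1, 280, 980, 2, 0x0600);
```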

Calling WRITEPAGE returns an error but maybe that fixed the problem?

Hmm. Something’s still not right. The error message still mentions record 3 of page 280 so let’s carry on looking at the broken record.

[Screenshot: DBCC PAGE output with the NULL bitmap highlighted]

Byte 22 contains the NULL bitmap itself (in blue). The bitmap uses one bit per column, so its total size is the number of columns divided by 8, rounded up. With six columns the bitmap will take up just a single byte. In our corrupt row the value is 0x61, which in binary is 01100001. This took some investigating, but in SQL 2014 unused bits of the NULL bitmap are set to 0. As there are only 6 columns this means the leftmost 2 bits should be 00, but they're not.

Reconstructing the NULL bitmap involved some guesswork. I decided to copy the NULL bitmap from row 2 (value 0x00 ) but it was possible that row 3 had some null columns which in row 2 weren’t null. In hindsight I could have used one of the techniques in Steve’s answer to view the data using a different index but I didn’t think of that at the time.

Adding the extra byte to DBCC WRITEPAGE results in the following command:
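(Same assumptions as before about the database name.)

```sql
-- 3 bytes this time: the column count (06 00) plus the NULL bitmap
-- copied from row 2 (00)
DBCC WRITEPAGE('CorruptionChallenge1', 1, 280, 980, 3, 0x060000);
```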

This time no error. Let’s run CHECKDB:
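(Database name as before.)

```sql
DBCC CHECKDB('CorruptionChallenge1') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```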

Looking better! Can we retrieve data from the table?

[Screenshot: SELECT results including the repaired record]

Yes – we can even see the record we fixed (in red).

The corruption has been fixed and the database is back online – it’s as simple as that!

Conclusion

Of course this is not something you want to be doing with production data. It’s one thing playing around on a Sunday afternoon but is a completely different activity under pressure from users to get their system online. In my environment I’d definitely be reaching for a recent backup instead!

Posted in SQL Server

PowerShell try/catch Strangeness and Terminating Errors

For my sins I have been (and to some extent still am) a C# developer. Compared with scripting there is a lot I like about writing applications in fully fledged high-level languages, especially when it comes to flow control.

Not all scripting languages are created equal and something I really like about PowerShell is its support for .NET style try-catch-finally statements. Error handling in PowerShell v1 was pretty rubbish – anyone remember trying to figure out the trap statement? – but it’s now very good and gives a lot of control over how errors are caught and handled.

Essentially there are two types of errors in PowerShell – terminating and non-terminating. As you might expect given the name, if a non-terminating error occurs then statement execution still continues. This makes sense for many sysadmin operations where you don't want one failure – such as a server being offline – to cause your scripts to stop. Terminating errors cause statement execution to stop and PowerShell looks for an error handler instead (such as a catch statement). If none is found then the script will exit.

But something weird I noticed is that surrounding a command with a try-catch statement can make PowerShell change a non-terminating error into a terminating one.

This occurs when an exception is thrown from a .NET class. To show what happens, let’s use a very basic dummy class with a single static method that throws an exception:
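A stand-in with the same shape – the class name Foo is my own, only the Bar() method matters:

```powershell
Add-Type -TypeDefinition @"
public static class Foo
{
    public static void Bar()
    {
        throw new System.Exception("Bang!");
    }
}
"@
```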

Calling Bar() outside of try-catch will generate a non-terminating error. You can see that PowerShell continues execution as the “After” string is sent to the console. (It is important here that $ErrorActionPreference is Continue – if it is set to Stop then PS will treat all errors as terminating).
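For example:

```powershell
$ErrorActionPreference = 'Continue'

[Foo]::Bar()         # non-terminating: the error is written to the error stream...
Write-Host "After"   # ...and execution carries on, so "After" is printed
```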

But put a try-catch around the call to Bar() and PowerShell changes the behaviour to a terminating error, and control is sent to the catch block instead of “After” being printed:
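For example:

```powershell
try {
    [Foo]::Bar()         # now treated as a terminating error
    Write-Host "After"   # never reached
}
catch {
    Write-Host "Caught: $_"
}
```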

I don’t know why this happens but it’s important to know. If you add try-catch error handling to your PowerShell scripts – something I recommend as it makes your scripts more readable and maintainable – then you might end up changing their behaviour in ways you don’t expect. So be sure to test, test and test again!

Posted in PowerShell, Scripting

PowerShell Snippets

This week I’ve been watching the recent Microsoft Virtual Academy on Advanced Desired State Configuration and Custom Resources.

It’s interesting to see what Microsoft are doing in this space given the competition has been around for years. DSC is in its infancy and there are still issues that need to be ironed out. Some of the demos in the MVA didn’t work and it wasn’t straightforward to troubleshoot them. However as DSC scripts are PowerShell based the learning curve for existing Windows administrators is much lower than with Chef or Puppet. So I think it does have a place and is something I’m excited about looking at in the future.

One of the presenters was the inventor of PowerShell, Jeffrey Snover. As a result I got some new PowerShell tips!

Wildcards in Cmdlet Names

Can’t completely remember the name of the cmdlet you want to use? It’s tough these days – for instance some of the Azure cmdlets are really long and difficult to memorise. Just use a wildcard in the cmdlet name, press tab and PowerShell will autocomplete with matching cmdlet names. Continue to press tab to cycle through the options.
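A couple of examples of the sort of pattern that works:

```powershell
# Type a pattern containing a wildcard, then press Tab to cycle through the matches:
#   Get-*Item<Tab>            ->  Get-ChildItem, Get-Item, Get-ControlPanelItem, ...
#   *-AzureSubscription<Tab>  ->  cycles through the Azure subscription cmdlets
```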

Show-Object

It's difficult to view the properties of complex objects in PowerShell. Yes, there is Get-Member, but it takes quite a bit of work to view everything. Lee Holmes, author of several PowerShell books, has written a neat cmdlet called Show-Object which you pipe an object to. It then brings up a window with all the properties in a tree format. It even provides the expression to access that property in your PowerShell scripts.
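Show-Object ships in Lee's PowerShellCookbook module; usage is simply a matter of piping to it (installation from the PowerShell Gallery assumed):

```powershell
Install-Module PowerShellCookbook            # one-off install

Get-Process -Id $PID | Show-Object           # browse the current process object in a tree view
```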

[Screenshot: Show-Object displaying an object's properties in a tree view]

Posted in PowerShell, Scripting

Configuring SQL Server to use Managed Service Accounts (in PowerShell)

When building out a new SQL Server environment, one consideration is the service account used to run SQL services. In SQL 2014 there are five options to choose from:

As long as there's no need to access network resources, Virtual Accounts are best for new builds. They're simpler than Built-In/Local accounts and provide the highest level of separation. But if you do need network access – for instance when using log shipping or database mirroring – then it's better to use a low-privilege Domain Account or a Managed Service Account instead.

Domain Accounts have been around for a long time but using them correctly can be tricky:

  • How do you securely store the account passwords?
  • What about changing the password? How often? It is too easy to click the “Password Never Expires” box when creating the account
  • What is the process for reconfiguring services when the password is changed? Service Accounts are often shared across several systems. Forgetting to change the password somewhere means that service won’t come online
  • Did you remember to grant the account privileges to manage the ServicePrincipalName for Kerberos authentication?

But there is a better way!

Microsoft introduced Managed Service Accounts in Windows 2008 R2. SQL Server has supported them since SQL Server 2012 (not 2008 R2 as I originally wrote – thanks Stephen!) and they make password management a breeze.

Managed Service Accounts have password management built in so administrators never have to change passwords themselves. AD will change the password automatically every 30 days and there's no need to update the service configuration. MSAs also have the correct privileges in AD to manage ServicePrincipalNames. A MSA can only be linked to one computer at a time – they are so easy to use you wouldn't want to share them anyway.

There are a couple of gotchas worth highlighting:

  • They require an AD Schema of 2008 R2 or later
  • To use all features the domain forest functional level must be 2008 R2 or later
  • MSAs cannot be used for SQL Server failover clusters (as all cluster nodes need to use the same service account)
  • If running Windows 2008 R2 or Windows 7 there is an important hotfix that fixes a problem with password changes not propagating – KB2494158

Using MSAs with SQL Server

The following PowerShell script creates a MSA in Active Directory for a SQL Server instance, then reconfigures the SQL Database Engine service account to use it.

Prerequisites/Notes:

  • The PowerShell script must be run as a Domain Administrator
  • It requires the PowerShell AD tools to be installed on your SQL server (although the script will install them for you if not already present)
  • Execute the script on the server running the SQL Server instance you want to configure
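The full script is longer than this, but a condensed sketch of the approach looks like the following – the account, instance and domain names are examples, and the real thing also installs the RSAT-AD-PowerShell feature if it's missing:

```powershell
Import-Module ActiveDirectory
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SqlWmiManagement')

# Example names - adjust for your environment
$msaName     = 'SqlMsa01'
$serviceName = 'MSSQLSERVER'      # default instance; named instances are 'MSSQL$InstanceName'

# 1. Create the MSA and tie it to this computer
#    (-RestrictToSingleComputer creates a standalone MSA rather than a gMSA on 2012+ domains)
New-ADServiceAccount -Name $msaName -RestrictToSingleComputer -Enabled $true
Add-ADComputerServiceAccount -Identity $env:COMPUTERNAME -ServiceAccount $msaName

# 2. Install the MSA on this server
Install-ADServiceAccount -Identity $msaName

# 3. Reconfigure the Database Engine service to run as the MSA
$wmi     = New-Object Microsoft.SqlServer.Management.Smo.Wmi.ManagedComputer $env:COMPUTERNAME
$service = $wmi.Services | Where-Object { $_.Name -eq $serviceName }
$service.SetServiceAccount("$env:USERDOMAIN\$msaName`$", '')   # empty password - AD manages it
```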

Cmdlet reference: New-ADServiceAccount, Add-ADComputerServiceAccount, Install-ADServiceAccount.

Granting Privileges to a MSA

Granting privileges to a MSA is very similar to granting privileges to a computer account. The account name of a MSA is in the format {Domain}\{MSAAccountName}$.

For instance, to grant a MSA access to a mirroring endpoint:
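For example (MSA and endpoint names made up):

```sql
CREATE LOGIN [CONTOSO\SqlMsa01$] FROM WINDOWS;
GRANT CONNECT ON ENDPOINT::Mirroring TO [CONTOSO\SqlMsa01$];
```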

For shares/NTFS permissions be sure to select the “Service Account” object type before entering the account name as it is not enabled by default.

[Screenshot: selecting the Service Account object type]

Notes

This post covers Managed Service Accounts. A newer feature in Windows 2012 is Group Managed Service Accounts (gMSA). gMSAs can be shared across multiple servers so are perfect for clusters… except they aren't supported in SQL Server. Not that Microsoft make this particularly clear (you really have to hunt around to find an official statement).

Posted in Administration, Automation, PowerShell, Scripting

Using Automation to shut down Azure VMs

Azure is great. For a test lab environment, why bother messing with virtual machines on your laptop when you can run them in the cloud instead? But, as anyone who has made this mistake will know, forgetting to shut down your Azure VMs can hit you in the wallet. The good news is that the Azure framework provides all the tools you need to make sure that it never happens again. All you need to do is set up an automated job (just like a scheduled task) that will shut down the VMs on schedule.

There are other posts and scripts on this subject but most use Azure automation certificates, which can be tricky to get working. My way is much simpler and uses Azure Active Directory. And compared to other solutions you get more control. The script can be set up to exclude certain VMs or Cloud Services, as you might want to shut down part of your Azure infrastructure but keep the rest online.

The two Azure components required to do this are:

  • Automation – hosting the runbook containing a schedule and the PowerShell scripts to run
  • Azure Active Directory – containing credentials used to authenticate with Azure during script execution

Creating the User Context

This step covers:

  • Creating a user under whose context the PowerShell script to shut down the VMs is executed
  • Granting the user permissions over the Azure VM estate

Creating a User

First load the Azure portal and click on the Active Directory tab.

[Screenshot: the Active Directory tab in the Azure portal]

Select the Default Directory then click “Add User” at the bottom.

[Screenshot: the Add User button in the default directory]

In the dialog that is displayed set the type of user as ‘New user in your organization’ and set the username to ScriptUser. Click Next.

[Screenshot: adding a new user in the organisation]

Configure the new user with a name and display name and set the role to ‘User’.

[Screenshot: configuring the user profile]

Azure will assign the new user a temporary password – click the Create button to display it

[Screenshot: creating the temporary password]

Note down the password shown.

[Screenshot: the temporary password displayed]

The password must be changed so log out of the Portal, sign in using your new account (the user name will be ScriptUser@{youraccount}.onmicrosoft.com) and set a new password. Then log back in using the Azure Administrator account.

Granting Permissions

Next, the account needs to be set as a co-administrator. This grants it permissions to perform the shutdown operation. Go to Settings > Administrators then click Add.

[Screenshot: Settings > Administrators > Add]

Enter the full username of the ScriptUser user and click the green tick.

[Screenshot: specifying the co-administrator user]

The user is now set up to facilitate automation.

Configuring Automation

This step covers:

  • Creating the automation account
  • Creating a runbook, where the actual PowerShell script is stored
  • Defining a schedule which specifies how often the script is run

Creating the Automation Account

First, go to the Automation tab and create a new Automation Account.

[Screenshot: creating the Automation Account]

To link our newly-created Automation Account to the user we just created, open the Automation Account, click the Assets tab, click New, then select the "Add Credential" option.

[Screenshot: adding a credential asset to the Automation Account]

Choose the ‘Windows PowerShell Credential’ credential type, input the ScriptUser username and a description. Click Next.

[Screenshot: choosing the Windows PowerShell Credential type]

On the Define Credential page, enter the username again and the account’s password, then click the tick.

[Screenshot: the Define Credential page]

Creating a Runbook

Next a Runbook must be created. The Runbook contains the PowerShell code to control the VMs and a schedule defining how frequently the script should be run.

Give the runbook a name and description and configure it to use the automation account created in the previous step.

[Screenshot: creating the runbook]

Azure will create the runbook – click the Edit Runbook link to load up the runbook properties.

[Screenshot: the Edit Runbook link]

Open the new Runbook then click the Author tab. This displays a simple PowerShell script editor.

[Screenshot: the Author tab script editor]

Input the script below.
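The runbook is only a few lines; roughly speaking it looks like this (classic service-management cmdlets; the workflow name should match your runbook name, the credential asset name matches the one created earlier, the subscription name is a placeholder, and the Where-Object filter is where you'd exclude VMs or cloud services that should stay online):

```powershell
workflow Stop-AllAzureVMs
{
    # Credential asset added to the Automation account earlier
    $cred = Get-AutomationPSCredential -Name 'ScriptUser'

    # Authenticate and choose the subscription to work against
    Add-AzureAccount -Credential $cred
    Select-AzureSubscription -SubscriptionName 'My Subscription'

    # Every VM that is currently running
    $vms = Get-AzureVM | Where-Object -FilterScript { $_.Status -eq 'ReadyRole' }

    foreach ($vm in $vms)
    {
        Stop-AzureVM -ServiceName $vm.ServiceName -Name $vm.Name -Force
    }
}
```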

Click the Publish button to save and publish the Runbook script.

[Screenshot: publishing the runbook]

Defining a Schedule

Next a schedule needs to be defined on the runbook. This controls how often the runbook is executed. Click the Schedule link at the top of the runbook page.

[Screenshot: the Schedule link on the runbook page]

Create a new schedule and give it a name and description.

[Screenshot: naming the new schedule]

Configure the occurrence of the schedule – I want my VMs to shut down at 11pm every night.

[Screenshot: setting the schedule occurrence]

The schedule is now visible on the runbook page.

[Screenshot: the schedule attached to the runbook]

Testing

Start some VMs.

[Screenshot: running VMs in the portal]

Go to the runbook and click the “Start” button at the bottom of the window.

[Screenshot: starting the runbook manually]

The runbook will begin and start to shut down the Azure VMs. Within a couple of minutes all your VMs will be off and there’s more money in your pocket to spend on coffee.

[Screenshot: runbook output – the VMs shut down]

Other Notes

Something I learned writing this post is that positional PowerShell parameters do not work when writing Azure workflows! You need to name parameters explicitly, for example using Where-Object's -FilterScript parameter.

Posted in Azure, PowerShell, Scripting

Bike Tour Business Intelligence: Part 6 [Building the Fact Table]

At this stage the BI database contains the Date, Time and Location Dimension tables as well as a staging table with all the points captured by my GPS. This post covers creating and populating the Fact table, as well as running some queries over the dataset.

To recap, the star schema design has three Dimension tables all related to the Fact table, as shown below:

[Diagram: star schema for the GPX data]

Creating the Fact Table

The Fact table creation flow involves two steps – the creation and population of the Fact table itself, followed by the creation of a clustered index (on all key columns) and the foreign key relationships to the dimension tables.

[Screenshot: Fact table creation flow in SSIS]

The SQL SELECT .. INTO query makes it easy to create and populate the Fact table:
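Stripped right down, the query has this shape (the staging and fact column names here are illustrative, not the real ones):

```sql
SELECT  d.DateKey,
        t.TimeKey,
        l.LocationKey,
        s.DistanceKm,
        s.SpeedKmh
INTO    dbo.FactGpsPoint
FROM    dbo.StagingGpsPoint AS s
JOIN    dbo.DimDate     AS d ON d.[Date] = CAST(s.PointTimestamp AS date)
JOIN    dbo.DimTime     AS t ON t.[Time] = CAST(s.PointTimestamp AS time(0))
JOIN    dbo.DimLocation AS l ON l.Shape.STIntersects(s.PointGeography) = 1;
```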

With some additional Execute SQL Tasks to prepare the database for the data load and to clear down the staging tables, the final SSIS workflow is complete:

[Screenshot: the full SSIS workflow]

Running it against all my GPX files results in a Fact table of just over 350,000 rows.

Some sample queries

With all the data loaded it’s hard to resist writing some queries…. which I did, and then promptly found some outlying data points. Cycling in the Czech Republic was not a highlight and we took a shortcut through Prague by train. As I still had my GPS recording, it captured the journey leaving a couple of 50km+ segments which skew the data. I added a filter to the Fact table generation query so that only points under 10km apart are imported, ensuring only data generated by cycling is available for analysis.

What days did we cycle the farthest?
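Using the illustrative column names from the sketch above, the query is essentially:

```sql
SELECT TOP 10
       d.[Date],
       SUM(f.DistanceKm) AS TotalKm
FROM   dbo.FactGpsPoint AS f
JOIN   dbo.DimDate      AS d ON d.DateKey = f.DateKey
GROUP BY d.[Date]
ORDER BY TotalKm DESC;
```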

[Screenshot: query results – days with the greatest distance]

What’s the longest daily ride per country?

[Screenshot: query results – longest daily ride per country]

Did we ride further in the morning or the afternoon?

[Screenshot: query results – morning vs afternoon distance]

Which day of the week did we have the highest moving average speed (assuming moving means a speed of more than 5 km/h)?

Answer…. none in particular!

[Screenshot: query results – average moving speed by day of the week]

Summary so far

The DW database now contains all the points recorded by my GPS – next up I will be creating an Analysis Services Cube and performing some analytics over the dataset.

Posted in SQL Server, SSIS

Bike Tour Business Intelligence: Part 5 [Dimensions]

Our star schema has three dimensions – Date, Time and Location. This post will cover populating the dimension tables in preparation for cube generation.

Date Dimension

[Diagram: DimDate dimension table]

There are lots of date dimension generation scripts online; Mike Davies' script is good, and I modified it to remove some columns that aren't useful. The table is created and populated using an Execute SQL Task.

Time Dimension

[Diagram: DimTime dimension table]

Another Execute SQL Task runs a script also written by Mike Davies, modified to use the SQL Server time data type for the Time column.

Both tasks are kept in a Sequence Container which runs after the staging table generation flow.

[Screenshot: Date and Time dimension tasks in a Sequence Container]

Location Dimension

[Diagram: DimLocation dimension table]

The Location Dimension is based on the NUTS dataset, storing the shape data and the denormalised location details for each area. The overall flow is slightly more complex than for the date and time dimensions – first the DimLocation table is created, then populated, and then some indexes are created.

[Screenshot: DimLocation Container Design]

Populating the DimLocation table is a straightforward Extract-Load Data Flow Task running a query against the Nuts2008_Geog and Nuts2008_Level3 tables we imported into SQL Server in Part 3. The data source is a SQL query which retrieves the shape data alongside the NUTS Level 3/2/1/0 region codes and names, and the results are saved to an OLE DB Destination pointing to the BI database.

The SQL query to get the shapes and location data (and sample result set) is below.
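In outline it looks something like this – the column names are placeholders, as the real NUTS tables use their own naming:

```sql
-- Sketch only: the shape table keys on the NUTS level 3 code, and the level 3
-- table carries the denormalised codes/names for levels 2, 1 and 0
SELECT  g.Shape,
        l.Level3Code, l.Level3Name,
        l.Level2Code, l.Level2Name,
        l.Level1Code, l.Level1Name,
        l.CountryCode, l.CountryName
FROM    dbo.Nuts2008_Geog   AS g
JOIN    dbo.Nuts2008_Level3 AS l ON l.Level3Code = g.NutsCode;
```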

[Screenshot: sample rows returned by the location query]

Finally two indexes are built on the table – the clustered index and a spatial index on the Shape column.

Posted in SQL Server, SSIS