Bike Tour Business Intelligence: Part 5 [Dimensions]

Our star schema has three dimensions – Date, Time and Location. This post will cover populating the dimension tables in preparation for cube generation.

Date Dimension

Date

There are lots of date dimension generation scripts online, Mike Davies’ script is good, I and modified it to remove some columns that aren’t useful. The table is created and populated using an Execute SQL Task.

Time Dimension

Time

Another Execute SQL Task to run a script written by Mike Davies, modifying it to use the SQL Server time data type for the Time column.

Both tasks are kept in a Sequence Container run after the staging table generation flow.

part5_dimdatetime

Location Dimension

Location

The Location Dimension’s based off the NUTS dataset, storing the shapefile and the denormalised location data for each area. The overall flow is slightly more complex than for the date and time dimensions – first the DimLocation table is created, then populated and then some indexes are created.

DimLocation Container Design

DimLocation Container Design

Populating the DimLocation table is a straightforward Extract-Load Data Flow Task running a query against the Nuts2008_Geog and Nuts2008_Level3 tables we imported into SQL Server in Part 3. The data source is a SQL query which retrieves the shape data alongside the NUTS Level 3/2/1/0 region codes and names, and the results are saved to an OLE DB Destination pointing to the BI database.

The SQL query to get the shapes and location data (and sample result set) is below.

part5_queryresults

Finally two indexes are built on the table – the clustered index and a spatial index on the Shape column.

This entry was posted in SQL Server, SSIS. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *