Satellite table template used in the DWH Deck
The next version of the DWH Deck will support templates. This is the documentation of the satellite table template.
Variables/placeholders used:
@Database, the name of the data vault database e.g.: DWH_dv
@MetaDatabase, the name of the meta database e.g.: DWH
@BusinessArea, the name of the business area, implemented as schema e.g.: sales
@Table, the name of the satellite table e.g.: Categories_hub_sat
@ParentTable, the name of the related hub or link for this satellite
@SourceTableName, the full identifying name of the related staging table e.g.: [DWH_st].[sales].[Northwind_dbo_Categories]
@ModelTableID, is used to track the model table to the audit table
@Sqn, the suffix that is used for the surrogate key in the link or hub (parent)table
@SourceNumber, when there is more than one source for the link table
@LookupParentSqn, a sql statement to find the related parent identifier like:
SELECT MAX([Categories_hub_sqn])
FROM [sales].[Categories_hub] R WITH (NOLOCK)
WHERE (st.[CategoryID = R.[CategoryID] OR (st.[CategoryID] IS NULL AND R.[CategoryID] IS NULL)
@@ColumnList, a list of columns that are part of the satellite:
[CategoryName]
, [Description]
@@ColumnTableList, a list of columns with the data type defining the table:
[CategoryName] varchar(64), [Description] varchar(max)
@@ColumnViewList, a list used in the view to get the related column info from the staging area
[CategoryName] = st.[CategoryName], [Description] = st.[Description]
@@ColumnWhereList, a list of Where statements checking existents in target satellite table
(src.[CategoryName] = trgt.[CategoryName] OR (src.[CategoryName] IS NULL AND trgt.[CategoryName] IS NULL)
AND (src.[Description] = trgt.[Description] OR (src.[Description] IS NULL AND trgt.[Description] IS NULL)
Satellite Table Template
The SQL statement used to create the hub table:
USE [@Database]
GO
--save old data (if any) into backup of table with prefix _
IF EXISTS (SELECT 1 FROM [sys].[objects]
WHERE object_id=OBJECT_ID(N'[@BusinessArea].[@Table]') AND type in (N'U'))
BEGIN
IF (SELECT COUNT(*) FROM [@BusinessArea].[@Table])>0
BEGIN
IF EXISTS (SELECT 1 FROM [sys].[objects]
WHERE object_id=OBJECT_ID(N'[@BusinessArea].[_@Table]') AND type in (N'U'))
DROP TABLE [@BusinessArea].[_@Table];
SELECT * INTO [@BusinessArea].[_@Table]
FROM [@BusinessArea].[@Table];
END
DROP TABLE [@BusinessArea].[@Table]
END
--create table
CREATE TABLE [@BusinessArea].[@Table]
( [@ParentTable@Sqn] INT NOT NULL
, [LoadDate] DATETIME NOT NULL
, [LoadDateEnd] DATETIME
, [AuditID] INT
, @@ColumnTableList
, CONSTRAINT [pk_@Table] PRIMARY KEY ([@ParentTable@Sqn],[LoadDate])
, CONSTRAINT [fk_@Table] FOREIGN KEY ([@ParentTable@Sqn]) REFERENCES [@ParentTable]
);
CREATE INDEX [ix_@Table] ON [@Database].[@BusinessArea].[@Table]
( [LoadDate]
, [LoadDateEnd]);
The part of the statement to drop and recreate the table is rather large (as in all data vault tables) This is helpful if you have accumulated some history in your link table and you made a change in the definition. All data is first stored in a new table with the prefix "_" before dropping the table. Once you have created your new link table you can still access the old data and import it manual from this _table to the newly created link table.
Satellite View Template
The SQL statement used to create a view which is used to import the data from the staging table:
USE [@Database]
GO
--drop previous view if it exists
IF EXISTS (SELECT 1 FROM [sys].[views]
WHERE object_id = OBJECT_ID(N'[@BusinessArea].[vwImport_@Table@SourceNumber]'))
DROP VIEW [@BusinessArea].[vwImport_@Table@SourceNumber];
--create view
USE [@Database]
GO
--@
CREATE VIEW [@BusinessArea].[vwImport_@Table@SourceNumber]
AS
SELECT DISTINCT TOP 100 PERCENT src.*
FROM (SELECT
[@ParentTable@Sqn]= ISNULL((@LookupParentSqn
),0)
@@ColumnViewList
FROM @SourceTableName st WITH (NOLOCK)) src
--check if source (src) row exists in target (trgt)
WHERE NOT EXISTS (SELECT 1
FROM [@BusinessArea].[@Table] trgt WITH (NOLOCK)
WHERE ISNULL(src.[@ParentTable@Sqn],0)=ISNULL(trgt.[@ParentTable@Sqn],0)
AND @@ColumnWhereList))
ORDER BY [@ParentTable@Sqn]
Satellite ETL Template
The SQL Statement to create a stored procedure that is used to import the data:USE [@Database]
GO
--drop previous etl-procedure if it exists
IF EXISTS (SELECT * FROM [sys].[objects]
WHERE object_id = OBJECT_ID(N'[@BusinessArea].[uspImport_@Table@SourceNumber]')
AND type in (N'P', N'PC'))
DROP PROCEDURE [@BusinessArea].[uspImport_@Table@SourceNumber]
--create etl-procedure
USE [@Database]
GO
--@
CREATE PROCEDURE [@BusinessArea].[uspImport_@Table@SourceNumber]
AS
--get the audit_id
DECLARE @AuditID INT
INSERT INTO [@MetaDatabase].[dbo].[Audit] (StartDate, ModelTableID)
VALUES( getdate(), @ModelTableID);
SET @AuditID=@@IDENTITY
DECLARE @DATE DATETIME;
DECLARE @ROWSINSERTED INT;
SET @DATE=GETDATE();
--insert into target
INSERT INTO [@BusinessArea].[@Table]
( [AuditID]
, @@ColumnList
) SELECT
@AuditID AS AuditID
, @DATE AS LoadDate
, * FROM [@BusinessArea].[vwImport_@Table@SourceNumber] ;
-- get rowCount of inserted rows
SET @ROWSINSERTED=@@ROWCOUNT
--enddate the old entries of this satellite
UPDATE [@BusinessArea].[@Table]
SET LoadDateEnd=@DATE
WHERE [@ParentTable@Sqn] IN
(SELECT [@ParentTable@Sqn]
FROM [@BusinessArea].[@Table]
WHERE LoadDateEnd IS NULL
GROUP BY [@ParentTable@Sqn]
HAVING COUNT(*)>1 )
AND LoadDate<@DATE
-- update audit table with the number of rows
UPDATE [@MetaDatabase].[dbo].[Audit]
SET RowsInserted=@ROWSINSERTED, RowsUpdated=@@ROWCOUNT, EndDate=getdate()
WHERE AuditID=@AuditID;
