Joe Salvatore called out for help on an interesting use of the SSIS Lookup component: Can you lookup

a row using a value within a range?

Joe specified his criteria as:

  1. DataStagingSource.ModifyDate < DataWarehouseDimension.RowExpiredDate AND
  2. DataStagingSource.ModifyDate >= DataWarehouseDimension.RowEffectiveDate AND
  3. DataStagingSource.NaturalKey = DataWarehouseDimension.NaturalKey

Easy! To show how it is done I’ve created a test database “SCRATCH” on my local machine, and

created two tables and some data with the following script:

CREATE Table dbo.sourcedata (naturalkey varchar(32), modifydate smalldatetime)

insert into dbo.sourcedata (naturalkey, modifydate) values (‘a’,’1 Jan 2006′)

create table dbo.lookupdimension (naturalkey varchar(32), roweffectivedate smalldatetime,

rowexpireddate smalldatetime, surrogatekey int)

insert into dbo.lookupdimension (naturalkey, roweffectivedate, rowexpireddate, surrogatekey)

values (‘a’, ’11 dec 2005′, ’28 feb 2006′, 1)

For an example of what we want to get in our output, run:

SELECT  sourcedata.naturalkey,
sourcedata.modifydate,
lookupdimension.surrogatekey
from sourcedata
LEFT OUTER join lookupdimension
on sourcedata.naturalkey = lookupdimension.naturalkey
and sourcedata.modifydate >= lookupdimension.roweffectivedate
and sourcedata.modifydate < lookupdimension.rowexpireddate

Create a SSIS package with a connection to the database. Add a Dataflow task, and a OLEDB

Source, Lookup, and some destination for the data (I used a Flat File Destination as it requires

the least configuration when you just want a proof of concept). Here's how my DataFlow task

looks:

 

Make sure the OLE DB Source connecter selects all the columns from the dbo.sourcedata

table. Hook up the OLE DB Source and Lookup.
Then open the properties of the Lookup component by double-clicking it, or right-click and

choose Properties. Select the dbo.lookupdimension to be used as lookup source.

 

In the Columns tab, make sure the naturalkey has a relationship between the tables. Drag

and drop the modifydate column to roweffectivedate. Select surrogatekey as

the output.

 

Now here is where the trick comes in. Jump to advanced, and tick Enable memory

restriction, then Modify the SQL Statement. You can now modify the select statement

that is used by the component to perform lookups. Lets set it up with the same rules that Joe

wants:

select * from (select * from [dbo].[lookupdimension]) as refTable where [refTable].[naturalkey]

= ? and ? >= [refTable].[roweffectivedate] and ? < [refTable].[rowexpireddate]

 

Now click the Parameters button. You should be able to set three parameters here.

naturalkey should be the input column for the first parameter, with modifydate for
both the other two.

Configure your destination, and add a Data Viewer to the flow if you want to be able to see the

results.

 

Your lookup should be able to locate the row in the dbo.lookupdimension table and return

the surrogatekey value.