
How To: Configure a For Loop Container

Integration Services 2005 / SSIS (DTS)
  1. Start in the package Control Flow, with no object selected (Press ESC in the main window).
  2. Right-click the package background and select Variables.

  3. Add a variable called LoopIteration, with the Int32 data type, to the package.
  4. Add a For Loop Container to the package.

  5. Edit the For Loop Container by double-clicking it, or right-clicking it and choosing Edit.
  6. Set the InitExpression to @[User::LoopIteration]=0
  7. Set the EvalExpression to @[User::LoopIteration]<5 where 5 is the number of loops you want to run.
  8. Set the AssignExpression to @[User::LoopIteration]=@[User::LoopIteration]+1
  9. Your settings should now look like this:
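Taken together, the three expression properties on the For Loop Container are:

    InitExpression:    @[User::LoopIteration] = 0
    EvalExpression:    @[User::LoopIteration] < 5
    AssignExpression:  @[User::LoopIteration] = @[User::LoopIteration] + 1

Inside the loop, the counter behaves like any other variable. As a hypothetical example, you could embed the current iteration in an Execute SQL Task by setting an expression on its SqlStatementSource property, casting the Int32 counter to a string:

"SELECT " + (DT_WSTR, 10) @[User::LoopIteration] + " AS IterationNumber"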




Converting SSIS Data Types to .Net Data Types

Integration Services 2005 / SSIS (DTS)

If you use Script Tasks, the CLR, or build custom components for SSIS, you will need to convert values from their SSIS data types into .Net types. Here's a quick list of the SSIS data types and their .Net counterparts.

Integration Services Data Types     Managed Data Type
DT_WSTR, DT_TEXT, DT_STR            System.String
DT_BYTES, DT_IMAGE                  System.Byte[]
DT_DBTIMESTAMP, DT_DATE             System.DateTime
DT_NUMERIC, DT_DECIMAL              System.Decimal
DT_I1                               System.SByte
DT_I2                               System.Int16
DT_I4                               System.Int32
DT_I8                               System.Int64
DT_BOOL                             System.Boolean
DT_R4                               System.Single
DT_R8                               System.Double
DT_UI1                              System.Byte
DT_UI2                              System.UInt16
DT_UI4                              System.UInt32
DT_UI8                              System.UInt64
DT_GUID                             System.Guid

Books Online also lists these conversions.
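As a side note, the same DT_* names double as cast operators in the SSIS expression language, so you can often convert a value to the type you need before it ever reaches .Net code. A few illustrative casts (the literal values and the User::Counter variable here are made up):

    (DT_I4)"123"                             casts a string to DT_I4 (System.Int32)
    (DT_WSTR, 10)@[User::Counter]            casts an integer variable to DT_WSTR (System.String)
    (DT_DBTIMESTAMP)"2006-01-01 00:00:00"    casts a string to DT_DBTIMESTAMP (System.DateTime)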




Generating a new SSIS Package GUID with the dtutil Utility

Integration Services 2005 / SSIS (DTS)

If you make a copy of a .dtsx file, the Package ID property remains the same as the original's. This value is logged as the SourceID, so to ensure it is unique you should generate a new ID for any copy of a package.

dtutil /FILE "Copy of sample.dtsx" /IDRegenerate

Or the shorthand version:

dtutil /FILE "Copy of sample.dtsx" /I
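If you have copied a whole folder of packages, a command prompt loop can regenerate every ID in one pass. This is just a sketch, assuming every .dtsx file in the current folder should receive a new ID:

for %f in (*.dtsx) do dtutil /FILE "%f" /I

(Double the percent signs, %%f, if you run this from a batch file.)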



Simon Sabin's SSIS Raw File Reader

Integration Services 2005 / SSIS (DTS)

Ever used a Raw File destination in SSIS? They're fast, writing rows from the pipeline to disk as quickly as possible. Raw files are useful for increasing the throughput of a data flow task when used in extract / transform procedures.

The reason they are so efficient is that they store the data in one big binary file. Try to read your data back from these optimised files, though, and you'll have a lot of trouble figuring out what's what. Simon's saved us all the hassle!

Download his new tool and simply point it at your raw file. It will load the data so you can browse through it in a data grid. Much easier!

If you have trouble selecting your files (I've been using the file extension .dtsxraw on mine), just type *.* and hit Enter in the open file box to see all the files in your folder. Simon, that might be your first bug to fix!




Using Expressions with SSIS Maintenance Tasks and SelectedDatabases

Integration Services 2005 / SSIS (DTS)

It's been noted by a few people that Expressions cannot be used to configure the Shrink Database Task at runtime. The problem is that the SelectedDatabases property is a collection, which cannot be created by an expression. The same problem occurs in most of the Maintenance Task items in SSIS.

Maintenance Plan Task components are simple wrappers around existing SQL commands. The SQL run by the Shrink Database Task is similar to:

USE [scratch]
GO
DBCC SHRINKDATABASE(N'scratch', 1, TRUNCATEONLY)

Instead of using the Shrink Database Task, I prefer using an Execute SQL Task and applying an expression to that. This also gives you greater control over the shrinking process: you can shrink just an individual data or log file if you wish.

Replacing the Shrink Database Task with the more flexible Execute SQL Task

Remove the Shrink Database Task from the package, and add an Execute SQL Task in its place.

Assuming that the name of the database you want to shrink is in the variable DBName, you would set the Expression for the SqlStatementSource property of the Execute SQL Task to:

"DBCC SHRINKDATABASE(N'" + @[User::DBName] + "', 1)"

So Simple.

If you want to shrink multiple databases, you can place the Execute SQL Task inside a Foreach Loop Container.

You could use the DBCC SHRINKFILE command if you want to shrink just the log or an individual data file; DBCC SHRINKDATABASE shrinks all of the data and log files for a database.
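For example, to shrink only the log file of the scratch database, the Execute SQL Task could run something like the following (this assumes the log file's logical name is scratch_log; check sys.database_files for the actual name):

USE [scratch]
GO
DBCC SHRINKFILE (N'scratch_log', 1)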

If you must use the Maintenance Plan Task component, Kirk Haselden has written an example of a package that modifies itself at runtime. You could use this method, though in this case it might make the problem more complex.




No!!! Undo, Undo, Undo!!

Integration Services 2005 / SSIS (DTS)

Or rather, why is there no "undo" when working with Integration Services packages?

This is something that has puzzled me since working with Beta 2 of Integration Services. I guess I figured it would be fixed by RTM, and I've just worked around the issue by regularly checking in finalised changes, and saving only when I'm sure of the changes I've made.

But it hardly seems logical that, while Business Intelligence Studio is built on the Visual Studio IDE, undo doesn't work. Even simple undos, such as undoing changes to annotation textboxes, or undoing layout changes to diagrams in Control / Data Flows. Gah! I meant to double-click on that task, not move it! Undo! Please?

Under the covers, SSIS packages are saved as XML files. So why can't Business Intelligence Studio simply keep a history of delta changes to that XML document? Even if it's too difficult to undo the delta changes between versions, it should be simple enough to refresh (or reopen) from the change history and reset the working environment.

In the meantime though, I guess I'll just have to keep using the "undo checkout" feature of SourceSafe.

If you want to vote on having "undo", or other changes included in the tools for the next version of SQL Server, have a look at the MSDN Product Feedback Centre for SQL Server.




SSIS Lookup with value range

Integration Services 2005 / SSIS (DTS)

Joe Salvatore called out for help on an interesting use of the SSIS Lookup component: Can you lookup a row using a value within a range?

Joe specified his criteria as:

  1. DataStagingSource.ModifyDate < DataWarehouseDimension.RowExpiredDate AND
  2. DataStagingSource.ModifyDate >= DataWarehouseDimension.RowEffectiveDate AND
  3. DataStagingSource.NaturalKey = DataWarehouseDimension.NaturalKey

Easy! To show how it's done, I've created a test database "SCRATCH" on my local machine, with two tables and some data created by the following script:

CREATE TABLE dbo.sourcedata (naturalkey varchar(32), modifydate smalldatetime)

INSERT INTO dbo.sourcedata (naturalkey, modifydate) VALUES ('a', '1 Jan 2006')


CREATE TABLE dbo.lookupdimension (naturalkey varchar(32), roweffectivedate smalldatetime, rowexpireddate smalldatetime, surrogatekey int)

INSERT INTO dbo.lookupdimension (naturalkey, roweffectivedate, rowexpireddate, surrogatekey) VALUES ('a', '11 Dec 2005', '28 Feb 2006', 1)

For an example of what we want to get in our output, run:

SELECT  sourcedata.naturalkey,
	sourcedata.modifydate,
	lookupdimension.surrogatekey
FROM sourcedata
LEFT OUTER JOIN lookupdimension
        ON  sourcedata.naturalkey  =  lookupdimension.naturalkey
        AND sourcedata.modifydate >= lookupdimension.roweffectivedate
        AND sourcedata.modifydate <  lookupdimension.rowexpireddate
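With the sample data above, the modifydate of 1 Jan 2006 falls between the row's effective and expired dates, so the query should return a single matched row:

naturalkey  modifydate           surrogatekey
----------  -------------------  ------------
a           2006-01-01 00:00:00  1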

Create an SSIS package with a connection to the database. Add a Data Flow task, and within it an OLE DB Source, a Lookup, and some destination for the data (I used a Flat File Destination, as it requires the least configuration when you just want a proof of concept).

Make sure the OLE DB Source connector selects all the columns from the dbo.sourcedata table. Hook up the OLE DB Source and the Lookup, then open the Lookup component by double-clicking it, or right-clicking it and choosing Edit. Select dbo.lookupdimension as the lookup source.


In the Columns tab, make sure naturalkey has a relationship drawn between the two tables. Drag and drop the modifydate column onto roweffectivedate. Select surrogatekey as the output column.


Now here is where the trick comes in. Jump to the Advanced tab, tick Enable memory restriction, and then tick Modify the SQL statement. You can now modify the SELECT statement the component uses to perform lookups. Let's set it up with the same rules that Joe wants:

select * from (select * from [dbo].[lookupdimension]) as refTable where [refTable].[naturalkey] = ? and ? >= [refTable].[roweffectivedate] and ? < [refTable].[rowexpireddate]


Now click the Parameters button. You should be able to set three parameters here: naturalkey should be the input column for the first parameter, with modifydate for the other two.
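In other words, the mapping for the three ? markers in the statement above is:

    Parameter 0:  naturalkey   (matched against [refTable].[naturalkey])
    Parameter 1:  modifydate   (compared against [refTable].[roweffectivedate])
    Parameter 2:  modifydate   (compared against [refTable].[rowexpireddate])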

Configure your destination, and add a Data Viewer to the flow if you want to be able to see the results.


Your lookup should be able to locate the row in the dbo.lookupdimension table and return the surrogatekey value.




How to use OUTPUT parameters with SSIS Execute SQL Task

Integration Services 2005 / SSIS (DTS)

Yesterday, while trying to get OUTPUT parameters to work with the SSIS Execute SQL Task, I encountered a lot of problems which I'm sure other people have experienced too. BOL Help is very light on this subject, so consider this the lost page in help.

The problem comes about because different providers expect parameters to be declared in different ways. OLEDB expects parameters to be marked in the SQL statement with ? (a question mark), and uses ordinal positions (0, 1, 2...) as the Parameter names. ADO.Net expects you to use the parameter name in both the SQL statement and the Parameters page.

In order to use OUTPUT parameters to return values, you must follow these steps while configuring the Execute SQL Task:

For OLEDB Connection Types:

  1. You must select the OLEDB connection type.
  2. The IsQueryStoredProcedure option will be greyed out.
  3. Use the syntax EXEC ? = dbo.StoredProcedureName ? OUTPUT, ? OUTPUT, ? OUTPUT, ? OUTPUT. The first ? captures the return code. Use the syntax EXEC dbo.StoredProcedureName ? OUTPUT, ? OUTPUT, ? OUTPUT, ? OUTPUT if you don't need the return code.
  4. Ensure a compatible data type is selected for each Parameter in the Parameters page.
  5. Set each output parameter's Direction to Output.
  6. Set the Parameter Name to the parameter marker's ordinal position: the first ? maps to Parameter Name 0, the second ? to Parameter Name 1, and so on (see the sketch after this list).
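As a sketch of how the OLEDB pieces line up, suppose the stored procedure is the dbo.AddCustomer used later in this article, with two input parameters, one OUTPUT parameter, and the return code captured (the variable names here are hypothetical):

SQLStatement:  EXEC ? = dbo.AddCustomer ?, ?, ? OUTPUT

    Parameter Name   Direction     Variable
    0                ReturnValue   User::ReturnCode
    1                Input         User::CustomerName
    2                Input         User::CustomerDOB
    3                Output        User::Result_CustomerID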

For ADO.Net Connection Types:

  1. You must select the ADO.Net connection type.
  2. You must set IsQueryStoredProcedure to True.
  3. Put only the stored procedure's name in SQLStatement.
  4. Ensure the data type for each parameter in Parameter Mappings matches the data type of the variable as declared in your SSIS package.
  5. Set each output parameter's Direction to Output.
  6. Set the Parameter Name to the same name the parameter is declared with in the stored procedure (see the sketch after this list).
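The equivalent ADO.Net setup for the same hypothetical procedure is SQLStatement set to just dbo.AddCustomer, IsQueryStoredProcedure set to True, and the parameters mapped by name:

    Parameter Name   Direction   Variable
    @CustomerName    Input       User::CustomerName
    @CustomerDOB     Input       User::CustomerDOB
    @CustomerID      Output      User::Result_CustomerID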

For other connection types, check out the table on this page.

Note: if you choose the ADO/ADO.Net connection type, parameters will not have data types like LONG, ULONG, etc.; the data types change to Int32, etc. Make sure the data type is EXACTLY the same type as the variable defined in your package. If you choose a different data type (bigger/smaller/different type) you will get the error:

Error: 0xC001F009 at Customers: The type of the value being assigned to variable "User::Result_CustomerID" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
Error: 0xC002F210 at Add New Customer, Execute SQL Task: Executing the query "dbo.AddCustomer" failed with the following error: "The type of the value being assigned to variable "User::Result_CustomerID" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
To fix this error, make sure the data type you select for each parameter on the Parameters page exactly matches the data type of the variable.

If you have attempted to use a connection type other than ADO.Net with named parameters, you will receive this error:

Error: 0xC002F210 at Add New customer, Execute SQL Task: Executing the query "exec dbo.AddCustomer" failed with the following error: "Value does not fall within the expected range.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Named parameters can only be used with the ADO.Net connection type. To use OUTPUT parameters with the OLEDB connection type, use ordinal position numbering: 0, 1, 2, 3, etc.


OUTPUT parameters are extremely useful for returning small fragments of data from SQL Server, instead of having a whole recordset returned. You might use OUTPUT parameters when you want to load a value into an SSIS package variable so that the value can be reused in many places. The value that is output might be used for configuring / controlling other Control Flow items, rather than being part of a data flow task.

If you were using output parameters in Management Studio, your SQL statement might look something like:

DECLARE @Name       nvarchar(125)
DECLARE @DOB        smalldatetime
DECLARE @CustomerID int
EXEC dbo.AddCustomer @CustomerName = @Name, @CustomerDOB = @DOB, @CustomerID = @CustomerID OUTPUT
PRINT @CustomerID

If you attempt to use the same syntax (the EXEC line above) with an Execute SQL Task, you could end up with the error message:

Error: 0xC002F210 at Add New customer, Execute SQL Task: Executing the query "EXEC dbo.AddCustomer @CustomerName = @Name" failed with the following error: "Must declare the scalar variable "@Name".". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
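The fix, per the OLEDB steps above, is to rewrite the statement with ? parameter markers and map the variables by ordinal position (0, 1, 2) on the Parameter Mapping page:

EXEC dbo.AddCustomer ?, ?, ? OUTPUT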


The only hint SQL Server 2005 Books Online gives is:

QueryIsStoredProcedure
Indicates whether the specified SQL statement to be run is a stored procedure. This property is read/write only if the task uses the ADO connection manager. Otherwise the property is read-only and its value is false.
(from SSIS Designer F1 Help > Task Properties UI Reference > Execute SQL Task Editor (General Page) )

There are a number of pages in Books Online that address parameter use with the Execute SQL Task, but none adequately addresses using output parameters. Articles which could do with updating:

  • How to: Map Query Parameters to Variables in an Execute SQL Task
  • Execute SQL Task Editor (Parameter Mapping Page)
  • Execute SQL Task Editor (General Page)
  • Execute SQL Task
  • Execute SQL Task (Integration Services)




Response to SSIS: Be wary of using SourceSafe

Integration Services 2005 / SSIS (DTS)

Jamie's post touches on one potential problem of having SSIS packages that are too big: errors from SourceSafe while checking in. SSIS packages are saved as XML documents, so SourceSafe has to put in a lot of effort to store the delta changes to the XML between check-ins. While I haven't encountered this problem in my environment, Jamie makes a suggestion which is very valuable: "Keep your packages as small as can be. A modular approach to development can be of real benefit".

I'd definitely agree with Jamie on this approach, especially when using SourceSafe in a team development environment. Because SSIS packages are XML, you cannot assume that SourceSafe will safely merge changes from shared checkouts. In short: you cannot have multiple developers working on the same package at once.

When designing/developing a package, put only a single flow into each Data Flow task, and group Control Flow tasks with a Sequence Container when you can. When you find your package is becoming too big, or you need more than one developer working on it at once, you can easily move a group to a new package and use an Execute Package Task in the parent to preserve the control flow.

Besides enabling your project to have more developers working at once, you also gain finer control of packages and better tracking of changes. In an article on Event Handler issues, Daniel Read takes issue with Event Handlers firing too often. If his packages had been more modular, with a parent package calling multiple (sub)packages, enabling events only in the parent package would suffice; Event Handlers in the child packages would not be necessary.

So some quick tips on using SSIS with SourceSafe:

  1. Use exclusive checkouts (not shared checkouts) for SSIS Package files.
  2. Group Control Flow tasks with Sequence Containers in case you want to break up a package later.
  3. Use the Execute Package Task to call other packages from a parent package.
  4. Log the Package Version with your custom Event Handlers so you know what version ran when.

On Jamie's other note about backing up your SourceSafe: that's a definite. I would back up any resource that is part of your work; source code is especially precious. SourceSafe can be used for tracking changes and as a little incremental backup of daily work, but backing up SourceSafe helps you recover from detrimental changes (like permanently deleting a SourceSafe project) or from corruption of the SourceSafe database.




SSIS: Handling Lookup Misses

Integration Services 2005 / SSIS (DTS)

Let me direct you to an article by Ash on the speed difference between two approaches to handling failed lookups in SQL Server Integration Services.

Failed lookups are likely to happen at some point, and in most cases you won't want your package to fail as a result; rather, a default value, a derived value, or some other handling routine should be used.

There are plenty of ways to handle rows redirected as a result of a failure. Ash provides a good reason why ignoring the failures (instead of redirecting the rows) is the better option: memcopy. For a row to go down a different path, its data must be copied into that path's buffer. Ignoring the failure means no copy of the row is made.

While the timings don't show a significant difference for a single lookup, if you were loading a huge amount of data into multiple fact tables with lots of lookups, you could gain a lot of performance overall. Ash's test shows why benchmarking different methods can often find a more efficient approach.


