The System Administrator Notebook

Designing Workflow in SSIS Packages

Page Contents

The Control Flow tab in the Package Designer Window of BIDS is used to design workflow for the tasks that constitute the package. Tasks are added to the control flow via the Toolbox. Double-clicking the task in the Control Flow window will open the Task Editor, which is used to configure the Task

The Expressions tab on a task, is used to set values dynamically at runtime and can hold constant values, SSIS variables or expressions.

All task expose a set of properties that configure behaviour of the task:

Name
Name of task
Description
description of task instance, used as tooltip text
Disable
prevents the task from executing. Useful when testing a package to prevent specific tasks from running
DelayValidation
properties set in the task are not validated until runtime. Useful for building a task when disconnected from data sources
ExecValueVariable
Used to specify a variable to store the execution output for the task
FailPackageOnFailure
Used to fail package if task fails. Normally you would want to setup error handlers to deal with task failures
FailParentOnFailure
Used to fail parent package if task fails
TransactionOption
Used to enable/disable transactions for then task
IsolationLevel
Set the transaction isolation-level for the task
LoggingMode
Used to set logging options for the task

All the tasks in the Control Flow can be grouped into a variety of containers. By default a single task is placed in a Task Host container. A group container allows you to organise a group of tasks visually on the Control Flow design surface, but does not add functionality to the group. By contrast, a Sequence container collects a group of tasks together and provides a scope for:

  • properties
  • variables
  • execution constraints for external tasks
  • event handlers

A 'For Loop' container allows you to iterate over a group of tasks according to the value of a variable. The Loop is configured via three properties of the container:

  1. InitExpression - to initialise loop variable
  2. EvalExpression - to test condition controlling loop entry
  3. AssignExpression - to change value of loop variable

A 'Foreach Loop' container allows you to iterate a group of tasks for each member of a defined collection.

Script Tasks

Top Bottom

There are two available types of Scripting Task:

  1. ActiveX Script Task
  2. Script Task

ActiveX Script tasks are included only for backwards compatibility. Script tasks allow access to a Visual Studio Tools for Applications (VSTA) environment to write and execute scripts in either C#.NET or VB.NET.

The Edit Script button on the Script Task Editor will open a Visual Studio shell with a template in the chosen scripting language. Each script should end by assigning to 'Dts.TaskResult', which is used to determine the task result. In C# this is done as follows:

public void Main()
{
  if ( we were successful ) { 
	Dts.TaskResult = (int)ScriptResults.Success; 
  } else { 
	Dts.TaskResult = (int)ScriptResults.Failure; 
  }
}

Package variables are made available to scripts by setting the ReadOnlyVariables and ReadWriteVariables properties in the Script Task Editor. The variables listed in these properties can be accesssed in the script via the Dts.Variables collection either by name or array index:

	strText = Dts.Variables["User::MyString"].Value.ToString;
	intCounter = (int)Dts.Variable[0].Value;
	Dts.Variables[1].Value = 2;

When accessing the collection as an array, the first element in the collection is the first ReadOnlyVariable listed in the Script Task Editor and the last element in the collection is the last ReadWriteVariable listed. ReadWriteVariables can be used to effect the control flow after task completion, by setting the constraint to evaluate an expression using the variable.

A typical use for a script task in the Control Flow is to prepare a data extract for use by a Flat File source in subsequent Data Flow task. The following example script fixes the formatting of a delimited file that should contain 8 '|' delimiters:

//additional namespaces required
using System.Text;
using System.Text.RegularExpressions;
using System.Collections.Generic;


public void Main()
{
    Encoding encoding = Encoding.Default;
    string strInFile = "c:\\temp\\test01.txt";
    StreamReader sr = new StreamReader(strInFile, encoding);
    string strOutFile = "c:\\temp\\test01out.txt";
    StreamWriter sw = new StreamWriter(strOutFile, false, encoding);
    string pattern = @"|";
    string strLine;
    while ((strLine = sr.ReadLine()) != null)
    {
        Regex rgx = new Regex(pattern);
        MatchCollection matches = rgx.Matches(strLine);
        int intMatchesFound = matches.Count;
        int intMatchesWanted = 8;
        int intMatchesToAdd = intMatchesWanted - intMatchesFound;
        if (intMatchesToAdd > 0)
        {
            for (int x = 0; x < intMatchesToAdd; x++)
            {
                strLine += "|\"\"";
            }
        }
        sw.WriteLine(strLine);
    }
    sr.Close();
    sw.Close();
}

There are several points of interest in this script:

  1. Encoding.Default is used to ensure that the encoding for the file is in 'ANSI' format, which is required if you want to write a '|' to a notepad document.
  2. System.Text.RegularExpressions is used to do string matching
  3. System.Collection.Generic is used to capture the matches per line, although we're just interested in this example in counting the matches.
  4. System.IO is used to get StreamReader and StreamWriter objects

The path to the files used in the above script are hard-coded into variables in the script, which will cause problems if they changed in the package but not updated in the script. It would have been better to define the variables in the package and access them using the Dts.Variables collection. The following script uses Dts.Variable collection to get the value for source and destination filenames used in a file copy operation:

	
public void Main()
{
	string taskName = Dts.Variables["System::TaskName"].Value.ToString();
	bool returnValue = false;
	bool overWrite = true;
	string strCopyFrom = Dts.Variable["User::SourceFile"].Value.ToString();
	Dts.Events.FireInformation(0, taskName, "Collecting File: " + strCopyFrom, "", 0, ref returnValue);
	string strCopyTo = Dts.Variables["User::TargetFile"].Value.ToString();
	
	try
	{
		System.IO.File.Copy(strCopyFrom, strCopyTo, overWrite);
		Dts.TaskResult = (int)ScriptResults.Success;
	}
	catch (FileNotFoundException e)
	{
		string messageText = e.Message.ToString();
		Dts.Events.FireWarning(1, "File Copy", messageText, "", 0);
		Dts.TaskResult = (int)ScriptResults.Failure;
	}
}

The files could also be accessed in the script via the Dts.Connections collection. Dts.Connections can also be used to access database connections and run SQL queries:

// additional namespaces required
using System.Data.SqlClient;

public void Main()
{
    string sqlQuery = "SELECT recipe_name FROM dbo.ingredients " +
	 "WHERE preparation_time = @PREPARATIONTIME " + 
	 "AND ingredient3 = @INGREDIENT";
    string myValue = string.Empty;

    // Use the Dts ADO.NET connection "Adoconn"
    SqlConnection mySqlConn = 
	(SqlConnection)(Dts.Connections["Adoconn"].AcquireConnection(null));
    mySqlConn = new SqlConnection(mySqlConn.ConnectionString);
    mySqlConn.Open();
    
    // Prepare the statement and assign parameters
    SqlCommand cmd = new SqlCommand();
    cmd.CommandText = sqlQuery;
    SqlParameter param1 = new SqlParameter("@PREPARATIONTIME", SqlDbType.Int);
    param1.Value = 30;
    cmd.Parameters.Add(param1);
    SqlParameter param2 = new SqlParameter("@INGREDIENT", SqlDbType.VarChar);
    param2.Value = "Flour";
    cmd.Parameters.Add(param2);
    cmd.Connection = mySqlConn;
    
    // Execute the query and retrieve results
    SqlDataReader reader = cmd.ExecuteReader();
    while (reader.Read())
    {
	myValue += reader.GetString(0) + " ";
    }
    reader.Close();
    mySqlConn.Close();
    mySqlConn.Dispose();
    System.Windows.Forms.MessageBox.Show("Found: " + myValue);
    Dts.TaskResult = (int)ScriptResults.Success;
}

The connection object retrieved by AcquireConnection must be cast to the appropriate type, e.g.:

  • SqlConnection
  • OleDb.Connection
  • OdbcConnection
  • OracleConnection

SqlReader is used to retrieve one row at a time from a query. The data from a query can also be retrieved into a DataSet via an SqlAdapter object. Retrieving a DataSet allows us to write the data to another data destination such as an XML file:

// additional namespaces required
using System.Data.SqlClient;
using System.Xml;

public void Main()
{
    SqlConnection sqlConn;
    string strQuery = "SELECT * FROM dbo.ingredients";
    sqlConn = (SqlConnection)(Dts.Connections["Adoconn"]).AcquireConnection(Dts.Transaction);
    sqlConn = new SqlConnection(sqlConn.ConnectionString);
    sqlConn.Open();
    SqlCommand cmd = new SqlCommand(strQuery, sqlConn);
    SqlDataAdapter da = new SqlDataAdapter(cmd);
    DataSet ds = new DataSet();
    da.Fill(ds);
    ds.WriteXml(new System.IO.StreamWriter("C:\\temp\\test.xml"));
    sqlConn.Close();

    Dts.TaskResult = (int)ScriptResults.Success;
}

The Dts.Events object can be used in a Script Task to log events:

public void Main()
{
    string taskName = Dts.Variables["System::TaskName"].Value.ToString();
    bool returnValue = false;

    Dts.Events.FireInformation(0, taskName, 
	String.Format("Starting Loop Operation at {0} ",
	DateTime.Now.ToString("dd/MM/yyyy hh:mm:ss")), "", 0, ref returnValue);

    for(int x = 0; x <= 10; x++)
    {
	Dts.Events.FireProgress(String.Format("Loop in iteration {0}", x),
	    x * 10, 0, 10, taskName, ref returnValue);
    }
    Dts.Events.FireWarning(1, taskName, "Writing a warning", "", 0);
    Dts.Events.FireWarning(2, taskName, "Writing a debugging warning", "", 0);
    Dts.Events.FireError(0, taskName, "Here's an error message", "", 0);

    // TODO: Add your code here
    Dts.TaskResult = (int)ScriptResults.Success;
}

By default events are displayed in the Execution Results tab, but they can also be logged into SSIS logging or captured by Event Handler Control Flows. Custom SSIS log files are configured from the BIDS menu 'SSIS -> Logging...'. In this dialogue, custom logfile formats can be defined and associated to package tasks and types of events to log. The advanced tab for each event allows you to specify the event properties that get logged. The Dts.Log method can also be used to write messages to all defined log files for a task:

Dts.Log(string messageText, int dataCode, byte databytes)

Analysis Services Tasks

Top Bottom

There are three tasks related to Analysis Services:

Analysis Services Execute DDL
Used to create cubes, dimensions or other OLAP objects
Processing task
Used to process Analysis Services objects
Data Mining task
Used to run queries against data-mining models and output results to other data sources

Data Flow Task

Top Bottom

Used to apply transforms to data passing from source to destination and forms the core task in ETL applications

Each Data Flow task added in the Control Flow will have its own design surface on the Data Flow tab, where the data flow can be configured with source, destination and transform tasks. Data Flow is covered in more detail on the Data Flow page.

Data Profiler Task

Top Bottom

Used to examine data and to collect meta-data about the quality of the data. Not ordinarily used for defining a control flow condition. The Data Profiler tasks produces and output file that can be viewed using the DataProfileViewer application found in %MSSQL%\100\DTS\bin. The task is configured by selecting from a variety of pre-defined profile request types.

File System Task

Top Bottom

Provides System.IO.File operations to copy, move, delete or rename files and directories. The IsDestinationPathVariable and IsSourcePathVariable properties can be used to set source and destinations to variable values.

this is the filesystemtaskcopy

FTP Task

Top Bottom

Used to perform basic FTP operations:

  • Send or Receive Files
  • Create local or remote directories
  • Remove local or remote directories
  • Delete local or remote files

Both local and remote paths can be set to variables

Web Services Task

Top Bottom

Used to retrieve XML resultsets from Web Services

XML Task

Top Bottom

Used to validate, modify, extract or create files in XML format

Bulk Insert Task

Top Bottom

Wizard to create a bcp command at runtime. Allows setting option to 'Fire Triggers' for any INSERT triggers defined

Execute SQL Task

Top Bottom

Used to execute TSQL and stored procedures. Can be used at the beginning of a ETL operation, to empty, drop or create a target table.

Allows use of parameterised statements and capturing results in parameters or resultset objects.

The specification of parameters for SQL Statements vary according to the connection type:

Parameter Specification for SQL Statements
Connection Type Parameter Marker Parameter Naming
ADO ? param_name1, param_name2, ...
ADO.NET @param_name @param_name
ODBC ? 1,2,3,..
OLEDB ? 0,1,2,...
Excel ? 0,1,2,...

For exectuting stored procedures the specification of parameters varies according to the connection type and whether 'IsQueryStoredProcedure' option is set to True or False:

Parameter Specification Stored Procedures
Connection Type IsQueryStoredProcedure Value Stored Procedure Syntax
ADO False EXEC usp_myprocedure ?, ?
ADO True usp_myprocedure ?, ?
ADO.NET False EXEC usp_myprocedure @param_name1, @param_name2
ADO.NET True usp_myprocedure @param_name1, @param_name2
ODBC N/A {call usp_myprocedure (?,?)}
OLEDB N/A EXEC usp_myprocedure ?, ?
Excel N/A EXEC usp_myprocedure ?, ?

When specifying a SQL statement for an 'Execute SQL' task, the statement can be either input directly or via a variable or file connection.

For capturing a single-row resultset, you will need to define the binding of resultset columns to SSIS variables, ensuring the correct data type is assigned. Because SSIS has fewer data types, you may need to use CONVERT in your SQL Statement. With multi-row resultsets, the whole resultset must be captured in a single variable of type object or XML. The resultset can then be processed in a Foreach Loop operation or with an XML task.

Execute Package Task

Top Bottom

Allows you to build parent packages that execute other, child, packages, allowing a particular package to be broken down into discrete functional units.

The ExecuteOutOfProcess property can be used to execute a child package in its own memory space

Child packages can access be granted access to variables in the parent package by running the Package Configurations Wizard, accessed from the Control Flow context menu under SSIS > Package Configuration. Once the target and source variables have been defined in the wizard, the parent variable can be accessed or changed in the child. For instance to change a parent variable 'ParentVar' from a child package that stores the value in 'ChildVar' in C#.NET use:

public void Main() 
{
  Dts.Variables("ParentVar").Value = "CHILD VALUE";
  Dts.TaskResult = (int)ScriptResults.Success;
}

And the same thing in VB.NET:

Public Sub Main() 
  Dts.Variable("ParentVar").Value = "CHILD VALUE"
  Dts.TaskResult = ScriptResults.Success
End Sub

Execute Process Task

Top Bottom

Used to execute operating system executables. The properties for the task include:

  • executable
  • arguments
  • working directory
  • StandardInputVariable
  • StandardOutputVariable
  • StandardErrorVariable

Message Queue Task

Top Bottom

Used to send or receive messages via MSMQ service, typically used in Service Broker applications.

Send Mail Task

Top Bottom

Used to send e-mail via an SMTP Connection Manager

WMI Reader Task

Top Bottom

Used to run WQL queries against a server's WMI Database. To check the freespace on drive d: of the current server the WQL query can be set to:

SELECT FreeSpace, DeviceID, Size, SystemName, Description
FROM Win32_LogicalDisk 
WHERE DeviceID = 'C:'

WMI Event Watcher Task

Top Bottom

Used in Control Flow to force package to wait for a defined WMI event to occur before processing continues. WQL queries are used to define the event to wait for. For instance to wait for the appearance of a file in a specified directory:

SELECT * FROM __InstanceCreationEvent 
WITHIN 10
WHERE TargetInstance ISA "CIM_DirectoryContainsFile"
AND TargetInstanceGroupComment = 
	"Win32_Directory.Name = \"C:\\\\incoming\" "

The WMI event is not fired until the complete file has been written

SQL Server SMO Tasks

Top Bottom

SMO tasks are tasks related to SQL Server Management Objects and are used to transfer SMOs from server to server. Tasks provided are as follows:

  • Transfer Database
  • Transfer Error Message
  • Transfer Logins
  • Transfer Master Stored Procedures
  • Transfer Jobs
  • Transfer SQL Server Objects