Saturday, May 30, 2015

Implement transactions in SSIS.

  • You can set package transactions at the entire package level or at any control flow container level or task level.
  • Transactions in SSIS use the Microsoft Distributed Transaction Coordinator(MSDTC); the MSDTC service needs to be started on the computer for transactions to work. 
  • To enable a transaction within a package, you need to set the TransactionOption property of the task or container to Required. The TransactionOption property exists at the package level and at the container level, as well as for almost any control flow task. 

It can be set to one of the following:
  • Required : If a transaction already exists, join it; if not, start a new transaction.
  • Supported : If a transaction exists, join it (this is the default setting).
  • NotSupported : The package, container, or task should not join an existing transaction.

If a series of tasks must be completed as a single unit in which either all the tasks are successful and committed or an error occurs and none of the tasks are committed,place the tasks within a sequence container and set the TransactionOption property of the container to Required.

Create the simple package.


























This package is quite basic in that all it does is insert some data into the table
and then the last task will deliberately fail. Open SSMS and run the following code.

CREATE TABLE Transaction_test
(
id int
)

In the first task, insert some data into the table you just created with the following code:

Insert into Transaction_test(id) values (1)

To make the final task fail at runtime, enter the following code:

Insert into Transaction_test(id) values ("A")

Run the package with no transactions in place and see what happens.

























The first task succeeds, and the second fails.If you go to your database, you should see data inserted.





















Now run the following code in SSMS to delete the data from the table:

Truncate table Transaction_test

Next, you want to set up the package to start a transaction that the tasks can join.You do that by
setting the properties of the package.Set the TransactionOption property to Required.



You now need to tell the tasks in the package to join this transaction, by setting their
TransactionOption properties to Supported.



























Now when you re-execute the package, a DTC transaction will be started by the package, all the
tasks will join, and because of the failure in the last task, the work in the package will be undone.
Go back to SSMS and query the table. You should see no data in it.

Tuesday, May 26, 2015

Deploying SSIS Packages

SQL Server 2012 Integration Services introduces a new approach for deploying SSIS projects, known as Project Deployment Model.

The project deployment model allows each SSIS project to be treated as a single unit, to be configured and deployed as a whole. A project can contain one or more packages, one or more parameters, and one or more connection managers.Under the project deployment model all of these are deployed to the destination SSIS server as a single unit.

Deployed SSIS projects are stored in the SSISDB catalog.The SSISDB catalog is used to store SSIS project and package definitions, as well as other SSIS metadata.

The key features of this new deployment model are the Integration Services catalog, environments, and parameters. The new deployment model is designed to make deployment and administration of SSIS packages and ETL systems easier across multiple environments.

In SQL Server 2005, 2008 you would deploy packages into SQL Server’s msdb database or to the filesystem. Now, in SQL Server 2012’s project deployment model, you deploy to the SSIS catalog.

An SSIS package created in an earlier version of SQL Server cannot be deployed directly to the SSIS server; it first needs to be converted by the SQL Server Integration Services Project Conversion Wizard to compile it into a deployment file, which can then be deployed to the SSISDB catalog.

Setting the SSIS Catalog

When you first install Integration Services, the SSIS catalog will need to be created. Create it manually by right-clicking on the Integration Services folder in the Database Engine in SSMS and selecting Create Catalog.




















A create catalog wizard,  guides you through the process of creating the catalog.you must also have the Common Language Runtime (CLR) integration turned on by checking the Enable CLR box.The default name of the database is SSISDB and cannot be renamed.The Create Catalog prompt will ask for a password. This password creates a database master key, and the master key encrypts sensitive data in the catalog.Click OK to create the database and catalog.
  • Enable CLR Integration  -  Checked
  • Enable Automatic Execution Of Integration Services Stored Procedure At SQL Server Startup  - Unchecked
  • Name Of The Catalog Database  -  SSISDB
  • Password  - Provide your password.











Saturday, May 23, 2015

New in SQL Server Integration Services 2012

1.Data Taps
Data taps are a new feature in SQL Server 2012 for troubleshooting package execution. You can capture a copy of the data from a specific data path in your data flow tasks in a comma-delimited file at run time.Much like data viewers in design time.Data taps enable you to analyze data in a production environment.For more info

2.New Shared Connection Managers
In prior versions of SSIS, connection managers were scoped to individual packages, not projects. An SSIS project did not allow shared connections across packages inside the project.
This release allows you to create connection managers at the project level that can shared by multiple packages in the project. The connection manager you create at the project level is automatically visible in the Connection Managers tab of the SSIS Designer window for all packages.

3.SSIS Parameters at the Project, Package, and Task Level
Integration Services (SSIS) parameters allow you to assign values to properties within packages at the time of package execution. You can create project parameters at the project level and package parameters at the package level. 

4.UNDO Feature.
In SQL Server Integration services 2012, you can undo and redo up to 20 actions in the SSIS Designer.Undo and redo is not possible in prior versions of SSIS.

5.Support for ODBC Data Sources and Destinations.
This is a new component in SSIS 2012.This connects to a defined ODBC source/destination by using native ODBC.In previous versions we use OdbcDataProvider in ADO.NET.

6.New Deployment Feature
Project Deployment model is the new deployment model in SSIS 2012.Which uses its own designated database called SSISDB.In package deployment model a package is the single unit but in project deployment model deploy the whole project to the SSIS catalog, 

7.New Management Dashboard Feature
This report shows activity on the package execution.

Sunday, May 17, 2015

Implementing SSIS Environment Variables

Step 1 : Create Parameters in SSDT

Parameters can be defined at the package level and at the project level. Package-level parameters are visible only to tasks and components within that package. Package parameters are defined in the $Package namespace. Parameters defined at the project level are global—all packages within the project are able to make use of them. Project parameters are defined in the $Project namespace.

In this example i have created two package level parameters and marked as required,as shown in the below image.









Parameters that are marked as Required must have their values specified at runtime. 
Required parameters will not use the design time value when the package runs—a new value must be specified by the caller 
If a parameter's required property is set to false,its design time value will be used if no other value is supplied in run time.

If a parameter is marked as Sensitive, its value will be stored in an encrypted format within the package


Step 2: Parameterize Initial catalog.

In this example i have parameterized the Initial catalog.




























You can also configure Connection string,user name,server name etc













I set the parameter to be Required ,must specify their values at runtime.

After parameterizing the connection manager, save the SSIS solution, and then open
the Parameters tab of the SSIS package pane.You can see the above created parameter in the parameter tab.



Step 3: Deploying Packages to the Package Catalog

Now let’s deploy the project to our SQL Server instance.
The Deployment Wizard is the easiest way to deploy packages to the catalog. You can access
the wizard through SSDT by right-clicking a project and selecting Deploy. 





















After finishing the deployment you can see the deployed project under SSIS DB catalog.












  











Step 4: Create Environments and add variables to it.

Environments allow you to set values to all parameters within projects through their variables at runtime. These environment variables can also be used to configure connection managers so that data can be sourced and loaded according to the simulation you need to run.

To create an environment, navigate to the Environments subfolder where you deployed your project. Right-click the subfolder and select Create Environment.  
I created two environments 'Dev' and 'Prod'



























After creating the environment, you can add variables to it.



























You can also copy the name and value for each parameter from the SSDT parameters window into the environment variable window.























The environment variables can be mapped to the different project and package parameters that are defined.
For this navigating to the project in the Integration Services catalog.
right clicking it,and selecting Configure.


























This page allows you to define the different environments that can pass values to
the project during execution.
Add environments in environment reference tab.

























After the environment reference is set up, you can map project parameter to variables within the environment. 
























Parameters tab allows you to map values of all the parameters contained within the project to the environment variables values. 



Tuesday, May 12, 2015

Data Warehouse vs Data Mart

Typical data warehouses have these characteristics:
  • Holds multiple subject areas
  • Holds very detailed information
  • Works to integrate all data sources
  • Does not necessarily use a dimensional model but feeds dimensional models.

The data mart is a subset of the data warehouse which is usually oriented to a specific business line or team (Finance has their data mart, marketing has theirs, sales has theirs and so on).

Wednesday, May 6, 2015

SSIS Topics

Server Modes of an Analysis Services Instance

Analysis Services can be installed in one of three server modes:
  • Multidimensional and Data Mining (default)
  • PowerPivot for SharePoint
  • Tabular. 
The easiest way to determine the server mode of Analysis Services instance of SQL Server 2012 is to connect to the server in SQL Server Management Studio and note the icon next to the server name in Object Explorer.














The server mode of an Analysis Services instance is determined during setup when you choose options for installing the server.

Tuesday, May 5, 2015

Data taps in SSIS

  • Data taps are a new feature in SQL Server 2012 for troubleshooting package execution. 
  • You can capture a copy of the data from a specific data path in your data flow tasks in a comma-delimited file at run time.
  • Much like data viewers in design time.
  • Data taps enable you to analyze data in a production environment.

Information needed for Data tap


Folder Name : The Name of the folder in SSIS server.


Project Name : The name of the SSIS project that holds the package

Package Name : The package name.





















Data Flow Path ID : The value of the IdentificationString property of the data flow path.







Data flow task path : The path of the data flow task within the package.This can be read from the PackagePath property of the selected data flow task in SSDT.








































The data files will be created in installation drive.The path is 'C:\Program Files\Microsoft SQL Server\110\DTS\DataDumps'.
I have specified the data file name as output.csv










 









We need to execute three below stored procedures aganist SSISDB
1.catalog.create_execution
  Create an execution instance of a package.

2.catalog.add_data_tap
  Add a data tap.

3.catalog.start_execution
  Start the execution instance of the package.

Sample script

Declare @execid bigint

EXEC [SSISDB].[catalog].[create_execution]
@folder_name=N'Advance SSIS',
@project_name=N'Advanced SSIS',
@package_name=N'ForEachLoop Container.dtsx',
@execution_id=@execid OUTPUT

EXEC [SSISDB].[catalog].add_data_tap @execution_id = @execid,
@task_package_path = '\Package\Load Flat Files',
@dataflow_path_id_string = 'Paths[File Extract.Flat File Source Output]',
@data_filename = 'output.txt'

EXEC [SSISDB].[catalog].[start_execution] @execid





Foreach Loop Container in SSIS

Loops through a series of fi les or records in a data set, and then executes the tasks in the container for each record in the collection.

Enumerator
An enumerator specifies the collection of objects that the container will loop through.
The Foreach loop container supports the following enumerator.
1.Foreach File Enumerator : Performs an action for each file in a directory with a given file extension
2.Foreach Item Enumerator: Loops through a list of items that are set manually in the container
3.Foreach ADO Enumerator : Loops through a list of tables or rows in a table from an ADO recordset
4.Foreach ADO.NET Schema Rowset Enumerator : Loops through an ADO.NET schema
5.Foreach From Variable Enumerator : Loops through a SQL Server Integration Services (SSIS) variable
6.Foreach Nodelist Enumerator : Loops through a node list in an XML document
7.Foreach SMO Enumerator : Enumerates a list of SQL Management Objects (SMO)

Foreach Loop container commonly Used for executing a set of operations repeatedly based on an enumerable collection of items (such as files in a folder, a set of rows in a table, or an array of items).

TRY IT

Loop through a collection of flat files in a directory and load them to a SQL table

Configuration.
  • Specify the path of the files directory.
  • Specify what kind of file extension the file have.(eg. for text files '*.txt').
  • Create a variable 'FileName' to store the location of the file name while looping.
  • Use the above created variable as an expression for the  file connection manager that points to the flat file.
Because the flat files each have a different name, you can use the Variable
Mappings tab in Foreach loop editor to dynamically change a variable value for each iteration of the loop. That variable then can pass the correct file name to the Flat File Connection with an expression.

Steps :
1.Create a new package
2.Drag a Data Flow Task onto your designer
3.Create a new Flat File Connection Manager,and point it to the first file. Also, check the
Column names in the first data row option and go to the Columns page to ensure all the columns
are defined properly; then click OK.
4.In the Data Flow, bring a new Flat File Source. Open the Flat
File Source Editor by double-clicking the Flat File Source and make the connection manager
the newly created one. Then click OK.
5.Next, create an OLE DB Connection Manager, using the AdventureWorks2012 database.
6.Bring an OLE DB Destination in the Data Flow and connect the Data Flow path
from the source to it.
7.Drag a Foreach Loop Container.
8.Place the Data Flow Task inside the Foreach Loop Container; then open the Foreach Loop
Editor by double-clicking the top banner portion of the container. On the Collections tab,
select Foreach File Enumerator from the Enumerator property drop-down box. The Foreach
File Enumerator is the default when you open the editor.
9.Set the folder property.for bring back all text files in the directory set Files Property to *.txt.
10.Create a new variable.
11.in the variable mapping tab,select the variable name.
Each time the container runs the variable value will change to the current file.
12.The last step is to put an expression on the File Connection Manager that uses
the variable you just created inside the Foreach Loop Container. Select the connection managers and press F4 to bring up the Properties window. Click the ellipsis next to the Expressions property to open
the Property Expressions Editor. Select ConnectionString from the Property drop-down and
then click the ellipsis in the Expression box.
13.In the top left of the Expression Builder, expand the Variables and Parameters folder and
drag the variable down into the Expression box. If you try to click Evaluate
Expression now, there will be no result. Remember that this expression will be populated at
run time, so you will see nothing here yet.
14.The package is now complete.A succesful run will loop through all the files and load to the table.

Forloop container and Foreachloop conatainer
For Loop iterates through the content of the container a number of times you define or you define with an expression, whereas the Foreach Loop Container iterates for every member of the enumerator.If there are no members, the container will not execute.

Monday, May 4, 2015

Implementing checkpoints in SSIS

You can configure a package to start from the point of failure when the package is rerun. In SSIS, this configuration process is called checkpoints.
  • Restart failed packages from the point of failure.
Some of the uses of Checkpoints are 

1.Avoid repeating the downloading and uploading of large files. For example, a package that downloads multiple large files by using an FTP task for each download can be restarted after the downloading of  a single file fails and then download only that file.

2.Avoid repeating the loading of large amounts of data. For example, a package that performs bulk inserts into dimension tables in a data warehouse using a different Bulk Insert task for each dimension can be restarted if the insertion fails for one dimension table, and only that dimension will be reloaded.

3.Avoid repeating the aggregation of values. For example, a package that computes many aggregates, such as averages and sums, using a separate Data Flow task to perform each aggregation, can be restarted after computing an aggregation fails and only that aggregation will be recomputed.

Checkpoint file
  • Checkpoint file stores the information about package execution.
  • The current values of variables are also captured in the checkpoint file. 
  • Integration Services captures the restart point in the checkpoint file.

Configuring Checkpoints

To implement checkpoints in your package, you must configure several properties at the package level:
  • CheckpointFileName: 
              Specifies the full path and filename of your checkpoint file.
  • CheckpointUsage: 
             Specifies when to use checkpoints. The property supports the following three options:
1.Never: A checkpoint file is not used.
2.IfExists: A checkpoint file is used if one exists.This option is the one most commonly used if enabling checkpoints on a file.
3.Always: A checkpoint file must always be used. If a file doesn't exist, the package fails.
  • SaveCheckpoints: 
            Specifies whether the package saves checkpoints. Set to True to enable checkpoints on the package.
  • FailPackageOnFailure property to True

Properties
1.CheckpointFileName
2.CheckpointUsage
3.SaveCheckpoints
4.FailPackageOnFailure