Posts

Showing posts from April, 2022

What are import and export column transformations?

Import Column Transformation  – The Import Column transformation reads data from files and adds the data to columns in a data flow. Using this transformation, a package can add text and images stored in separate files to a data flow. * Export Column Transformation  – The Export Column transformation reads data in a data flow and inserts the data into a file. *

What are the SSIS package protection levels?

  There are 6 different types of protection levels. Do not save sensitive – (When exporting using DTUTIL specify for protection- 0) Encrypt sensitive with user key – 1 Encrypt sensitive with password – 2 Encrypt all with password -3 Encrypt all with user key – 4 Rely on server storage Do not save sensitive:  makes the sensitive data unavailable to other users. If a different user opens the package, the sensitive information is replaced with blanks and the user must provide the sensitive information. Encrypt sensitive with user key:  Uses a key that is based on the current user profile to encrypt only the values of sensitive properties in the package. Only the same user who uses the same profile can load the package. If a different user opens the package, the sensitive information is replaced with blanks and the current user must provide new values for the sensitive data. If the user attempts to execute the package, package execution fails. Encrypt sensitive with password:...

What are the for each loop enumerators available in SSIS?

  Below are the lists of various types of enumerators provided by SSIS Foreach Loop Container: Foreach  File Enumerator : It enumerates files in a folder. The plus point here is it can traverse through subfolders also. Foreach  Item Enumerator : It enumerates items in a collection. Like enumerating rows and columns in an Excel sheet. Foreach  ADO Enumerator : Useful for enumerating rows in tables. Foreach  ADO.NET Schema  Rowset  Enumerator : To enumerate through schema information about a data source. For example, to get list of tables in a database. Foreach   From  Variable Enumerator : Used to enumerate through the object contained in a variable. (if the object is enumerable) Foreach   NodeList  Enumerator : Used to enumerate the result set of an XML Path Language (XPath) expression. Foreach  SMO Enumerator : It enumerates through SQL Server Management Objects (SMO) objects.

Does using “RowCount” transformation affects the package performance?

  Rowcount component is a synchronous component and it doesn’t actually do anything particularly resource intensive means the performance degradation of your package should be negligible. We do use this component to capture the number of inserts, deletes and updates from each data-flow and then using “OnPost Execute” event this information would be written to a SQL Server table.

What is the property “RunInOptimized”? How to set this property?

  If this property is set to true then the SSIS engine ignore the unused/unmapped columns. Means it does not allocate memory to store data for those columns. At the compilation phase itself SSIS engine identifies what are the columns from source are using across the package, if it finds any columns are neither using nor mapping to destination, it simply ignores all those columns. We can set this property at two levels “Project Level” and “Package Level”. Project Level: From project properties → Debugging → RunIn*****. By default “FALSE” Package Level: Can find in DataFlow properties. By default “TRUE”

What are the fast load options available in SSIS?

  The OLE DB Destination provides more than one way to load data in the destination (5 types of Data Access Mode).  Use Fast Load option while loading data into the destination. Data Access Mode  – It allows to define the method to upload data into the destination. The fast load option will use  BULK INSERT  statement instead of  INSERT  statement. If the fast load option is not selected then by default INSERT is used. Keep Identity  – If selected, the identity values of source are preserved and the same are uploaded into the destination table. Else destination table will create its own identity values if there is any column of identity type. Keep Nulls  – If selected, the null values of the source are preserved and are uploaded into the destination table. Else if any column has default constraint defined at destination table and NULL value is coming from the source for that column then in that case, default value will be inserted into the de...

Can you name 5 or more of the native SSIS connection managers?

  OLEDB connection  – Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server) Flat file connection –  Used to make a connection to a single file in the File System. Required for reading information from a File System flat file ADO.Net connection –  Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task Analysis Services connection –  Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task File connection –  Used to reference a file or folder. The options are to either use or create a file or folder Excel FTP HTTP MSMQ SMO SMTP SQL Mobile WMI

Can you explain the SQL Server Integration Services functionality in Management Studio?

  You have the ability to do the following: Login to the SQL Server Integration Services instance View the SSIS log View the packages that are currently running on that instance Browse the packages stored in MSDB or the file system Import or export packages Delete packages Run packages

What are the command line tools to execute SQL Server Integration Services packages?

  DTSEXECUI –   When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package. DTEXEC –  This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.

What is the Execution Tree?

  Execution trees demonstrate how package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation and each new tree may also give you an additional worker thread.

How checkpoint works in for loop?

  For each Loop, the container is another atomic unit of work that can be restarted. However, the checkpoint file does not contain information about the work completed by the child containers, and the for each Loop container and its child containers run again when the package restarts.

How to send an SSRS report from SSIS?

Often there is a requirement to be able to send an SSRS report in Excel, PDF or another format to different users from an SSIS package once it has finished performing a data load. In order to do this, first, you need to create a subscription to the report. You can create an SSRS report subscription from Report Manager. At the report subscription, you can mention the report format and the email address of the recipient. When you create a schedule for the SSRS report, a SQL Server Agent Job will be created. From the SSIS, by using sp_start_job and passing the relevant job name you can execute the SSRS report subscription.

What is the data flow buffer?

SSIS operates using buffers which is kind of an in-memory virtual table to hold data. We are able to set max rows/size but this gets more interesting with blocking transformations

Explain why variables called the most powerful component of SSIS?

    Variable allows us to dynamically control the package at runtime. Example: You have some custom code or script that determines the query parameter’s value. Now, we cannot have a fixed value for the query parameter. In such scenarios, we can use variables and refer the variable to the query parameter. We can use variables for like: Updating the properties at runtime, Populating the query parameter value at runtime, used in a script task, Error handling logic and with various looping logic.

Error Handling in SSIS?

  An error handler allows us to create flows to handle errors in the package is quite an easy way. Through event handler tab, we can name the event on which we want to handle errors and the task that needs to be performed when such an error arises. We can also add sending mail functionality in event of any error through SMTP Task in the Event handler. This is quite useful in event of any failure in office non-working hours. In the Data flow, we can handle errors for each connection through following failure path or red arrow.

What is the Data Path and how is it different from a Precedence Constraint?

  Data Path is used in a Data Flow task to connect to different components of a Data Flow and show the transition of the data from one component to another. A data path contains the Meta information of the data flowing through it, such as the columns, data type, size, etc. When we talk about differences between the data path and precedence constraint; the data path is used in the data flow, which shows the flow of data. Whereas the precedence constraint is used in control flow, which shows control flow or transition from one task to another task

What is the Retain Same Connection property and what is its impact?

   Whenever a task uses a connection manager to connect to the source or destination database, a connection is opened and closed with the execution of that task. Sometimes you might need to open a connection, execute multiple tasks and close it at the end of the execution. This is where Retain Same Connection property of the connection manager might help you. When you set this property to TRUE, the connection will be opened on the first time it is used and remain open until execution of the package completes.

Explain Union all Transformation?

  It works in the opposite way to merge transformation. It can take the output from more than 2 input paths and combines into a single output path.

Explain Sort Transformation?

  This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.

Explain the Percentage and row sampling Transformations?

  This transformation will take data from the source and randomly sampling data. It gives you 2 outputs. First is selected data and the second one is unselected data. It is used in a situation where you train a data mining model. These two are used to take the SAMPLE of data from the input data.

What is the Difference between Execute TSQL Task and Execute SQL Task?

  In SSIS there is one task Execute TSQL task which is similar to Execute SQL task. We will see what is the difference between the two. Execute the TSQL Task:   Pros:  Takes less memory, faster performance Cons:  Output into variable not supported, Only supports ADO.net connection Execute SQL Task: Pros:  Support output into variables and multiple types of connection, parameterized query possible. Cons:  Takes more memory, slower performance compared to the TSQL task.

What is a transaction in the SSIS package and how to implement it?

  Packages use transactions to bind the database actions that tasks perform into atomic units. and by doing this maintain data integrity.  Al MS IS container types – packages the For loop, For each loop, and Sequence containers, and the task hosts that encapsulate each task can be configured to use transactions. IS provides three options for configuring transactions: Not supported, Supported, and Required. Require  indicates that the container starts a transaction unless one is already started by its parent container. if a transaction already exists, the container joins the transaction, For example, if a package that is not configured to support transactions includes a Sequence container that uses the Required option, the Sequence Container would start its own transaction. If the package were configured to use the Required option, the Sequence container would join the package transaction. Supported  indicates that the container does not start a transaction, but joins...

Execute package task?

  The Execute Package Task enables you to build SSIS solutions called parent packages that execute other packages called “Child Packages”.  You’ll find this capability an indispensable part of your SSIS development as your packages begin to grow. Separating packages into discrete functional workflows makes for shorter development and testing cycles and facilitates best development practices. in SSIS, the child package is aware of the parent package parameters and can reach out and access those parameters–level configuration values.  The majority of the configurable properties are in the Package tab of the Executable package Task Editor. The first option is to provide the location of the child package.  The options here are either the File system and SQL Server.  You can deploy the SSIS package in the FIle system task as a .dtsx file or within the msdb database of a SQL Server instance.  if you select a file system, you must first create a new connection man...

OLE DB Command Transformation?

  The OLE DB Command Transform is a component designed to execute a SQL Statement for each row in an input stream. This task is analogous to an ADO Command Object being created, prepared, and executed for each row of a result set. The input stream provides that data for parameters that can be set into the SQL Statement that is either an Inline statement or a stored procedure call.

What is the multicast Transformation in SSIS?

  The Multicast transform, as the name implies, can send single data input to multiple output paths easily. You may want to use this transformation to send a path to multiple destinations sliced in different ways.  The multicast transformation is similar to the Split Transformation because both send data to multiple outputs. However, you can not specify the conditions for which part of the data will be in which output in the Multicast transformation.

What is the Data Profiling task?

  Data profiling is the process of analyzing the source data to better understand what condition the data is in, in terms of cleanliness, patterns, numbers or nulls, and so on. data profiling tasks usually be used at the beginning of the development cycle to support the design of the destination database schema. Note that this task is not used when you develop the normal recurring ETL packages.

Explain the importance of config files in SSIS

  The config file is used for providing inputs to the connection manager about the properties used by the packages and tasks at the run-time. This is especially useful when the changes are deployed on multiple locations or servers as we do not have to worry every time package runs about configuration. Depending on the server or location, the config files are automatically picked and used. We can store config properties in multiple ways. Some of them are: XML configuration file:  Data can be stored as an XML file. Environment variables:  The config data can be stores as part of environment variables. Registry entry:  The config can be stored in the registry. Parent package variable:  The config can be stored as a variable in a package of a task. SQL Server:  The config can be stored in a database table on the SQL Server.

What is SSIS breakpoint?

Image
  An SSIS breakpoint is a property that allows developers to debug and review the variable or data status at any point by pausing the execution of the package in the project. The breakpoints are configured in Business Intelligence Development Studio, generally known as BIDS. The following steps are followed to apply or remove SSIS breakpoints: Go to the control flow interface within the BIDS section. Right-click on the object that we want to apply or remove the breakpoint. Select “Edit Breakpoint” and enable/disable the breakpoint.

How is the deployment utility created in SSIS?

  Deployment is nothing but a process of updating the state of the package from development mode to executable mode. Deployment in SSIS can be done by simply performing the right-click operation on the “Integration Services Project” and clicking on the   build   option. This would create a “package.dtsx” file inside the bin folder named “projectbin”. The utility helps in deploying packages at SQL server or as an executable file on a specified location. Following are the steps followed to create a deployment utility: Go to Project and right-click on it. Double click on the  properties . Choose the path location for deployment by selecting the box next to “True” under Create Deployment Utility. Save changes. Close this window. Right-click again on the project and select the  build  option. This creates a deployment folder in the project’s root within the BIN folder. The deployment folder will have a .manifest file. Double-clicking on the .manifest file and se...

Difference between Merge Transformation and Union all transformations.

Image
  Merge Transformation : This does the task of merging data from 2 sources to single output. This is useful when we need to break the data flow path to a separate error path and once the error is handled, merge that back to the main data flow. To apply this transformation, the data should be sorted first and the metadata information for the 2 paths should be the same. The sorting can be done by applying sort transformation before the merge or by providing an ORDER BY clause in the source connection. Union All Transformation : Union all transformation working is similar to merge transformation but the only difference is that this does not require sorted data. Data from different sources are transformed and combined into a single output.

What is Checkpoint in SSIS

Image
  SSIS checkpoint is a property that provides a point of restart from the failure point. When the checkpoint property is set to true, a checkpoint file is created which consists of all the relevant information required to run the package from the point of failure. If the package is run successfully, the file is deleted and is recreated again when the package runs. This is particularly useful as it avoids re-running the whole project when the execution fails.

What is Variable? Explain Different types of Variables in SSIS

  SSIS permits the creation of two variable types-  global  and  task-specific  variables. The scope of global variables is available for all tasks of a specified process. The task-specific variables have scope specific to the task.

Define SSIS Catalog. Is it possible to deploy user-defined packages in the catalog?

  SSIS catalog is the database where all deployed packages are stored securely. This helps to control and handle the deployed packages effectively. The users can deploy their packages too. All the deployed packages are stored in the catalog as a centralized database.

What do you understand by SSIS expressions?

  SSIS Expressions are used for filtering information based on parameters and conditions to get desired information. They make lives easy by helping to work with conditions, loops, and dynamic connections. The expression could consist of symbols, literals, identifiers, operators, and functions. They are mainly used for updating properties dynamically at runtime. Once we have the expressions ready, an expression evaluator parses this expression for validating the rules of the expression. The expressions are used for the below 2 cases: Control Flow: Here, the variable values are derived using other variables employing expressions. For validating any condition, precedence constraints are used. Used for setting properties dynamically at runtime by tasks and containers. Data Flow: Expressions are used on rows to get conditional split transformations evaluating to true or false. Helps to create a new column based on the result of applying expressions to get derived transformations on new...