Tuesday, 31 May 2016

SSIS – Folder Structures

The Folder Structure

The SSIS folder structure is usually contained under a single root directory.  For simplicity, I often name this folder SSISYeah, I’m creative.  Off of the root folder, I’ll add folders for Packages and WorkFiles.
The Packages folder often contains many folders underneath it.  These folders are for all of the SSIS projects that have been deployed.  For every project there will be a Project folder.  The Project folder may have other folder beneath them, such as a Configurations folder.
The WorkFiles folder will be similar to the Packages folder in the sense that there will be Project folders below it.  TheProject folders will have other folders such as ArchiveLogsResponse, or Temporary.
If you are good at visualizing, the folder structure will look like this diagram:

The Folder Purpose

Now that this has been described, less put some definition and understanding behind these folders.
  • Packages: Everything under the Packages folder is considered “code”.  Because of this status these folders are under change management control.  People who are not the gatekeepers should only be able to browse these folders, if that.  These folders should be backed up on a regular basis.
  • Project (under Packages): These folders contain all of the SSIS package files (dtsx) and configuration files (dtsconfig) for each deployed project.  A project could be a single SSIS project or a collections of projects based on a client’s business.  Typically, I store SSIS packages outside of the MSDB database in the file system.  This has worked best for change management and with security in the environments.
  • WorkFiles: Everything under the WorkFiles folder is considered temporary.  These folders will store files to be imported, exported, or log files.  Business users are often granted access to these folders depending on their projects and whether they need to pick-up, drop-off, or review files.
  • Project (under WorkFiles): These folders contain project, same project definition from above, related files.  They can be logs, data received for import, or data exported.
  • Archive (Workfiles –> Project): After a file has been imported by an SSIS package, the imported file is dated and placed in this folder.
  • Logs (Workfiles –> Project): Log files generated from the SSIS package and the SQL Agent job steps are placed here.  These need to be in a place where more than those running the jobs can find them.
  • Response (Workfiles –> Project): When the SSIS package has data that cannot be imported and is rejected for some reason – that data will go in this folder.
  • Temporary (Workfiles –> Project): This folder stores the raw files that are generated during an SSIS package execution.  This keeps them out of the way and is an easy place to just delete everything in the unlikely scenario that there are space constraints.

Folder Structure Wrap-Up

There are countless other folders that you could add, but this should cover a lot of the basics.  These have been most of the basics that I try to use.
This has been about how I do things.  The tips and tricks I’ve learned to keep things going and to reduce headaches.  These aren’t necessarily the right answers, but they are my answers.

SSIS - Data Flow Breakpoints

 Data Viewers Definition
Data Viewers are containers that can contain a single SSIS buffer’s work.  As the data passes through the Data Path you are able to view what is in each buffer.  As each buffer passes into the Data Viewer, the processing of the SSIS package on that Data Path is paused.  In essence, you have a breakpoint.  But not really.
There are four types of data viewers.  These are:
  • Grid: Displays selected columns display in a tabular format.
  • Histogram: Displays distribution of numeric values and works only with numeric data.
  • Scatter plot: Displays scatter plot graph works only with numeric data.  You are able to set the x-axis and y-axis.
  • Chart format: Displays a column chart of the count of discrete values in the selected column.
To be perfectly honest, I’ve only ever found a use for the Grid Data Viewer.  There may be situations that I’m unaware of that the others are of great benefit.  Unfortunately, I’ve never come across those.

Data Viewer Setup

Now setting up a Data Viewer on a Data Path is fairly simple.  First, go into the Data Flow and select the Data Path that you want to pause the data processing.  You can right-click to open the properties or double click on the Data Path.  Just like make sure your cursor is on the red “X” below.
Once you have that, the Data Flow Path Editor will open.  From this screen you can tab into the Data Viewers tab to get the options for Data Viewers.
The types of Data Viewers will appear on the left-hand side of the screen.  It’s shown below at the green arrow.  All four of the types from above will be listed.  As I mentioned, the only one I use is the Grid.  The other three have never been of use to me.  I’ve looked at them a few times and just thought they were pretty.
Before moving on, it’s advisable if you are putting these into your SSIS package to name them with something meaningful.  The blue arrow and bubble above indicate the place where the name can be typed
Now this next step is optional but I do recommend that you consider this.  When you create a Data Viewer Grid you have the option to remove some columns from the grid.  This can be extremely handy when the Data Path has more than a handful of columns.  Since you are likely adding the Data Viewer to look for specific values, you may as well only return the values that are of use.
When you are finished, select OK and you’ll return to the Data Flow.  From here you will notice that a little icon has been added to the Data Flow Path to indicate that there is one or more Data Viewers on the path.
Yes, that’s correct.  You can have more than one Data Viewer on the same Data Flow Path.  Depending on what you are troubleshooting it may be worthwhile have two or three Data Viewers that return different columns.

Data Viewer In Use

Once you have a Data Viewer in place on a package, go ahead an fire it up.  The execution of the package will change when there are Data Viewers.
When the package begins, a contained will open for each Data Viewer.  By default, these will float over the window.  If you choose, you can dock these, that is up to you.
On the container there are a few things to take notice of.  First, in the image below the yellow area is all of the data currently contained in the Data Viewer.  Within it you can highlight one row or multiple rows.
If you want to copy out the data, you need to use the Copy Data button, indicated at the red arrow.  The use of Crtl-C to copy doesn’t do a dang thing.  And that sucks.
If the data you are looking for isn’t in the current buffer, you can move the package forward to the next buffer.  You do this with the Play button located at the orange arrow.  Selecting this will clear the Data Viewer and it will populate with the next buffer of data.
Lastly, if you are done with the Data Viewer and want to stop using it you can detach it from the Data Flow Path.  This is done at the green arrow.  By detaching, the Data Viewer retains it’s current data while the package continues with processing the rest of the Data Flow.
As I mentioned, Data Viewers are a lot like Breakpoints.  Due to this, you can use the Breakpoint window to disable, enable, and delete data viewers in your project.  This can be extremely handy when you have a half dozen or more data viewers you just want to get rid of.

Data Viewers Wrap-Up

Now that we’ve gone through this, I hope you can see the power and benefit of using Data Viewers.  When you are troubleshooting an SSIS package trying to figure out where some of the data is becoming out of wack.
Above, I mentioned a couple bits where I compared Breakpoints to Data Viewers.  In a lot of regards, they function similar to one another.  There is one way, though, in which they are very different.  When a breakpoint is encountered, the entire package execution is paused until the break is cleared.  When a Data Viewer is encountered only the data in that Data Flow Path is paused.  While these can be in effect the same thing, in more complicated packages this might cause unexpected results.
Finally, you may have trouble using Data Viewers when you are processing large volumes of data.  If the row that is causing the issue is a few thousand or million rows into the data set you will likely want to do a little bit more than just add a Data Viewer.  You may need filter the data some to make the Data Viewer useful for troubleshooting.


Thank You