Tuesday, October 9, 2012

BIML Transfer DataBase Objects

Quick demo on moving tables and other object between MS-SQL environments:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Annotations>
        <Annotation>
            File: Transfer.DataBase.Objects.biml
            Demo on transferring database objects
            between MSSQL environments
            Enviroment:
            DB: MS-SQL2012
            BIML: 1.6 VS2010 BIDS Helper
            (c) John Minkjan biml101.blogspot.com
        </Annotation>
    </Annotations>
    <Connections>
        <SqlServerManagementConnection
            Name="SSMCLocalHostMSSQL2012"
            SqlServerName=".\MSSQL2012"
            UseWindowsAuthentication="true"></SqlServerManagementConnection>
            </Connections>
    <Packages>
        <Package Name ="Transfer.DataBase.Objects" ConstraintMode="Linear">
            <Tasks>
                <TransferSqlServerObjects
                    Name ="TransferSqlServerObjects"
                    SourceConnectionName="SSMCLocalHostMSSQL2012"
                    SourceDatabase="AdventureWorks2012"
                    DestinationConnectionName="SSMCLocalHostMSSQL2012"
                    DestinationDatabase="AdventureWorksDW2012"
                    DropObjectsFirst="true"                   
                    CopyData="true"
                    CopyAllTables="false">
                    <Annotations>
                        <Annotation>
                            Notation for tables:
                            [schema].[TableName]
                            (including bracktes}
                            Target schema must exist!
                        </Annotation>
                    </Annotations>
                    <Tables>
                        <Table>[HumanResources].[Department]</Table>
                        <Table>[HumanResources].[Employee]</Table>
                        <Table>[HumanResources].[EmployeeDepartmentHistory]</Table>
                    </Tables>
                </TransferSqlServerObjects>               
                                         
            </Tasks>
        </Package>
    </Packages>
</Biml>

The biml can als be downloaded here: Transfer.SqlServerObjects.biml

This will get you:

image

Till Next Time

This article is also published on www.bimlscript.com

Saturday, October 6, 2012

BIML BigData: Reading a RSS feed

Bigdata is HOT!

Based on the twitter feed reader I made here (Biml101: Reading Twitter Feed), I made a simple RSS feed reader using BIML.

The Feed URI term is passed trough the variable RSSURI. Be aware not to poll the RSS feeder every second, it probably will exclude your IP. For most feeds once every hour is enough! The package will only retrieve “new” feed items which are not jet in your table.

The Script:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Annotations>
        <Annotation>
            File: RSS.Feed.Reader.biml
            Demo on reading a twitter feed
            Enviroment:
            DB: MS-SQL2012
            BIML: 1.6 VS2010 BIDS Helper
            (c) John Minkjan biml101.blogspot.com
        </Annotation>
        <Annotation>
            RSSFeedReader Table Create script:
           
            CREATE TABLE [dbo].[RSSFeedReader](
            [ChannelTitle] [nvarchar](255) NULL,
            [ChannelDescription] [nvarchar](2048) NULL,
            [ChannelLink] [nvarchar](255) NULL,
            [ChannelLanguage] [nvarchar](255) NULL,
            [ItemDescription] [nvarchar](2048) NULL,
            [ItemGuid] [nvarchar](128) NOT NULL,
            [ItemTitle] [nvarchar](255) NOT NULL,
            [ItemPubDate] [nvarchar](50) NOT NULL,
            [ItemLink] [nvarchar](255) NULL,
            [RSSURI] [nvarchar](255) NULL,
            CONSTRAINT [PK_RSSFeedReader] PRIMARY KEY CLUSTERED
            (
            [ItemGuid] ASC,
            [ItemTitle] ASC,
            [ItemPubDate] ASC
            )
            )
        </Annotation>
    </Annotations>
    <Connections>
        <OleDbConnection
    Name="CnOleDBAdventureWorks2012"
    ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"
    RetainSameConnection="true">
        </OleDbConnection>
    </Connections>
   
    <ScriptProjects>
        <ScriptComponentProject Name="SC_RSS_READER">
            <AssemblyReferences>
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSPipelineWrap" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSRuntimeWrap" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.PipelineHost" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.TxScript" />
                <AssemblyReference AssemblyPath="System.dll" />
                <AssemblyReference AssemblyPath="System.AddIn.dll" />
                <AssemblyReference AssemblyPath="System.Data.dll" />
                <AssemblyReference AssemblyPath="System.Xml.dll" />
            </AssemblyReferences>
            <OutputBuffers>
                <OutputBuffer Name="Output0" IsSynchronous ="false">
                    <Annotations>
                        <Annotation>
                            IsSynchronous="false" ==>
                            When there are no non-synchronus buffers,
                            the compiler does not emit the CreateNewOutputRows
                            virtual base method (to match the BIDS/SSDT behavior)
                        </Annotation>
                    </Annotations>
                    <Columns>
                        <Column Name="ChannelTitle" DataType="String" Length="255"></Column>
                        <Column Name="ChannelDescription" DataType="String" Length="2048"></Column>
                        <Column Name="ChannelLink" DataType="String" Length="255"></Column>
                        <Column Name="ChannelLanguage" DataType="String" Length="255"></Column>
                        <Column Name="ItemDescription" DataType="String" Length="2048"></Column>
                        <Column Name="ItemGuid" DataType="String" Length="128"></Column>
                        <Column Name="ItemTitle" DataType="String" Length="255"></Column>
                        <Column Name="ItemPubDate" DataType="String" Length="50"></Column>
                        <Column Name="ItemLink" DataType="String" Length="255"></Column>
                        <Column Name="RSSURI" DataType="String" Length="255"></Column>
                    </Columns>                   
                </OutputBuffer>                                
            </OutputBuffers>
            <ReadOnlyVariables>               
                <Variable Namespace="User" VariableName="RSSURI" DataType="String"></Variable>
            </ReadOnlyVariables>
            <Files>
                <File Path="AssemblyInfo.cs">
                    using System.Reflection;
                    using System.Runtime.CompilerServices;
                    [assembly: AssemblyTitle("SC_RSS_READER")]
                    [assembly: AssemblyDescription("")]
                    [assembly: AssemblyConfiguration("")]
                    [assembly: AssemblyCompany("")]
                    [assembly: AssemblyProduct("SC_RSS_READER")]
                    [assembly: AssemblyCopyright("Copyright @  2012")]
                    [assembly: AssemblyTrademark("")]
                    [assembly: AssemblyCulture("")]
                    [assembly: AssemblyVersion("1.0.*")]
                </File>
                <File Path="main.cs">
                    <![CDATA[
                    using System;
                    using System.Data;
                    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
                    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
                    using System.Xml;

                    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
                    public class ScriptMain : UserComponent
                    {

                    public override void CreateNewOutputRows()
                    {
                    XmlTextReader rssReader;
                    XmlDocument rssDoc;
                    XmlNode nodeRss;
                    XmlNode nodeChannel;
                    XmlNode nodeItem;

                    string RSSURI = this.Variables.RSSURI.ToString();
                    rssReader = new XmlTextReader(RSSURI);
                    rssDoc = new XmlDocument();

                    rssDoc.Load(rssReader);

                    for (int i = 0; i < rssDoc.ChildNodes.Count; i++)
                    {

                    if (rssDoc.ChildNodes[i].Name == "rss")
                    {
                    nodeRss = rssDoc.ChildNodes[i];          

                    for (int j = 0; j < nodeRss.ChildNodes.Count; j++)
                    {
                    if (nodeRss.ChildNodes[j].Name == "channel")
                    {
                    nodeChannel = nodeRss.ChildNodes[j];
                                              
                    for (int k = 0; k < nodeChannel.ChildNodes.Count; k++)
                    {

                    if (nodeChannel.ChildNodes[k].Name == "item")
                    {
                    nodeItem = nodeChannel.ChildNodes[k];
                    Output0Buffer.AddRow();
                    Output0Buffer.ChannelTitle = nodeChannel["title"].InnerText;
                    Output0Buffer.ChannelDescription = nodeChannel["description"].InnerText;
                    Output0Buffer.ChannelLink = nodeChannel["link"].InnerText;
                    Output0Buffer.ChannelLanguage = nodeChannel["language"].InnerText;
                    Output0Buffer.ItemDescription = nodeItem["description"].InnerText;
                    Output0Buffer.ItemGuid = nodeItem["guid"].InnerText;
                    Output0Buffer.ItemTitle = nodeItem["title"].InnerText;
                    Output0Buffer.ItemPubDate = nodeItem["pubDate"].InnerText;
                    Output0Buffer.ItemLink = nodeItem["link"].InnerText;
                    Output0Buffer.RSSURI = RSSURI;                              
                    }
                    }
                    }
                    }
                    }
                    }
                    }

                    }
                    ]]>
                </File>
            </Files>
        </ScriptComponentProject>
    </ScriptProjects>
    <Packages>
        <Package Name ="RSS.Feed.Reader" ConstraintMode="Linear">
            <Variables>
                <Variable Name="RSSURI" DataType="String">http://rss.cnn.com/rss/edition.rss</Variable>               
            </Variables>
            <Tasks>
                <Dataflow Name ="DFT Read RSS Feed" >
                    <Transformations>
                        <ScriptComponentSource Name="SC RSS Feed Reader">
                            <ScriptComponentProjectReference ScriptComponentProjectName="SC_RSS_READER"></ScriptComponentProjectReference>
                        </ScriptComponentSource>
                        <Lookup Name="LKP Existing ItemGuids"
                                NoMatchBehavior="RedirectRowsToNoMatchOutput"
                                OleDbConnectionName="CnOleDBAdventureWorks2012" 
                                CacheMode="Partial">
                            <ExternalTableInput Table="[dbo].[RSSFeedReader]"></ExternalTableInput>
                            <Inputs>
                                <Column SourceColumn="ItemGuid" TargetColumn="ItemGuid"></Column>
                                <Column SourceColumn="ItemPubDate" TargetColumn="ItemPubDate"></Column>
                                <Column SourceColumn="ItemTitle" TargetColumn="ItemTitle"></Column>
                            </Inputs>
                        </Lookup>                   
                        <OleDbDestination Name="ODD dbo RSSFeedReader" ConnectionName="CnOleDBAdventureWorks2012">
                            <InputPath OutputPathName="LKP Existing ItemGuids.NoMatch"></InputPath>
                            <ExternalTableOutput Table="[dbo].[RSSFeedReader]"></ExternalTableOutput>
                        </OleDbDestination>
                    </Transformations>                   
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>

The Script can also be found here: RSS.Feed.Reader.biml

Will give you:

image

Till Next Time (also published on BIMLSCRIPT.com ==> http://bimlscript.com/Snippet/Details/59)

Friday, October 5, 2012

BIML BigData: Reading a twitter feed

Bigdata is HOT! I made a simple twitter feed reader using BIML. The search term is passed trough the variable SearchTerm. Be aware not to poll twitter every second, twitter will exclude your IP. For most search terms once every hour is enough! The package will only retrieve “new” tweets which are not jet in your table. There is a maximum of 100 tweets per run.

The script uses a ScriptComponentSource. Pay attention to the <IsSynchronous="false" ==> When there are no non-synchronus buffers, the compiler does not emit the CreateNewOutputRows virtual base method (to match the BIDS/SSDT behavior)

Kudos to Scott Curie of Varigence for helping me take this hurdle!

Script:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Annotations>
        <Annotation>
            File: Twitter.Reader.biml
            Demo on reading a twitter feed
            Enviroment:
                DB: MS-SQL2012
                BIML: 1.6 VS2010 BIDS Helper
            (c) John Minkjan biml101.blogspot.com
        </Annotation>
        <Annotation>
            Twitter Log Table Script:
           
            CREATE TABLE [dbo].[TwitterLog](
            [TweetNumber] [bigint] NULL,
            [TweetID] [nvarchar](128) NULL,
            [PublishedDateTime] [nvarchar](50) NULL,
            [UpdatedDateTime] [nvarchar](50) NULL,
            [TweetContent] [nvarchar](2048) NULL,
            [TweetTitle] [nvarchar](2048) NULL,
            [TweetURI] [nvarchar](2048) NULL,
            [TweeterName] [nvarchar](2048) NULL,
            [TweetLanguage] [nvarchar](128) NULL,
            [TweetSource] [nvarchar](128) NULL,
            [TweetResultType] [nvarchar](128) NULL,
            [TweetGeoInfo] [nvarchar](128) NULL,
            [TweeterImageLink] [nvarchar](128) NULL,
            [TweetLink] [nvarchar](128) NULL,
            [SearchTerm] [nvarchar](128) NULL
            )
        </Annotation>
    </Annotations>
    <Connections>
        <OleDbConnection
            Name="CnOleDBAdventureWorks2012"
            ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"/>
    </Connections>
    <ScriptProjects>
        <ScriptComponentProject Name ="SCS_Twitter_Feed">
            <AssemblyReferences>
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSPipelineWrap" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSRuntimeWrap" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.PipelineHost" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.TxScript" />
                <AssemblyReference AssemblyPath="System.dll" />
                <AssemblyReference AssemblyPath="System.AddIn.dll" />
                <AssemblyReference AssemblyPath="System.Data.dll" />
                <AssemblyReference AssemblyPath="System.Xml.dll" />
            </AssemblyReferences>
            <OutputBuffers>
                <OutputBuffer Name="Output0" IsSynchronous="false">
                    <Annotations>
                        <Annotation>
                            IsSynchronous="false" ==>
                            When there are no non-synchronus buffers,
                            the compiler does not emit the CreateNewOutputRows
                            virtual base method (to match the BIDS/SSDT behavior)
                        </Annotation>
                    </Annotations>
                    <Columns>
                        <Column Name="TweetNumber" DataType ="Int64"></Column>
                        <Column Name="TweetID" DataType="String" Length="128"></Column>
                        <Column Name="PublishedDateTime" DataType="String" Length="50"></Column>
                        <Column Name="UpdatedDateTime" DataType="String" Length="50"></Column>
                        <Column Name="TweetContent" DataType="String" Length="2048"></Column>
                        <Column Name="TweetTitle" DataType="String" Length="2048"></Column>
                        <Column Name="TweetURI" DataType="String" Length="2048"></Column>
                        <Column Name="TweeterName" DataType="String" Length="2048"></Column>
                        <Column Name="TweetLanguage" DataType="String" Length="128"></Column>
                        <Column Name="TweetSource" DataType="String" Length="128"></Column>
                        <Column Name="TweetResultType" DataType="String" Length="128"></Column>
                        <Column Name="TweetGeoInfo" DataType="String" Length="128"></Column>
                        <Column Name="TweeterImageLink" DataType="String" Length="128"></Column>
                        <Column Name="TweetLink" DataType="String" Length="128"></Column>
                    </Columns>
                </OutputBuffer>
            </OutputBuffers>
            <ReadOnlyVariables>
                <Variable Namespace="User" VariableName="LastTweetNumber" DataType="Int64"></Variable>
                <Variable Namespace="User" VariableName="SearchTerm" DataType="String"></Variable>
            </ReadOnlyVariables>
            <Files>
                <File Path="AssemblyInfo.cs">
                    using System.Reflection;
                    using System.Runtime.CompilerServices;
               
                    [assembly: AssemblyTitle("SCS_Twitter_Feed")]
                    [assembly: AssemblyDescription("")]
                    [assembly: AssemblyConfiguration("")]
                    [assembly: AssemblyCompany("")]
                    [assembly: AssemblyProduct("SCS_Twitter_Feed")]
                    [assembly: AssemblyCopyright("Copyright @  2012")]
                    [assembly: AssemblyTrademark("")]
                    [assembly: AssemblyCulture("")]                   

                    [assembly: AssemblyVersion("1.0.*")]

                </File>
                <File Path="main.cs">


                    using System;
                    using System.Data;
                    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
                    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
                    using System.Xml;
                    using System.Web;
                    using System.Net;

 

                    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
                    public class ScriptMain : UserComponent
                    {

                    public override void  CreateNewOutputRows()
                    {
                    System.Xml.XmlDocument xml_doc = new System.Xml.XmlDocument();
                    string TwitterUri = "http://search.twitter.com/search.atom?q=%23"
                                        + this.Variables.SearchTerm
                                        +  <![CDATA["&rpp=100&result_type=recent&since_id="]]>
                    + this.Variables.LastTweetNumber ;
                    xml_doc = GetResponse(TwitterUri);

                    XmlNodeList child_nodes = xml_doc.GetElementsByTagName("entry");
                    string[] temp;

                    foreach (XmlNode child in child_nodes)
                    {

                    Output0Buffer.AddRow();
                    Output0Buffer.TweetID = child.ChildNodes.Item(0).InnerText;
                    temp = child.ChildNodes.Item(0).InnerText.Split(':');
                    Output0Buffer.TweetNumber = Convert.ToInt64(temp[2]);
                    Output0Buffer.PublishedDateTime= child.ChildNodes.Item(1).InnerText;
                    Output0Buffer.TweetLink = child.ChildNodes.Item(2).Attributes["href"].Value.ToString();
                    Output0Buffer.TweetTitle = child.ChildNodes.Item(3).InnerText;
                    Output0Buffer.TweetContent =child.ChildNodes.Item(4).InnerText;
                    Output0Buffer.UpdatedDateTime = child.ChildNodes.Item(5).InnerText;
                    Output0Buffer.TweeterImageLink = child.ChildNodes.Item(6).Attributes["href"].Value.ToString();
                    Output0Buffer.TweetGeoInfo = child.ChildNodes.Item(7).InnerText;
                    Output0Buffer.TweetResultType = child.ChildNodes.Item(8).ChildNodes.Item(0).InnerText;
                    Output0Buffer.TweetSource = child.ChildNodes.Item(9).InnerText;
                    Output0Buffer.TweetLanguage= child.ChildNodes.Item(10).InnerText;
                    Output0Buffer.TweeterName = child.ChildNodes.Item(11).ChildNodes.Item(0).InnerText;
                    Output0Buffer.TweetURI = child.ChildNodes.Item(11).ChildNodes.Item(1).InnerText;
                    }


                    }

                    public XmlDocument GetResponse(string uri)
                    {

                    XmlDocument doc = new XmlDocument();

                    WebRequest myRequest = WebRequest.Create(new Uri(uri));
                    IWebProxy proxy = myRequest.Proxy;
                    if (proxy != null)
                    {
                    proxy.GetProxy(myRequest.RequestUri);
                    }
                   

                    doc.Load(myRequest.GetResponse().GetResponseStream());

                    return doc;


                    }
                    }

                </File>
            </Files>

        </ScriptComponentProject>
    </ScriptProjects>


    <Packages>
        <Package Name ="Twitter.Reader" ConstraintMode="Linear">
            <Variables>
                <Variable Name ="LastTweetNumber" DataType="Int64" >0</Variable>
                <Variable Name="SearchTerm" DataType="String">SSIS</Variable>
            </Variables>
            <Tasks>
                <ExecuteSQL Name ="EST Get Last TweetNumber" ConnectionName="CnOleDBAdventureWorks2012" ResultSet="SingleRow">
                    <DirectInput> SELECT isnull(max([TweetNumber]),0) FROM [dbo].[TwitterLog] where [SearchTerm] =?    </DirectInput>
                    <Parameters>
                        <Parameter Name="0" DataType="String" Length="128" Direction ="Input" VariableName ="User.SearchTerm"></Parameter>
                    </Parameters>
                    <Results>
                        <Result Name="0" VariableName="User.LastTweetNumber"></Result>
                    </Results>
                </ExecuteSQL>
                <Dataflow Name ="DFT Get Tweets">
                    <Transformations>
                        <ScriptComponentSource Name="SCS Twitter Feed">
                            <ScriptComponentProjectReference ScriptComponentProjectName="SCS_Twitter_Feed"></ScriptComponentProjectReference>
                        </ScriptComponentSource>
                        <DerivedColumns Name="DC SearchTerm">
                            <Columns>
                                <Column Name="SearchTerm" DataType ="String" Length ="128">@[User::SearchTerm]</Column>
                            </Columns>

                        </DerivedColumns>
                        <OleDbDestination Name="ODD dbo TwitterLog" ConnectionName="CnOleDBAdventureWorks2012">
                            <ExternalTableOutput Table="[dbo].[TwitterLog]"></ExternalTableOutput>
                        </OleDbDestination>
                    </Transformations>

                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Will get you:

imageimage

Till Next Time

Friday, September 28, 2012

BIML Select a DataFlow based on FileHeader

On a project the source file where delivered in one and same directory, but having a variety of headers:

imageimageimage

First I catch the the column string of the first row and lead it via a conditional split to a row count:

image

Based on the row count I run a specific dataflow task:

image

Here is a way to solve it using BIML:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Annotations>
        <Annotation>
            File: Dataflow.Select.biml
            Demo on selection a dataflow based on the file header
            Enviroment:
            DB: MS-SQL2012
            BIML: 1.6 VS2010 BIDS Helper
            (c) John Minkjan biml101.blogspot.com
        </Annotation>
    </Annotations>
    <FileFormats>
        <FlatFileFormat Name="FFF CheckFile" IsUnicode ="false" RowDelimiter ="CRLF" ColumnNamesInFirstDataRow ="false">
            <Columns>
                <Column Name="FileInfo" DataType="String" CodePage ="1252" Length="250" Delimiter="CRLF"></Column>
            </Columns>
        </FlatFileFormat>
        <FlatFileFormat Name="FFF Type1" IsUnicode ="false" RowDelimiter ="CRLF" ColumnNamesInFirstDataRow ="true" >
            <Columns>
                <Column Name="PersId" DataType="String" CodePage ="1252" Length="250" Delimiter=";"></Column>
                <Column Name="FirstName" DataType="String" CodePage ="1252" Length="250" Delimiter="CRLF"></Column>
            </Columns>
        </FlatFileFormat>
        <FlatFileFormat Name="FFF Type2" IsUnicode ="false" RowDelimiter ="CRLF" ColumnNamesInFirstDataRow ="true" >
            <Columns>
                <Column Name="PersId" DataType="String" CodePage ="1252" Length="250" Delimiter=";"></Column>
                <Column Name="LastName" DataType="String" CodePage ="1252" Length="250" Delimiter="CRLF"></Column>
            </Columns>
        </FlatFileFormat>
        <FlatFileFormat Name="FFF Type3" IsUnicode ="false" RowDelimiter ="CRLF" ColumnNamesInFirstDataRow ="true" >
            <Columns>
                <Column Name="PersId" DataType="String" CodePage ="1252" Length="250" Delimiter=";"></Column>
                <Column Name="FirstName" DataType="String" CodePage ="1252" Length="250" Delimiter=";"></Column>
                <Column Name="LastName" DataType="String" CodePage ="1252" Length="250" Delimiter="CRLF"></Column>
            </Columns>
        </FlatFileFormat>

    </FileFormats>
    <Connections>
        <FlatFileConnection Name ="FFC SourceFile" FileFormat ="FFF CheckFile" FilePath ="C:\\DEMO\\PERSON1.TXT">
            <Expressions>
                <Expression PropertyName="ConnectionString">@[User::ProcessFileName]</Expression>
            </Expressions>           
        </FlatFileConnection>
        <FlatFileConnection Name ="FFC Type1" FileFormat ="FFF Type1" FilePath ="C:\\DEMO\\PERSON1.TXT">
            <Expressions>
                <Expression PropertyName="ConnectionString">@[User::ProcessFileName]</Expression>
            </Expressions>
        </FlatFileConnection>
        <FlatFileConnection Name ="FFC Type2" FileFormat ="FFF Type2" FilePath ="C:\\DEMO\\PERSON2.TXT">
            <Expressions>
                <Expression PropertyName="ConnectionString">@[User::ProcessFileName]</Expression>
            </Expressions>
        </FlatFileConnection>
        <FlatFileConnection Name ="FFC Type3" FileFormat ="FFF Type3" FilePath ="C:\\DEMO\\PERSON3.TXT">
            <Expressions>
                <Expression PropertyName="ConnectionString">@[User::ProcessFileName]</Expression>
            </Expressions>
        </FlatFileConnection>
        <OleDbConnection   
            Name="CnOleDBAdventureWorks2012"
            ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"/>
    </Connections>
    <Packages>
        <Package Name="Dataflow.Select" ConstraintMode="Linear">
            <Variables>
                <Variable Name="SourceFileLocation" DataType ="String">C:\DEMO\</Variable>
                <Variable Name="FilePrefix"    DataType="String">Person</Variable>
                <Variable Name="FileSuffix" DataType="String">txt</Variable>
                <Variable Name="ProcessFileName" DataType ="String">C:\DEMO\Person3.txt</Variable>
               
            </Variables>
            <Tasks>
                <ForEachFileLoop Name="FLC File Select" ConstraintMode ="Parallel"
                                 Folder ="C:\DEMO\" FileSpecification="Person*.txt"  >
                    <VariableMappings>
                        <VariableMapping Name="Mapping" VariableName="User.ProcessFileName" />
                    </VariableMappings>
                    <Expressions>
                        <Expression PropertyName="Directory">@[User::SourceFileLocation]</Expression>
                        <Expression PropertyName="FileSpec">@[User::FilePrefix]+"*."+ @[User::FileSuffix]</Expression>
                    </Expressions>
                    <Variables>
                        <Variable Name="RowCountType1" DataType="Int64">0</Variable>
                        <Variable Name="RowCountType2" DataType="Int64">0</Variable>
                        <Variable Name="RowCountType3" DataType="Int64">0</Variable>
                    </Variables>
                    <Tasks>
                        <Dataflow Name="DFT Get File Header">
                            <Transformations>
                                <FlatFileSource Name="FFC SourceFile" ConnectionName ="FFC SourceFile">
                                    <Columns>
                                        <Column SourceColumn="FileInfo" TargetColumn ="FileInfo"></Column>
                                    </Columns>
                                </FlatFileSource>
                                <ConditionalSplit Name="CS FileType">
                                    <OutputPaths>
                                        <OutputPath Name ="Type1">
                                            <Expression>[FileInfo] =="PersId;FirstName"</Expression>
                                        </OutputPath>
                                        <OutputPath Name ="Type2">
                                            <Expression>[FileInfo] =="PersId;LastName"</Expression>
                                        </OutputPath>
                                        <OutputPath Name ="Type3">
                                            <Expression>[FileInfo] =="PersId;FirstName;LastName"</Expression>
                                        </OutputPath>
                                    </OutputPaths>                                   
                                </ConditionalSplit>
                                <RowCount Name ="RC Type1" VariableName="User.RowCountType1">
                                    <InputPath OutputPathName="CS FileType.Type1"></InputPath>
                                </RowCount>
                                <RowCount Name ="RC Type2" VariableName="User.RowCountType2">
                                    <InputPath OutputPathName="CS FileType.Type2"></InputPath>
                                </RowCount>
                                <RowCount Name ="RC Type3" VariableName="User.RowCountType3">
                                    <InputPath OutputPathName="CS FileType.Type3"></InputPath>
                                </RowCount>
                            </Transformations>
                        </Dataflow>
                        <Dataflow Name="DFT Type1">
                            <PrecedenceConstraints>
                                <Inputs>
                                    <Input EvaluationOperation ="ExpressionAndConstraint"
                                           EvaluationValue ="Success"
                                           Expression ="@RowCountType1!=0"
                                           OutputPathName ="DFT Get File Header.Output"></Input>
                                </Inputs>
                            </PrecedenceConstraints>
                            <Transformations>
                                <FlatFileSource Name="FFC Type1" ConnectionName ="FFC Type1"></FlatFileSource>
                                <OleDbDestination Name="ODD Persons" ConnectionName="CnOleDBAdventureWorks2012">
                                    <ExternalTableOutput Table ="Persons"></ExternalTableOutput>
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                        <Dataflow Name="DFT Type2">
                            <PrecedenceConstraints>
                                <Inputs>
                                    <Input EvaluationOperation ="ExpressionAndConstraint"
                                           EvaluationValue ="Success"
                                           Expression ="@RowCountType2!=0"
                                           OutputPathName ="DFT Get File Header.Output"></Input>
                                </Inputs>
                            </PrecedenceConstraints>
                            <Transformations>
                                <FlatFileSource Name="FFC Type2" ConnectionName ="FFC Type2"></FlatFileSource>
                                <OleDbDestination Name="ODD Persons" ConnectionName="CnOleDBAdventureWorks2012">
                                    <ExternalTableOutput Table ="Persons"></ExternalTableOutput>
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                        <Dataflow Name="DFT Type3">
                            <PrecedenceConstraints>
                                <Inputs>
                                    <Input EvaluationOperation ="ExpressionAndConstraint"
                                           EvaluationValue ="Success"
                                           Expression ="@RowCountType3!=0"
                                           OutputPathName ="DFT Get File Header.Output"></Input>
                                </Inputs>
                            </PrecedenceConstraints>
                            <Transformations>
                                <FlatFileSource Name="FFC Type3" ConnectionName ="FFC Type3"></FlatFileSource>
                                <OleDbDestination Name="ODD Persons" ConnectionName="CnOleDBAdventureWorks2012">
                                    <ExternalTableOutput Table ="Persons"></ExternalTableOutput>
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                    </Tasks>
                </ForEachFileLoop>
            </Tasks>
           
        </Package>
    </Packages>
</Biml>

This will get you:

image

Till Next Time

Tuesday, September 25, 2012

BIML Export to FlatFile

Demo BIML on creating an export to flat file package based on the SQL-Server metadata views:
<#@ template language="C#" hostspecific="true"#>
<#@ import namespace="System.Data"#>
<#@ import namespace="System.Data.SqlClient"#>
<#@ import namespace="System.IO"#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Annotations>
        <Annotation>
            File: Export.DataBase.Schema.biml
            Demo make a export to file package
            Enviroment:
            DB: MS-SQL2012
            BIML: 1.6 VS2010 BIDS Helper
            (c) John Minkjan biml101.blogspot.com
        </Annotation>
    </Annotations>
<#
    string CnSrcDB = "Server=.\\MSSQL2012;Initial Catalog=AdventureWorks2012;Integrated Security=SSPI;Provider=SQLNCLI11.1";
    string SrcTableCatalog = "AdventureWorks2012";
    string SrcTableSchema ="Sales";
    string TableCatalog="";
    string TableSchema ="";
    string TableName ="";
    string ColumnName ="";
    string DataType ="";
    string CharacterMaximumLength ="";
    string FlatFileFormatName ="";
    string MaxOrdinalPosition ="";
    string ColumnDelimeter =";";
   
    DataTable tFFFCol;
#>
<FileFormats>
    <# DataTable tFFF = ExternalDataAccess.GetDataTable(CnSrcDB,
    "SELECT t1.TABLE_CATALOG, t1.TABLE_SCHEMA, t1.TABLE_NAME,  max(t1.ORDINAL_POSITION) as MAX_ORDINAL_POSITION FROM  INFORMATION_SCHEMA.COLUMNS AS t1 INNER JOIN INFORMATION_SCHEMA.TABLES  AS t2 ON t1.TABLE_CATALOG = t2.TABLE_CATALOG  AND t1.TABLE_SCHEMA = t2.TABLE_SCHEMA AND t1.TABLE_NAME = t2.TABLE_NAME  WHERE        (t1.TABLE_CATALOG = '" +SrcTableCatalog + "') AND (t2.TABLE_TYPE = 'BASE TABLE')  and t2.TABLE_SCHEMA = '" + SrcTableSchema +"' group by t1.TABLE_CATALOG, t1.TABLE_SCHEMA, t1.TABLE_NAME");
    foreach (DataRow tFFFr in tFFF.Rows){
    TableCatalog =tFFFr[0].ToString();
    TableSchema = tFFFr[1].ToString();
    TableName = tFFFr[2].ToString();
    MaxOrdinalPosition = tFFFr[3].ToString();
    #>
    <FlatFileFormat FlatFileType="Delimited"     ColumnNamesInFirstDataRow="true"     RowDelimiter="CRLF"  Name="FFF <#=TableCatalog#> <#=TableSchema#> <#=TableName#>">
        <Columns>
            <# tFFFCol = ExternalDataAccess.GetDataTable(CnSrcDB,
            "SELECT t1.COLUMN_NAME, t1.DATA_TYPE, t1.CHARACTER_MAXIMUM_LENGTH, T1.ORDINAL_POSITION FROM INFORMATION_SCHEMA.COLUMNS AS t1 where t1.TABLE_CATALOG = '"+TableCatalog+"' and t1.TABLE_SCHEMA = '"+TableSchema+"' and t1.TABLE_NAME = '"+TableName+"'");
            foreach (DataRow tFFFRow in tFFFCol.Rows){
            ColumnName =tFFFRow[0].ToString();
            DataType = tFFFRow[1].ToString();
            CharacterMaximumLength = tFFFRow[2].ToString();   
                ColumnDelimeter = ";";
                if (MaxOrdinalPosition == tFFFRow[3].ToString()){ ColumnDelimeter = "CRLF"; } ;
                DataType = "String";                               
            #>
                <Column ColumnType="Delimited" DataType= "<#=DataType#>" Delimiter="<#=ColumnDelimeter#>" Length="4000" Name="<#=ColumnName#>"></Column>
            <# } #>
        </Columns>
    </FlatFileFormat>
    <#}  #>
</FileFormats>
    <Connections>
        <OleDbConnection
            Name="CnOleDBAdventureWorks2012"
            ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"
            RetainSameConnection="true">
        </OleDbConnection>
        <# DataTable tFF = ExternalDataAccess.GetDataTable(CnSrcDB,
        "SELECT t1.TABLE_CATALOG, t1.TABLE_SCHEMA, t1.TABLE_NAME FROM  INFORMATION_SCHEMA.COLUMNS AS t1 INNER JOIN INFORMATION_SCHEMA.TABLES  AS t2 ON t1.TABLE_CATALOG = t2.TABLE_CATALOG  AND t1.TABLE_SCHEMA = t2.TABLE_SCHEMA AND t1.TABLE_NAME = t2.TABLE_NAME  WHERE        (t1.TABLE_CATALOG = '"
        +SrcTableCatalog+"') AND (t2.TABLE_TYPE = 'BASE TABLE') and t2.TABLE_SCHEMA = '"
        +SrcTableSchema+"' group by t1.TABLE_CATALOG, t1.TABLE_SCHEMA, t1.TABLE_NAME");
        foreach (DataRow tFFFr in tFFF.Rows){
        TableCatalog =tFFFr[0].ToString();
        TableSchema = tFFFr[1].ToString();
        TableName = tFFFr[2].ToString(); #>
            <FlatFileConnection Name="FF <#=TableCatalog#> <#=TableSchema#> <#=TableName#>"
                FileFormat="FFF <#=TableCatalog#> <#=TableSchema#> <#=TableName#>"
                FilePath="C:\\<#=TableCatalog#>.<#=TableSchema#>.<#=TableName#>.dat"></FlatFileConnection>           
            <# }#>
    </Connections>
    <Packages>
        <Package Name="Extract.All" ConstraintMode="Linear" FailParentOnFailure="false">
            <Tasks>
                <#    
DataTable t1 = ExternalDataAccess.GetDataTable(CnSrcDB,
"SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES AS t2 WHERE (TABLE_CATALOG = '"
+SrcTableCatalog +"') AND (TABLE_TYPE = 'BASE TABLE') and t2.TABLE_SCHEMA = '"+SrcTableSchema+"' ");
foreach (DataRow row1 in t1.Rows){
    TableCatalog =row1[0].ToString();
    TableSchema = row1[1].ToString();
    TableName = row1[2].ToString(); #>
                <Dataflow  Name="DFT <#=TableCatalog#> <#=TableSchema#> <#=TableName#>">
                    <Transformations>
                        <OleDbSource
                            Name="ODS <#=TableCatalog#> <#=TableSchema#> <#=TableName#>"
                            ConnectionName="CnOleDBAdventureWorks2012">
                            <ExternalTableInput Table ="[<#=TableCatalog#>].[<#=TableSchema#>].[<#=TableName#>]">
                            </ExternalTableInput>
                        </OleDbSource>
                        <FlatFileDestination Name="DFF <#=TableCatalog#> <#=TableSchema#> <#=TableName#>"
                            ConnectionName="FF <#=TableCatalog#> <#=TableSchema#> <#=TableName#>"></FlatFileDestination>                       
                    </Transformations>
                </Dataflow>
                <# } #>
            </Tasks>
        </Package>
    </Packages>
</Biml>
Gives you:
image
Till Next Time