When starting with SSIS package, I can see everyone has tables loaded in their data base however I am struggling to find how to create those properly before start to filling those up with SSIS.
Hello Humair, Thank you for taking the time to watch the video and commenting to us. This presentation really was meant for focus on the design patterns uses within SSIS to manage dimension tables. It also is meant to showcase the features available within SSIS, the setup was done ahead of time and not provided for the webinar. What I would recommend is looking into our On-Demand Learning platform and checking out our Introduction to SSIS class which covers this concept in depth and has you build out the tables so you can run through the example. We as well cover more items in our Advanced SSIS class. Please let us know if you have any questions about our platform. This is the link for our On-Demand Learning courses free trial: www.pragmaticworkstraining.com/trial-registration/?source=odl_youtubeorg
Hello Mayank. Thanks for watching the video and taking the time to reach out to us. The quick answer to your question is No. They are not the same but it goes a little deeper. It is not uncommon to use the HASHBYTES function to create a column which can be used as a unique identifier but it has to be setup for that purpose. Remember we choose the columns that will be hashed using a certain algorithm. If the column chosen do not provide unique values for every row then it cannot be used as an Identifier column. So using the HASHBYTE function does not always output a value which can be used as a unique identifier but it CAN if you the user makes the right decisions on which columns to leverage. I hope this helps!
From the Video I am unable to see the complete script where the hash bytes ends. Can you share that script with me. Also, I am unable to understand how you are using the same colomn in the Dimensions Table for the Hash Bytes? From where they are taking those columns?
@@mayankchaubey5917Here is the script for the source: SELECT C.CustomerKey as CustomerAK ,CAST(D.LastName as varchar(100)) as LastName ,CAST(D.FirstName as varchar(100)) as FirstName ,CAST(CO.CompanyName as varchar(500)) as CompanyName ,CAST(D.Address1 as varchar(400)) as Address1 ,CAST(D.Address2 as varchar(400)) as Address2 ,CAST(D.City as varchar(400)) as City ,CAST(D.State as varchar(4)) as State ,CAST(D.PostalCode as varchar(20)) as PostalCode ,CAST(D.WorkPhone as varchar(40)) as WorkPhone ,CAST(D.HomePhone as varchar(40)) as HomePhone ,CAST(D.CellPhone as varchar(40)) as CellPhone ,CAST(D.WorkEmail as varchar(512)) as WorkEmail ,CAST(D.HomeEmail as varchar(512)) as HomeEmail ,CAST(HASHBYTES('SHA1',ISNULL(CAST(D.LastName as varchar(100)), 'UNKNOWN') +' | '+ISNULL(CAST(D.FirstName as varchar(100)), 'UNKNOWN') +' | '+ISNULL(CAST(CO.CompanyName as varchar(500)), 'UNKNOWN') +' | '+ISNULL(CAST(D.Address1 as varchar(400)), 'UNKNOWN') +' | '+ISNULL(CAST(D.Address2 as varchar(400)), 'UNKNOWN') +' | '+ISNULL(CAST(D.City as varchar(400)) , 'UNKNOWN') +' | '+ISNULL(CAST(D.State as varchar(4)), 'UNKNOWN') +' | '+ISNULL(CAST(D.PostalCode as varchar(20)), 'UNKNOWN') +' | '+ISNULL(CAST(D.WorkPhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(D.HomePhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(D.CellPhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(D.WorkEmail as varchar(512)), 'UNKNOWN') +' | '+ISNULL(CAST(D.HomeEmail as varchar(512)), 'UNKNOWN') )as bigint) as Hash_Source FROM [PWInsurance].[People].[Customer] C JOIN People.Detail D ON C.DetailKey = D.DetailKey JOIN People.Company CO ON C.CompanyKey = CO.CompanyKey And this is the script for the lookup query: SELECT CustomerSK as CustomerSK_Destination ,CustomerAK ,CAST(HASHBYTES('SHA1',ISNULL(CAST(LastName as varchar(100)), 'UNKNOWN') +' | '+ISNULL(CAST(FirstName as varchar(100)), 'UNKNOWN') +' | '+ISNULL(CAST(CompanyName as varchar(500)), 'UNKNOWN') +' | '+ISNULL(CAST(Address1 as varchar(400)), 'UNKNOWN') +' | '+ISNULL(CAST(Address2 as varchar(400)), 'UNKNOWN') +' | '+ISNULL(CAST(City as varchar(400)) , 'UNKNOWN') +' | '+ISNULL(CAST(State as varchar(4)), 'UNKNOWN') +' | '+ISNULL(CAST(PostalCode as varchar(20)), 'UNKNOWN') +' | '+ISNULL(CAST(WorkPhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(HomePhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(CellPhone as varchar(40)), 'UNKNOWN') +' | '+ISNULL(CAST(WorkEmail as varchar(512)), 'UNKNOWN') +' | '+ISNULL(CAST(HomeEmail as varchar(512)), 'UNKNOWN') )as bigint) as Hash_Destination FROM DimCustomer The key thing here as you can see in both queries is that the same columns are used in both queries. The order and amount of columns must be exact or you will get incorrect results and everything will show up as being changes from the source. Hope this helps
Hi, we are glad to hear you liked this video. I am not sure if you know, but we offer many On-Demand Learning courses, such as, Power BI, Azure, SQL Server, Data Science, Business Intelligence and much more. You should check out our FREE trial: www.pragmaticworkstraining.com/trial-registration/?source=odl_youtubeorg - No credit card required and you will get instant access of all our courses.
how I can get the data used in this tuto plaese ?
Thank you for this definitive tutorial, but what is AK abbreviation stands for?
Can you upload the source and destination DB with their corresponding tables ?
When starting with SSIS package, I can see everyone has tables loaded in their data base however I am struggling to find how to create those properly before start to filling those up with SSIS.
Hello Humair,
Thank you for taking the time to watch the video and commenting to us. This presentation really was meant for focus on the design patterns uses within SSIS to manage dimension tables. It also is meant to showcase the features available within SSIS, the setup was done ahead of time and not provided for the webinar. What I would recommend is looking into our On-Demand Learning platform and checking out our Introduction to SSIS class which covers this concept in depth and has you build out the tables so you can run through the example. We as well cover more items in our Advanced SSIS class. Please let us know if you have any questions about our platform. This is the link for our On-Demand Learning courses free trial: www.pragmaticworkstraining.com/trial-registration/?source=odl_youtubeorg
Does Hash bytes and Unique Identifiers are same?
Hello Mayank. Thanks for watching the video and taking the time to reach out to us. The quick answer to your question is No. They are not the same but it goes a little deeper. It is not uncommon to use the HASHBYTES function to create a column which can be used as a unique identifier but it has to be setup for that purpose. Remember we choose the columns that will be hashed using a certain algorithm. If the column chosen do not provide unique values for every row then it cannot be used as an Identifier column. So using the HASHBYTE function does not always output a value which can be used as a unique identifier but it CAN if you the user makes the right decisions on which columns to leverage. I hope this helps!
From the Video I am unable to see the complete script where the hash bytes ends. Can you share that script with me.
Also, I am unable to understand how you are using the same colomn in the Dimensions Table for the Hash Bytes?
From where they are taking those columns?
@@mayankchaubey5917Here is the script for the source:
SELECT C.CustomerKey as CustomerAK
,CAST(D.LastName as varchar(100)) as LastName
,CAST(D.FirstName as varchar(100)) as FirstName
,CAST(CO.CompanyName as varchar(500)) as CompanyName
,CAST(D.Address1 as varchar(400)) as Address1
,CAST(D.Address2 as varchar(400)) as Address2
,CAST(D.City as varchar(400)) as City
,CAST(D.State as varchar(4)) as State
,CAST(D.PostalCode as varchar(20)) as PostalCode
,CAST(D.WorkPhone as varchar(40)) as WorkPhone
,CAST(D.HomePhone as varchar(40)) as HomePhone
,CAST(D.CellPhone as varchar(40)) as CellPhone
,CAST(D.WorkEmail as varchar(512)) as WorkEmail
,CAST(D.HomeEmail as varchar(512)) as HomeEmail
,CAST(HASHBYTES('SHA1',ISNULL(CAST(D.LastName as varchar(100)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.FirstName as varchar(100)), 'UNKNOWN')
+' | '+ISNULL(CAST(CO.CompanyName as varchar(500)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.Address1 as varchar(400)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.Address2 as varchar(400)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.City as varchar(400)) , 'UNKNOWN')
+' | '+ISNULL(CAST(D.State as varchar(4)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.PostalCode as varchar(20)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.WorkPhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.HomePhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.CellPhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.WorkEmail as varchar(512)), 'UNKNOWN')
+' | '+ISNULL(CAST(D.HomeEmail as varchar(512)), 'UNKNOWN')
)as bigint) as Hash_Source
FROM [PWInsurance].[People].[Customer] C
JOIN People.Detail D
ON C.DetailKey = D.DetailKey
JOIN People.Company CO
ON C.CompanyKey = CO.CompanyKey
And this is the script for the lookup query:
SELECT
CustomerSK as CustomerSK_Destination
,CustomerAK
,CAST(HASHBYTES('SHA1',ISNULL(CAST(LastName as varchar(100)), 'UNKNOWN')
+' | '+ISNULL(CAST(FirstName as varchar(100)), 'UNKNOWN')
+' | '+ISNULL(CAST(CompanyName as varchar(500)), 'UNKNOWN')
+' | '+ISNULL(CAST(Address1 as varchar(400)), 'UNKNOWN')
+' | '+ISNULL(CAST(Address2 as varchar(400)), 'UNKNOWN')
+' | '+ISNULL(CAST(City as varchar(400)) , 'UNKNOWN')
+' | '+ISNULL(CAST(State as varchar(4)), 'UNKNOWN')
+' | '+ISNULL(CAST(PostalCode as varchar(20)), 'UNKNOWN')
+' | '+ISNULL(CAST(WorkPhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(HomePhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(CellPhone as varchar(40)), 'UNKNOWN')
+' | '+ISNULL(CAST(WorkEmail as varchar(512)), 'UNKNOWN')
+' | '+ISNULL(CAST(HomeEmail as varchar(512)), 'UNKNOWN')
)as bigint) as Hash_Destination
FROM DimCustomer
The key thing here as you can see in both queries is that the same columns are used in both queries. The order and amount of columns must be exact or you will get incorrect results and everything will show up as being changes from the source. Hope this helps
Bravo presentation.
Hi, we are glad to hear you liked this video. I am not sure if you know, but we offer many On-Demand Learning courses, such as, Power BI, Azure, SQL Server, Data Science, Business Intelligence and much more. You should check out our FREE trial: www.pragmaticworkstraining.com/trial-registration/?source=odl_youtubeorg - No credit card required and you will get instant access of all our courses.