Question: I have a JSON file, but the external table is not working for JSON. How can I find a solution for this?
Is it possible to create primary and foreign keys for the dim-and-fact relationship concept in Synapse?
If we have two different locations, like I need to get 2 columns from one table and then 2 columns from another table, how can I create one single external table for this scenario?
I really enjoy your videos. They are clear and helpful.
Thank you ☺️
Can we create an INDEX on a column (type: nvarchar or int) of an external table? If yes, please suggest how.
How to create external tables in a dedicated SQL pool from particular columns of tables already in an on-prem SQL Server?
How do I fetch data from a SQL pool using Spring Boot?
Can you please explain in simple terms what the use of the master key encryption is?
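For context, a hedged sketch of where the master key fits: the database master key encrypts secrets stored in the database, such as the secret inside a database scoped credential, so it must exist before you create a credential. The names and secret below are placeholders, not from the video:

```sql
-- The master key protects secrets stored in the database
-- (e.g. credential secrets). Create it once per database.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongPassword123!>';  -- placeholder

-- The SECRET below is encrypted using the master key above
CREATE DATABASE SCOPED CREDENTIAL MyStorageCred                     -- placeholder name
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET   = '<your-SAS-token-here>';                            -- placeholder
```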
I'm trying to create the external tables dynamically from a notebook. The schema is built from source parquet files with some data type mapping. When executing the statement using spark.sql(), I get a syntax error where it complains about the WITH keyword.
spark.sql() uses Spark SQL syntax, which is different. To create external tables you need to run the statement against the serverless SQL pool and follow T-SQL syntax.
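A rough sketch of why the WITH clause fails, assuming the statement was T-SQL `CREATE EXTERNAL TABLE` syntax: Spark SQL has no `WITH (LOCATION ...)` clause, so the same intent is expressed differently in each engine. All names and paths below are placeholders:

```sql
-- T-SQL (serverless SQL pool): WITH (...) is valid here, not in spark.sql()
CREATE EXTERNAL TABLE dbo.Trips (
    TripId     int,
    FareAmount numeric(19,4)
)
WITH (
    LOCATION    = '/trips/',          -- placeholder path
    DATA_SOURCE = MyDataSource,       -- placeholder
    FILE_FORMAT = MyParquetFormat     -- placeholder
);

-- Spark SQL (what spark.sql() accepts): no WITH clause;
-- the format and location go into USING / LOCATION instead
CREATE TABLE trips
USING parquet
LOCATION 'abfss://container@account.dfs.core.windows.net/trips/';  -- placeholder
```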
Bro, I have another scenario. In GCP BigQuery, if we have some CSV or Avro stored in a bucket (GCP Cloud Storage) and we want to create tables in BQ from that Avro/CSV, we can create them directly (process: BQ -> create table -> select location: GCS -> select file format, or it is detected automatically -> enter a table name, and with no extra information just click Create Table; the table is created in the selected dataset). So do we have the same option in Azure, i.e. how can we create tables from files in Azure?
Hi bro, why do I create an external table in a dedicated SQL pool to get the data from an external source? We can use OPENROWSET to get it from the external source. I hope we use a dedicated SQL pool in dev environments, right?
****** PLEASE RESPOND *******
External tables are defined once and can be used many times, whereas OPENROWSET is mainly for ad-hoc / one-time file querying.
Performance is better with external tables.
External tables simplify the queries; we only need to treat the external file as a table (though virtual). The schema is pre-defined, so there is more structure.
Permissions can be managed at table level rather than at the storage account.
Hope this clears it up.
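As a hedged illustration of the two approaches (the account URL, data source and format names are placeholders, not from the video):

```sql
-- Ad-hoc: OPENROWSET reads the files directly; nothing to define or maintain
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/container/trips/',  -- placeholder
    FORMAT = 'PARQUET'
) AS rows;

-- Reusable: an external table wraps the same files once,
-- then everyone queries it like a normal table
CREATE EXTERNAL TABLE dbo.Trips (
    TripId     int,
    FareAmount numeric(19,4)
)
WITH (
    LOCATION    = '/trips/',          -- placeholder
    DATA_SOURCE = MyDataSource,       -- placeholder
    FILE_FORMAT = MyParquetFormat     -- placeholder
);

SELECT TOP 10 * FROM dbo.Trips;
```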
And what if you have more than one parquet file in the same folder, like *.parquet?
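A hedged sketch, assuming the files share a schema: LOCATION can point at a folder (or, in serverless pools, a wildcard pattern), and the external table reads all matching parquet files together. Names are placeholders:

```sql
-- All parquet files under /trips/ are read as one table
CREATE EXTERNAL TABLE dbo.AllTrips (
    TripId     int,
    FareAmount numeric(19,4)
)
WITH (
    LOCATION    = '/trips/',            -- folder: picks up every file inside
    -- LOCATION = '/trips/*.parquet',   -- wildcard form, supported in serverless
    DATA_SOURCE = MyDataSource,         -- placeholder
    FILE_FORMAT = MyParquetFormat       -- placeholder
);
```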
Bro, please paste the code (as syntax) you use to create the external tables, data source, file format and scoped credentials, so we can just copy it and replace it with our own keys and names.
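Not the exact code from the video, but a generic sketch of the full sequence; every name, URL and secret below is a placeholder to replace with your own:

```sql
-- 1. Master key (once per database); protects the credential's secret
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongPassword123!>';

-- 2. Scoped credential holding a SAS token for the storage account
CREATE DATABASE SCOPED CREDENTIAL MyCred
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET   = '<your-SAS-token>';

-- 3. Data source pointing at the container
--    (dedicated SQL pools may additionally need TYPE = HADOOP)
CREATE EXTERNAL DATA SOURCE MyDataSource
WITH (
    LOCATION   = 'https://myaccount.dfs.core.windows.net/mycontainer',
    CREDENTIAL = MyCred
);

-- 4. File format describing the files
CREATE EXTERNAL FILE FORMAT MyParquetFormat
WITH (FORMAT_TYPE = PARQUET);

-- 5. External table tying it all together
CREATE EXTERNAL TABLE dbo.MyTable (
    Id   int,
    Name varchar(100)
)
WITH (
    LOCATION    = '/data/',
    DATA_SOURCE = MyDataSource,
    FILE_FORMAT = MyParquetFormat
);
```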
Hi WafaStudies, I want to do the passing of data from one dataflow to another using temporary tables in Azure Synapse, but the temporary table (i.e. #TemporaryTable) is not visible. I use a Synapse pipeline with a SQL pool (not serverless), and I use data flows. Are there no other alternatives with temporary tables? I want to use temporary tables to split dataflows. Thank you.
Temporary tables live only in that session; the moment you jump to another dataflow the session will change, hence you can't access it.
Maybe you can go with an external table, or a regular table if that works.
@@WafaStudies Thank you very much for your response. I am currently using Mapping data flow for my transformations. In this case, are there no other alternatives?
@@dataengineerazure2983 Use a script that writes a table to your storage, then pick that up with the next part of your transformation and then run a script that drops the intermediate table. Essentially simulating a temp table
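The workaround above could be sketched roughly like this (table and path names are placeholders, not from the thread):

```sql
-- Step 1 (after the first dataflow): persist the intermediate result.
-- CETAS writes the rows out to storage as a queryable external table.
CREATE EXTERNAL TABLE dbo.StageTrips
WITH (
    LOCATION    = '/staging/trips/',    -- placeholder
    DATA_SOURCE = MyDataSource,         -- placeholder
    FILE_FORMAT = MyParquetFormat       -- placeholder
)
AS SELECT * FROM dbo.TransformedTrips;  -- placeholder source

-- Step 2: the next dataflow reads dbo.StageTrips as its source.

-- Step 3 (cleanup script): drop the simulated temp table.
-- Note: the underlying files in /staging/trips/ may need separate cleanup.
DROP EXTERNAL TABLE dbo.StageTrips;
```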
How many more sessions do we have in the Azure Synapse series?
Can't say at this moment. I will keep doing them in sequence; let's see how many it may take.
Hi, why do we create an external table in a dedicated SQL pool to get the data from an external source? We can also use OPENROWSET to get it from the external source. Can you answer this?
@WafaStudies
External tables are defined once and can be used many times, whereas OPENROWSET is mainly for ad-hoc / one-time file querying.
Performance is better with external tables.
External tables simplify the queries; we only need to treat the external file as a table (though virtual). The schema is pre-defined, so there is more structure.
Permissions can be managed at table level rather than at the storage account.
Hope this clears it up.
I am trying to create a serverless pool database based on some of the database templates provided by Microsoft (the ones we get when we click the Gallery in the workspace). I am able to create the database, but when trying to query any table it throws an error message: "External table xxxxxxx is not accessible because content of directory cannot be listed". I have tried granting the Blob Contributor role to my user and the workspace on the storage account. All networks are allowed in the storage firewall. Even the container's ACL has permissions for my principal. Yet the error never seems to go away, and I am unable to figure out what else could be done. Any ideas/advice is appreciated.
Hi, were you able to resolve this issue? I am experiencing this now. Thanks!
Please how can I get the code you used? can you please share?
Hi Maheer, I didn't get what the actual difference is between external tables and regular tables w.r.t. dedicated SQL pools. I have created another ADLS account which is not linked to my Synapse account and tried to query the blob, and I am able to access the data. If we can access an ADLS account which is not connected to Synapse directly, what is the significance of these external tables?
Where do you query your blob from? What tool do you use?
Can we update data in an external table or a CETAS table?
Great explanation
Thanks ☺️
@@WafaStudies Hello, I would like to convert the data in my Azure table to an Excel file, but when selecting the file format I do not see Excel, only Avro, Parquet, CSV, etc. Do you have an idea? Thank you.
Bro nice video thanks for that..
Can i get the PPT ..?
please paste the code
Hi WafaStudies, I want to use Synapse for a project: ingest flat files with a Copy Data activity, dump them in a data lake, and then create external tables in Synapse serverless. Then, with a dataflow, do the transformations and take the data to a SQL Server. I wonder if this architecture is well planned?
What the fuck lol. Why wouldn't you just ingest the files in SQL Server in the first place.
I am getting a lot of ads in between in your videos, I am getting very much distracted.
too many ads, can't properly watch this video
What’s the major difference between video 16 and video 18
I love your classes, I am watching all of them, but the only thing people are missing are the syntaxes you use. It would have been very convenient if you had uploaded all of the syntaxes. I personally copied each part of the following external table columns, plus the data types and all, which was very time consuming. Thank you for uploading all of the videos though. CREATE EXTERNAL TABLE dbo.NYCTaxi
(
[DateID] int,
[MedallionID] int,
[HackneyLicenseID] int,
[PickupTimeID] int,
[dropoffTimeid] int,
[PickGeograpthyID] int,
[DropoffGeograpthyID] int,
[PickupLatitude] float,
[PickupLongitude] float,
[PickupLatLong] varchar(8000),
[DropoffLatitude] float,
[DropoffLongLatitude] float,
[DropoffLatLong] VARCHAR(8000),
[Passengercount] int,
[TripDurationSeconds] int,
[TripDistanceMiles] float,
[PaymentType] varchar(8000),
[FareAmount] numeric(19,4),
[SurchargeAmount] numeric(19,4),
[TaxAmount] numeric(19,4),
[TipAmount] numeric(19,4),
[TollsAmount] numeric(19,4),
[TotalAmount] numeric(19,4)
)
with
(
LOCATION = 'putyourpath',
DATA_SOURCE = yourdatasource,
FILE_FORMAT = yourfileformat
)
Thank you for pasting the column syntax here, very helpful😊