Snowflake SQL Query:
--------------------------------------
drop database if exists ramu;
use role accountadmin;
--Database Creation
create database if not exists ramu;
--Specify the active/current database for the session.
use ramu;
create or replace TABLE RAMU.PUBLIC.dummy_table (
a VARCHAR(16777216),
b int,
c int
);
INSERT INTO dummy_table VALUES ('foo' ,1, 5),
('bar' ,2, 6),
('baz' ,3, null);
INSERT INTO dummy_table VALUES ('foo' ,10, 5);
select count(*) from dummy_table where C is null;
select min(b) from dummy_table;
create or replace TABLE RAMU.PUBLIC.dq_check (
table_name VARCHAR(16777216),
description VARCHAR(16777216),
sql_query_1 VARCHAR(16777216),
sql_query_2 VARCHAR(16777216),
comparison_type VARCHAR(16777216)
);
INSERT INTO dq_check VALUES ('dummy_table','Check if all the values of c are not null','select count(*) from dummy_table where C is null','select 0','equals');
INSERT INTO dq_check VALUES ('dummy_table','Check if all the values of a are unique','with a_dups as (
select a,count(*) as count_value from dummy_table group by a having(count_value>1))
select count(*) from a_dups','select 0','equals');
INSERT INTO dq_check VALUES ('dummy_table','Check if the minimum value of column b is at least 2','select min(b) from dummy_table','select 2','greater_equals');
select * from dq_check where table_name='dummy_table';
Python Code:
-----------------------
#pip install snowflake-connector-python
#pip install "snowflake-connector-python[pandas]" -t .
#pip install pandas -t .
from snowflake.connector import connect
import pandas as pd
import os
def run_query(conn, query):
    # Run a statement for its side effects only; no result is returned.
    cursor = conn.cursor()
    cursor.execute(query)
    cursor.close()

def run_query1(conn, query):
    # Run a query and return the first column of the first row of its result.
    cursor = conn.cursor()
    cursor.execute(query)
    records = cursor.fetchone()[0]
    cursor.close()
    return records

def execute_test(
        db_conn,
        script_1,
        script_2,
        comp_operator):
    # Run both SQL scripts and compare their single-value results.
    print("1st SQL Query : ", script_1)
    result_1 = run_query1(db_conn, script_1)
    print("2nd SQL Query : ", script_2)
    result_2 = run_query1(db_conn, script_2)
    print("result 1 = " + str(result_1))
    print("result 2 = " + str(result_2))
    # compare values based on the comp_operator
    if comp_operator == "equals":
        return result_1 == result_2
    elif comp_operator == "greater_equals":
        return result_1 >= result_2
    elif comp_operator == "greater":
        return result_1 > result_2
    elif comp_operator == "less_equals":
        return result_1 <= result_2
    # any unsupported comparison type is treated as a failed check
    return False
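The script above ends inside execute_test; the driver that ties it back to the dq_check table is not reproduced here. As a rough sketch of how the pieces fit together, the following code connects to Snowflake, pulls the checks registered for dummy_table, and evaluates each one with execute_test. The environment-variable names and connection details are placeholders, not part of the original code. Note that with the sample rows inserted above, all three checks fail: c contains one null, 'foo' appears twice in a, and min(b) is 1 rather than at least 2.

# Driver sketch: read the checks registered in dq_check and run each one.
# The connection parameters below are placeholders (read from environment
# variables) and are not part of the original script.
if __name__ == "__main__":
    conn = connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        database="RAMU",
        schema="PUBLIC",
    )
    try:
        cursor = conn.cursor()
        cursor.execute(
            "select description, sql_query_1, sql_query_2, comparison_type "
            "from dq_check where table_name = 'dummy_table'"
        )
        checks = cursor.fetchall()
        cursor.close()
        for description, sql_1, sql_2, comparison_type in checks:
            print("Running check:", description)
            passed = execute_test(conn, sql_1, sql_2, comparison_type)
            print("PASSED" if passed else "FAILED")
    finally:
        conn.close()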
Very detailed content and clear explanation, thanks a lot for all the effort.
Glad you liked it Mangesh!
You are the most underrated YouTuber for data engineering. Thanks a lot for your time and effort on the videos.
Can you please make some videos on Spark Streaming? Thank you in advance.
Thank you so much for your kind words and support madhu bhardwaj! I truly appreciate your feedback 😊 Spark Streaming is definitely a topic I have in mind for future videos. I'll make sure to include it in my upcoming content roadmap. Stay tuned for more exciting videos on data engineering, including Spark Streaming. Thank you again for your encouragement and continued support! 🙏🎥
Thank u so much🎉🎉
You are welcome, aghulus! Happy learning!