Cloud Data Fusion: Data Integration at Google Cloud (Cloud Next '19)

  • Published on Jul 22, 2024
  • Data integration is a critical service for companies looking to run analytics in the cloud. Whether it's connecting on-premises data to the cloud, moving data between clouds, or transforming data that's already in your cloud of choice, data integration helps unlock the value in your data by readying it for analysis. In this session, we'll discuss data integration at Google Cloud, with a customer example from Telus.
    Cloud Data Fusion → bit.ly/2U1Cov6
    Watch more:
    Next '19 Data Analytics Sessions here → bit.ly/Next19DataAnalytics
    Next '19 All Sessions playlist → bit.ly/Next19AllSessions
    Subscribe to the GCP Channel → bit.ly/GCloudPlatform
    Speaker(s): Ryan Lippert, Nitin Motgi, Robert Medeiros
    Session ID: DA214
    product: Cloud - General; fullname: Ryan Lippert, Nitin Motgi, Robert Medeiros; event: Google Cloud Next 2019;
  • Science & Technology

Comments • 22

  • @californiaesnuestra 5 years ago +1

    Great presentation. A very important tool in our data/ML pipelines.

  • @Slaneshhhh 5 years ago +1

    Amazing, exactly what I was looking for!

  • @parvindersingh 5 years ago +6

    I have no words for Google. It always lives in the next generation. I'm happy to work on GCP products.

  • @dsinghr 4 years ago +3

    This is what everyone wants. Bang on. We have used Cloud Dataflow in the past and it was a nightmare: not the development itself, but the very long process a pipeline has to go through before it can be deployed to production, e.g. code review, testing, SIT, code quality checks, checks for unapproved libraries, etc. This looks like Informatica + Cognos + Control-M to me.

  • @gopinathGopiRebel 3 years ago

    Does Data Fusion have a pipeline resume capability, so that in case of manual errors we don't need to run the whole pipeline again?

  • @vighnesh.pvicky9286 4 years ago

    I am trying to use my own Python code in the Python transformer plugin, but I get a "no module: py4j" error, even though I installed py4j and set the PYTHONPATH and Python binary path in the NATIVE execution mode.
    Can anyone please help me with this Python transformer?

  • @hectar1156 3 years ago +1

    Can we execute stored procedures in BQ via Data Fusion? Thanks

  • @akellasrikanth2700 3 years ago +1

    Does it help to bring marketing analytics data from various sources into clouds like Azure or Google Cloud?

  • @chantiyarlagadda4463 4 years ago

    Hi, is there any way to trigger pipelines from outside events, such as from Cloud Functions or on file arrival in GCS? Please direct me to any documentation you have on that.

    • @narendrann1509 4 years ago

      Hi, did you get a solution? If yes, please let me know, because I'm also looking for that.

  • @HananeOuhammouch 5 years ago +1

    Great presentation, and great tool.
    I have two questions, please: what exactly do you mean by the type File in (Source, Sink), and is it also possible to send the result of the pipeline directly to an FTP server?
    Thank you

    • @sureshraina321 2 years ago

      Source is the source file, and sink is the target location.

  • @Vastedge 2 years ago

    Precisely Presented

  • @Filip-ci3ng 3 years ago +1

    How do I list files from a bucket, apply arbitrary logic, and load some of them? Can I write and run a Python script with gsutil capabilities in Data Fusion? Too many demonstrations of easy-to-do things that do not apply to real life :(

  • @kpbalaji 5 years ago +2

    While every programmer shouts at the top of their voice about why hand coding is better than drag-and-drop ETL, the big boys are creating tools for everyone to adopt.

  • @abcXyz-tr3uj 4 years ago

    We are trying to bring data from AWS RDS to BigQuery using a Data Fusion pipeline. Can you please tell us how to connect Data Fusion to RDS without making the RDS endpoint available to the public (0.0.0.0/0)? Presently we are getting a connection timeout error.

  • @dsinghr 4 years ago

    I have worked with a few major banks using GCP. Why are none of them using Data Fusion? They are all building their own data pipelines using Airflow and Dataflow.

    • @Filip-ci3ng 2 years ago

      Because Data Fusion is too basic.

  • @hectar1156 3 years ago

    GCP's response to Azure ADF and AWS Glue / Data Pipeline.

  • @rahul21stcentury 2 years ago

    Data Fusion is the worst tool that I have worked on, really pathetic.

    • @turkishiconic7629 1 year ago

      What are those shortcomings that you came across?