Comments •

  • @rajasaroj2297
    @rajasaroj2297 2 years ago +3

    Thanks so much, it was really a simple and excellent explanation of those configs.

  • @RP-sx6kf
    @RP-sx6kf 8 months ago

    Thank you very much for the detailed explanation; it gave a very good understanding of how these properties help in running a Spark job. Really appreciate your help in educating the tech community 👏👏

  • @Jay-vh3gx
    @Jay-vh3gx 19 days ago

    Thank you for the crystal clear explanation 🎉

  • @praketasaxena
    @praketasaxena 1 year ago

    Thank You Sir ! Namaskaaram !

  • @akberj4544
    @akberj4544 3 years ago +1

    Very useful video, Anna. Thanks much! Anna, requesting you to please make one video on a real-time project as done in industry. Similarly, as a continuation, make another video on what sort of questions we get on that same real-time project in real interviews. Please, please, Anna, please make a video on this. Thanks in advance.

  • @cognologix-data-ai-engg
    @cognologix-data-ai-engg 1 year ago

    Superb explanation, bro, thanks a lot

  • @CrashLaker
    @CrashLaker 2 years ago

    Hi! Great content! I'm wondering how YARN container vCPU/memory sizing works with executors.
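
    For reference, a minimal PySpark sketch of how the pieces relate (names and sizes here are illustrative assumptions, not from the video): each executor runs inside one YARN container, whose memory request is executor memory plus overhead, and whose vcore request is the executor cores.

      from pyspark.sql import SparkSession

      # Illustrative sizing only. The container YARN allocates per executor is
      # spark.executor.memory + spark.executor.memoryOverhead, and it must fit
      # under yarn.scheduler.maximum-allocation-mb; likewise spark.executor.cores
      # must fit under yarn.scheduler.maximum-allocation-vcores.
      spark = (
          SparkSession.builder
          .appName("yarn-container-sizing-sketch")        # hypothetical name
          .master("yarn")
          .config("spark.executor.cores", "5")            # vcores per container
          .config("spark.executor.memory", "10g")         # executor heap
          .config("spark.executor.memoryOverhead", "1g")  # off-heap cushion
          .getOrCreate()
      )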

  • @souhailaakrikez1740
    @souhailaakrikez1740 9 months ago

    Do executors themselves run in parallel in Spark, or is it just the tasks within them?

  • @mohans3143
    @mohans3143 2 years ago +1

    I have a 250 GB file to process and I used dynamic allocation. When I try to run the job, it gives the error "job aborted due to stage failure". How do I fix this issue?
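
    The real fix depends on the executor error behind the stage failure (visible in the Spark UI / executor logs). As a hedged starting point only, one might pair dynamic allocation with more shuffle partitions; all bounds below are assumed values:

      from pyspark.sql import SparkSession

      # Starting-point settings, not a guaranteed fix for a stage failure.
      spark = (
          SparkSession.builder
          .appName("dynamic-allocation-sketch")                  # hypothetical
          .config("spark.dynamicAllocation.enabled", "true")
          .config("spark.dynamicAllocation.minExecutors", "2")   # assumed bounds
          .config("spark.dynamicAllocation.maxExecutors", "50")
          .config("spark.shuffle.service.enabled", "true")       # required on YARN
          .config("spark.sql.shuffle.partitions", "2000")        # ~250 GB / 128 MB
          .getOrCreate()
      )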

  • @prabas5646
    @prabas5646 4 months ago

    Clearly explained

  • @praptijoshi9102
    @praptijoshi9102 6 months ago

    Just Amazing

  • @swapnilpatil6986
    @swapnilpatil6986 6 months ago

    Can we say that cores are the actual available threads in Spark, since a core can run multiple tasks? So it's not always one core for one task; a core can multitask. Can you confirm this?
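
    For reference, a runnable sketch of how Spark schedules this (names and sizes are illustrative): each task is given spark.task.cpus cores (default 1), so a core works on one task at a time and picks up the next when it finishes, rather than multitasking within Spark's scheduler.

      from pyspark.sql import SparkSession

      # Concurrency per executor = executor cores / spark.task.cpus.
      spark = (
          SparkSession.builder
          .appName("task-slots-sketch")        # hypothetical name
          .master("local[4]")                  # 4 local cores => 4 task slots
          .config("spark.task.cpus", "1")      # default: 1 core per task
          .getOrCreate()
      )
      # 8 partitions on 4 slots run in 2 "waves"; each core takes tasks serially.
      print(spark.sparkContext.parallelize(range(8), 8).map(lambda x: x * x).sum())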

  • @nagamanickam6604
    @nagamanickam6604 6 months ago

    Thank you

  • @ranitdey5829
    @ranitdey5829 2 years ago

    This is great. Thanks!

  • @w0nderw0manjy
    @w0nderw0manjy 1 year ago

    Good explanation 👌

  • @Amarjeet-fb3lk
    @Amarjeet-fb3lk 4 months ago

    If the number of cores is 5 per executor, then at shuffle time Spark by default creates 200 partitions. How will those 200 partitions be created if the number of cores is smaller, given that 1 partition is processed on 1 core? Suppose my config is 2 executors, each with 5 cores. How will Spark create 200 partitions if I do a groupBy operation? There are 10 cores, and 200 partitions would need cores to sit on, right? How is that possible?
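
    A worked sketch of the point in question (the numbers are the comment's, the code is illustrative): the 200 shuffle partitions do not all need a core at once; they become 200 tasks that the 10 task slots process in roughly 200 / 10 = 20 waves. The count itself is just a config:

      from pyspark.sql import SparkSession, functions as F

      spark = (
          SparkSession.builder
          .appName("shuffle-waves-sketch")                 # hypothetical name
          .master("local[10]")                             # stand-in for 2 x 5 cores
          .config("spark.sql.shuffle.partitions", "200")   # post-shuffle partitions
          .config("spark.sql.adaptive.enabled", "false")   # keep the demo at 200
          .getOrCreate()
      )

      df = spark.range(1_000_000).withColumn("key", F.col("id") % 1000)
      grouped = df.groupBy("key").count()   # the groupBy shuffles into 200 partitions
      # 200 tasks on 10 slots run in ~20 waves; a core is reused for task after
      # task, so partitions never all occupy a core at the same time.
      print(grouped.rdd.getNumPartitions())  # 200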

  • @RohitSaini52945
    @RohitSaini52945 2 years ago

    You have great teaching skills. Kudos!

  • @thenaughtyanil
    @thenaughtyanil 1 year ago

    Can we use SparkSession on a worker node? I'm facing an issue accessing the Spark session on worker nodes. Please help.
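
    For what it's worth, the session object lives on the driver only; code shipped to executors (inside map, foreach, UDFs) can't call it. A small sketch of the working pattern versus the failing one (names are illustrative):

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.appName("driver-only-session").getOrCreate()

      rdd = spark.sparkContext.parallelize(range(10))

      # Fine: the lambda is shipped to executors and never touches `spark`.
      print(rdd.map(lambda x: x * 2).collect())

      # Anti-pattern (fails): calling the session from inside a task, e.g.
      #   rdd.map(lambda x: spark.range(x).count())
      # The session exists only on the driver, so collect results back instead.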

  • @mahak8914
    @mahak8914 1 year ago

    Hi, is it possible to create multiple executors on my personal laptop, which has 6 cores and 16 GB RAM?
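
    A hedged note: with the usual local[*] master everything runs in one JVM, effectively a single executor. Spark's local-cluster master, mainly used for Spark's own tests, can simulate several executors on one machine; the sizes below are assumptions for a 6-core / 16 GB laptop.

      from pyspark.sql import SparkSession

      # `local-cluster[2,3,4096]` asks for 2 executors with 3 cores and 4096 MB
      # each (6 cores, 8 GB total). A small standalone cluster
      # (start-master.sh / start-worker.sh) is the more common route to real
      # multiple executors on one machine.
      spark = (
          SparkSession.builder
          .appName("laptop-executors-sketch")   # hypothetical name
          .master("local-cluster[2,3,4096]")
          .getOrCreate()
      )
      print(spark.sparkContext.defaultParallelism)  # 6 task slots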

  • @ultimo8458
    @ultimo8458 11 months ago

    I have applied 4x memory per core for a 5 GB file but no luck; can you please help me figure out how to resolve this issue?
    Road map:
    1) Find the number of partitions --> 5 GB (5120 MB) / 128 MB = 40
    2) Find the CPU cores for maximum parallelism --> 40 cores, one per partition
    3) Find the maximum allowed CPU cores per executor --> 5 cores per executor on YARN
    4) Number of executors = total cores / executor cores --> 40 / 5 = 8 executors
    How much memory is required?
    Road map:
    1) Find the partition size --> by default 128 MB
    2) Assign a minimum of 4x memory to each core --> how do I apply this???????
    3) Multiply it by executor cores to get the executor memory --> ????
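
    Filling in the last two steps with the comment's own numbers (a sketch using the 4x rule of thumb, not an official formula):

      # Worked version of the road map above (illustrative numbers only).
      partition_mb = 128                       # step 1: default partition size
      per_core_mb = 4 * partition_mb           # step 2: 4x per core = 512 MB
      executor_cores = 5
      executor_memory_mb = per_core_mb * executor_cores  # step 3: 2560 MB ~ 2.5 GB

      file_mb = 5 * 1024
      partitions = file_mb // partition_mb         # 40 partitions
      executors = partitions // executor_cores     # 8 executors

      print(executors, executor_memory_mb)
      # e.g. spark-submit --num-executors 8 --executor-cores 5 \
      #        --executor-memory 2560m ...  (plus memory overhead)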

  • @aarthiravindrakumar6153
    @aarthiravindrakumar6153 2 years ago

    Love your content 😊

  • @prabhudevbm2921
    @prabhudevbm2921 9 months ago

    thanks

  • @subhadeepkoley2592
    @subhadeepkoley2592 3 years ago

    Nice explanation 😊✌️

  • @shajeep9170
    @shajeep9170 2 years ago

    nice explanation

  • @AkshayKangude-nw1xh
    @AkshayKangude-nw1xh 1 year ago

    What configuration will be required for 250 GB of data?
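
    Back-of-envelope only, using the sizing road map from the other comments (128 MB partitions, 5 cores per executor, 4x memory per core; all assumptions, not a prescription):

      # Rough sizing for a 250 GB input.
      file_mb = 250 * 1024
      partition_mb = 128
      partitions = file_mb // partition_mb          # 2000 partitions
      executor_cores = 5
      max_parallel = partitions // executor_cores   # 400 executors for one wave,
                                                    # rarely practical; fewer
                                                    # executors just run the 2000
                                                    # tasks in more waves
      executor_memory_mb = 4 * partition_mb * executor_cores  # 2560 MB + overhead
      print(partitions, max_parallel, executor_memory_mb)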

  • @sahilll7694
    @sahilll7694 2 years ago +1

    How does Spark get its metadata?

  • @ardavanmoinzadeh801
    @ardavanmoinzadeh801 2 years ago

    Can you explain why Spark spills to disk and what causes this? I understand that in a wide transformation or a groupByKey statement where the data is too big to fit in memory, Spark has no choice but to spill it to disk; my question is whether we can minimize this with performance tuning like bucketing, map-side joins, etc.

    • @sandip9901
      @sandip9901 1 year ago

      We can increase the number of shuffle partitions, and we can also adopt the salting technique to increase the number of unique keys and raise cardinality to avoid skew.
      If none of that works, we can increase executor cores or memory.
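
      A minimal PySpark sketch of the salting idea mentioned above (column names and bucket count are made up):

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("salting-sketch").getOrCreate()

        df = spark.range(1_000_000).withColumn("key", F.lit("hot"))  # one skewed key

        SALT_BUCKETS = 16
        salted = df.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))

        # Aggregate on (key, salt) first so the hot key spreads across ~16 tasks,
        # then roll the partial counts back up to the original key.
        partial = salted.groupBy("key", "salt").count()
        final = partial.groupBy("key").agg(F.sum("count").alias("count"))
        final.show()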

    • @brightinstrument1621
      @brightinstrument1621 1 year ago

      We can increase the partition size.

  • @ThePrasanna77
    @ThePrasanna77 3 years ago

    Can you please explain this video in Tamil? It would be very helpful for me. Thank you.

  • @NAJEEB111
    @NAJEEB111 3 years ago +2

    Bro, please, we want projects on big data.

  • @bhargavhr8834
    @bhargavhr8834 3 years ago

    Gold