Parallelism & Partitioning Techniques : Video 7 (HD)

  • Published on 21 Sep 2024

Comments • 46

  • @ethanguan8727
    @ethanguan8727 10 years ago +1

    These videos are great. I'm lucky to have found them here. They make my day meaningful. Thank you.

  • @harpalegaurav
    @harpalegaurav 5 years ago +1

    Ruchika really awesome tutorial... very good..

  • @user-th3cj3zq6v
    @user-th3cj3zq6v 4 years ago

    Hi Ruchika, I am Ana Lilia Armas Martínez, and I want to share the following with all of you. The Modulus partition attracted my attention because it comes from the modular arithmetic introduced in 1801 by Carl Friedrich Gauss, so I imagined that it must have good characteristics (perhaps not well known or widespread) for generating partitions. In my research I found that the Modulus partition is used in tables that have a key column of integer type (say, idColumn), since the Modulus operation is performed on the numbers stored there.

    How does this operation work? The Modulus operation is performed between each number in idColumn and the number of processing nodes; it simply returns the remainder of the division between the two numbers. For example, with idColumn = 1815 and n = number of processing nodes = 50, dividing 1815 by 50 leaves a remainder of 15, so the row whose idColumn = 1815 is sent to node 15. With idColumn = 300, dividing 300 by 50 leaves a remainder of 0, so that row is sent to node 0. With idColumn = 65, dividing 65 by 50 leaves a remainder of 15, so that row is sent to node 15, the same set as idColumn = 1815, and so on. Since n = 50, we get the sets 0, 1, 2, 3, ..., 48, 49, because the remainders of integer division by 50 can only be 0, 1, 2, 3, ..., 49. All our data ends up distributed across these sets.

    I found the following in the book InfoSphere DataStage for Enterprise XML Data Integration: Like Hash, the partition size of Modulus partitioning is equally distributed as long as the data values in the key column are equally distributed. Because Modulus partitioning is simpler and faster than Hash, it provides a performance improvement in situations where you have a single integer key column. There is more:
    For Hash partitioning, in the situation where the number of unique key values is low, you can get partition skew, where one partition receives a much larger percentage of rows than other partitions. Skew negatively affects performance. One way of correcting this partition skew is to add an additional key column…

    In my experience, most tables contain a key column of integer type, so we would use the Modulus partition; if a table does not have one, we create a column of this type and so avoid the problems that can come with using the Hash partition.
    It is a small contribution to the great work you have done in making these videos. I love them. Thank you very much, dear.
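
A minimal sketch of the remainder arithmetic described in the comment above, written in Python purely for illustration. It is not DataStage code; `modulus_partition` and `NUM_NODES` are hypothetical names, and the key ranges in the second half are made-up sample data chosen to show an even and a skewed distribution.

```python
from collections import Counter


def modulus_partition(key: int, num_nodes: int) -> int:
    """Return the node index for an integer key: the remainder of key / num_nodes."""
    return key % num_nodes


NUM_NODES = 50  # assumed node count, taken from the comment's example

# The three example keys from the comment.
for id_column in (1815, 300, 65):
    print(id_column, "-> node", modulus_partition(id_column, NUM_NODES))

# Evenly spread keys (1..1000) give every node the same number of rows.
even = Counter(modulus_partition(k, NUM_NODES) for k in range(1, 1001))

# Keys that are all multiples of 50 land on a single node: partition skew.
skewed = Counter(modulus_partition(k, NUM_NODES) for k in range(0, 5000, 50))

print("distinct partition sizes (even keys):", sorted(set(even.values())))
print("partition sizes (skewed keys):", dict(skewed))
```

The second half illustrates the book's caveat: partition sizes are only as even as the key values themselves, so a key column whose values share a common factor with the node count can leave most nodes idle.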

  • @tuy7minh
    @tuy7minh 9 years ago

    Excellent video. I admire your professionalism in explaining every detail of DataStage.

  • @aartishukla
    @aartishukla 10 years ago +2

    Very clear video... I like all your videos... thanks for sharing your knowledge...

    • @datastagetutorials
      @datastagetutorials  10 years ago

      Thank you so much, Aarti!! I'm really glad you liked all my videos. Keep following my channel, TUTORIAL, for more new videos!!!

    • @jagadeeshsapireddy8308
      @jagadeeshsapireddy8308 10 years ago

      Tutorial, hello, I'm eagerly waiting for your uploads. When will you upload?

    • @datastagetutorials
      @datastagetutorials  10 years ago

      Hey Jagadeesh, thanks for your interest. Though I work full-time, I pull out some time to make these videos. Even then, I uploaded 29 videos in just 45 days. Please understand that!! :)

  • @destructeurman1
    @destructeurman1 10 years ago

    I'm new to DataStage and this video helped a lot, thanks!

  • @shirsendubasu8246
    @shirsendubasu8246 7 years ago +2

    Ma'am, can you please explain the different scenarios and which partitioning technique to use in each? I guess that is not very clear. Thanks!!

  • @303deewana
    @303deewana 8 years ago +2

    Why do we need the other partitioning techniques when Auto partitioning is available to choose the best technique for any specific stage? If you choose Hash or Auto for an Aggregator stage (just as an example), should the performance be the same?

    • @bouchard71
      @bouchard71 7 years ago

      This is for beginners.

  • @Rajusrv
    @Rajusrv 10 years ago +1

    In explaining RCP, you used the word "record" many times; as far as I know, you should say "column" instead. For example, you said you have 10 records and want only 4 records to pass to the next stage, when you should say that you have 10 columns and want data from only 4 columns in the target. Please clarify. Thanks for the video.

  • @satyajeetroy7260
    @satyajeetroy7260 7 years ago +5

    Hey, I just love your voice... truly

  • @Sayon____bhattacharjee
    @Sayon____bhattacharjee 4 years ago

    Dear Ma'am, thanks for your effort; it is highly appreciated. I think the way the video explains things could improve a lot. The way you explain is very confusing.

  • @shivkumargowda1722
    @shivkumargowda1722 10 years ago +1

    Please give us an example for RCP; it went a bit over my head. Also, please explain when to use key-based and keyless partitioning, with examples.

  • @manojgunda
    @manojgunda 8 years ago +1

    Hash partitioning is well explained

  • @rameshgandham7058
    @rameshgandham7058 10 years ago +1

    RCP is not clear. How does it work: based on columns or records?

  • @nitingupta4314
    @nitingupta4314 9 years ago

    Fantastic video, Ruchika... Great job :) Keep it up!!

  • @dineshcharan5717
    @dineshcharan5717 9 years ago

    You are a DataStage guru!! Amazing work...

  • @srividhyakannan4379
    @srividhyakannan4379 9 years ago +1

    Please explain how to sort on multiple columns

  • @chad280
    @chad280 10 years ago +1

    Very good video, but the Set option for Preserve Partitioning is not clear. Can you please tell us what this Set option is for?

    • @datastagetutorials
      @datastagetutorials  10 years ago

      Honestly, I don't have a clear explanation for that. That option doesn't really make sense to me. I'm sorry & thanks for watching!! :)

  • @814531
    @814531 10 years ago +1

    What is the difference between Same and Entire partitioning?

    • @datastagetutorials
      @datastagetutorials  10 years ago

      I thought I made it pretty clear in the video. Please do watch it one more time. I can't come up with better words right now to make it any clearer than it is in the video.

  • @phanipvd
    @phanipvd 10 years ago

    Thanks for the videos. I have a question: is RCP at the record level or the column level?

  • @MohibAlvi
    @MohibAlvi 8 years ago +1

    Great video!!!!!!!!!!!!

  • @NSuneel
    @NSuneel 10 years ago +1

    Hi, a small doubt.
    I have a job: seq ----> Tx ----> DS
    How many processors will it take? I think we can find out by using APT_DUMP_SCORE.

    • @datastagetutorials
      @datastagetutorials  10 years ago +4

      Yup... APT_DUMP_SCORE gives all the information regarding how the data is partitioned, node info, operator info, etc.
      I'll come to that part once we finish all the basic stages. I'm thinking about making 'Advanced DS Tutorial Videos'. I hope you guys will appreciate them the same way!! :)

  • @amulraj4880
    @amulraj4880 10 years ago +1

    super

    • @datastagetutorials
      @datastagetutorials  10 years ago

      Thanks, Amul. Good to see that you liked this video, except for the RCP part. :)

  • @amulraj4880
    @amulraj4880 10 years ago +1

    wow

  • @commonman3685
    @commonman3685 3 years ago

    Hanumanthu is fan of Rushika ❤️❤️❤️

  • @ramakrishnavelamuri4125
    @ramakrishnavelamuri4125 6 years ago

    Too lengthy a session; I got bored from the middle onward. Divide it into 2 parts.

  • @abdulraheem2874
    @abdulraheem2874 8 years ago

    Hi, do you have any video that shows how to work with CDC in DataStage?

  • @julianeccleshall8397
    @julianeccleshall8397 10 years ago

    Please, please, please keep uploading new videos!!!

  • @swethav2744
    @swethav2744 5 years ago +1

    The content was good, but the explanation needs more practice; many times it is annoying.

  • @probikash
    @probikash 4 years ago +1

    Or-k-straight.

  • @drawingwithrachel2847
    @drawingwithrachel2847 4 years ago

    Rhema ggarac

  • @amulraj4880
    @amulraj4880 10 years ago

    wowwwwwwwwwwwwwwwwwwwwwwwww

  • @krishnakireetithallam7463
    @krishnakireetithallam7463 5 years ago +1

    Convey the concept clearly... you're always confusing.

  • @kspkr
    @kspkr 9 years ago +2

    LOL (ambient noises) dogs are barking at 15:49

  • @amulraj4880
    @amulraj4880 10 years ago +1

    RCP is not clear

    • @datastagetutorials
      @datastagetutorials  10 years ago

      Ohhh!! I'll try repeating it in another video and try to make it clearer this time. Thanks for bringing it to my notice. :)

    • @amulraj4880
      @amulraj4880 10 years ago

      Tutorial, you're welcome.
      When is the next release...