Parallelism & Partitioning Techniques : Video 7 (HD)

Tutorial

78,631 views

Published on 7 Nov 2024

Comments • 46

@ethanguan8727 10 years ago ⁺¹
These videos are great. I'm lucky to have found them here. They make my day meaningful. Thank you.
@harpalegaurav 5 years ago ⁺¹
Ruchika, really awesome tutorial... very good.
@user-th3cj3zq6v 4 years ago
Hi Ruchika, I am Ana Lilia Armas Martínez, and I want to share the following with all of you. The Modulus partition caught my attention because it comes from the modular arithmetic introduced in 1801 by Carl Friedrich Gauss, so I imagined it must have good (perhaps not widely known) characteristics for generating partitions.
In my research I found that Modulus partitioning is used on tables that have a key column of integer type, say idColumn, since the modulus operation is performed on the numbers stored there.
How does this operation work? The modulus is taken between each value in idColumn and the number of processing nodes; it simply returns the remainder of the division of the two numbers. For example, with idColumn = 1815 and n = 50 processing nodes, dividing 1815 by 50 leaves a remainder of 15, so the row with idColumn = 1815 is sent to node 15. With idColumn = 300, dividing 300 by 50 leaves a remainder of 0, so that row is sent to node 0. With idColumn = 65, dividing 65 by 50 leaves a remainder of 15, so that row is sent to node 15, the same set as idColumn = 1815, and so on.
Since n = 50, we get the sets 0, 1, 2, 3, ..., 48, 49, because the remainder of an integer division by 50 can only be 0, 1, 2, 3, ..., 49. All our data is distributed among these sets.
I found the following in the book InfoSphere DataStage for Enterprise XML Data Integration: Like Hash, the partition size of Modulus partitioning is equally distributed as long as the data values in the key column are equally distributed. Because Modulus partitioning is simpler and faster than Hash, it provides a performance improvement in situations where you have a single integer key column. There is more:
For Hash partitioning, in the situation where the number of unique key values is low, you can get partition skew, where one partition receives a much larger percentage of rows than other partitions. Skew negatively affects performance. One way of correcting this partition skew is to add an additional key column…
Well, in my experience, most tables contain a key column of integer type, so we would use the Modulus partition; if a table does not have one, we create a column of this type and avoid the possible problems of using the Hash partition.
It is a small contribution to the great work you have done in making these videos, I love them. Thank you very much dear.
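The arithmetic in the comment above can be checked with a few lines of Python. This is only a minimal sketch of the remainder rule the commenter describes, not DataStage's actual partitioner; the function name `modulus_partition` and the constant `NUM_NODES` are made up for illustration. It also shows why the distribution is only even when the key values themselves are evenly spread.

```python
from collections import Counter

NUM_NODES = 50  # hypothetical number of processing nodes, as in the comment


def modulus_partition(key: int, num_partitions: int) -> int:
    """Return the partition index: the remainder of key divided by num_partitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return key % num_partitions


# The three keys worked through in the comment.
for key in (1815, 300, 65):
    print(key, "->", modulus_partition(key, NUM_NODES))

# Evenly spread keys 0..9999 land evenly across all 50 partitions.
even_counts = Counter(modulus_partition(k, NUM_NODES) for k in range(10_000))

# Keys that are all multiples of 50 all land on partition 0 (skew).
skewed_counts = Counter(modulus_partition(k, NUM_NODES) for k in range(0, 10_000, 50))

print("partitions used (even keys):", len(even_counts))
print("partitions used (skewed keys):", len(skewed_counts))
```

Keys 1815 and 65 share partition 15 and key 300 goes to partition 0, matching the worked example. The second half illustrates the caveat quoted from the book: when the key values are not evenly distributed, the rows are not either.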
@303deewana 8 years ago ⁺²
Why do we need the other partitioning techniques when Auto partitioning is available to choose the best technique for any specific stage? If you choose Hash or Auto for the Aggregator stage (just as an example), should the performance be the same?
@bouchard71 7 years ago
This is for beginners.
@aartishukla 11 years ago ⁺²
Very clear video. I like all your videos. Thanks for sharing your knowledge.
@datastagetutorials 11 years ago
Thank you so much Aarti!! I'm really glad you liked all my videos. Keep following my channel-TUTORIAL for more new videos!!!
@jagadeeshsapireddy8308 11 years ago
Tutorial, hello, I'm eagerly waiting for your uploads. When will you upload?
@datastagetutorials 11 years ago
Hey Jagadeesh, thanks for your interest. Though I work full-time, I pull out some time to make these videos. Even then, I uploaded 29 videos in just 45 days. Please understand that!! :)
@shirsendubasu8246 7 years ago ⁺²
Ma'am, can you please explain the scenarios in which each partitioning technique should be used? I guess that is not very clear. Thanks!!
@tuy7minh 9 years ago
Excellent video. I admire your professionalism in explaining every detail of DataStage.
@destructeurman1 10 years ago
I'm new to DataStage and this video helped a lot, thanks!
@chad280 11 years ago ⁺¹
Very good video, but the Set option for Preserve Partitioning is not clear. Can you please tell us what this Set option is for?
@datastagetutorials 11 years ago
Honestly, I don't have a clear explanation for that. That option doesn't really make sense to me. I'm sorry & thanks for watching!! :)
@satyajeetroy7260 7 years ago ⁺⁵
Hey, I just love your voice... truly.
@Sayon____bhattacharjee 4 years ago
Dear Ma'am, thanks for your effort; it is highly appreciated. I think the way the video is explained could improve a lot. The way you are explaining is very confusing.
@Rajusrv 10 years ago ⁺¹
In explaining RCP, you used the word "record" many times; as far as I know, you should use "column" instead. For example, you said you have 10 records and want only 4 records to pass to the next stage, when you should say that you have 10 columns and want data from only 4 columns in the target. Please clarify. Thanks for the video.
@dineshcharan5717 10 years ago
You are a DataStage guru!! Amazing work...
@814531 11 years ago ⁺¹
What is the difference between Same and Entire partitioning?
@datastagetutorials 11 years ago
I thought I made it pretty clear in the video. Please watch it one more time. I can't come up with better words right now to make it clearer than it is in the video.
@shivkumargowda1722 10 years ago ⁺¹
Please give us an example for RCP; it was a bit of a bouncer. Also, when should key-based and keyless partitioning be used, with examples?
@srividhyakannan4379 10 years ago ⁺¹
Please explain how to sort on multiple columns.
@rameshgandham7058 10 years ago ⁺¹
RCP is not clear. How does it work: based on columns or records?
@phanipvd 11 years ago
Thanks for the videos. I have a question: is RCP at the record level or the column level?
@manojgunda 9 years ago ⁺¹
Hash partitioning is well explained.
@commonman3685 3 years ago
Hanumanthu is a fan of Rushika ❤️❤️❤️
@nitingupta4314 9 years ago
Fantastic video, Ruchika... Great job :) Keep it up!!
@MohibAlvi 9 years ago ⁺¹
Great video!!!!!!!!!!!!
@NSuneel 11 years ago ⁺¹
Hi, a small doubt.
I have a job: seq---->Tx----->DS
How many processors will it take? I think we can find out by using APT_DUMP_SCORE.
@datastagetutorials 11 years ago ⁺⁴
Yup... APT_DUMP_SCORE gives all the information regarding how the data is partitioned, node info, operator info, etc.
I'll come to that part once we finish all the basic stages. I'm thinking about making 'Advanced DS Tutorial Videos'. Hope you guys will appreciate me the same way!! :)
@amulraj4880 11 years ago ⁺¹
super
@datastagetutorials 11 years ago
Thanks Amul. Good to see that you liked this video except the RCP part. :)
@abdulraheem2874 8 years ago
Hi, do you have any video that shows how to work with CDC in DataStage?
@ramakrishnavelamuri4125 7 years ago
Too lengthy a session; I felt bored from the middle. Divide it into 2 parts.
@amulraj4880 11 years ago ⁺¹
wow
@probikash 4 years ago ⁺¹
Or-k-straight.
@julianeccleshall8397 10 years ago
Please, please, please keep uploading new videos!!!
@kspkr 9 years ago ⁺²
LOL (ambient noises) dogs are barking at 15:49
@amulraj4880 11 years ago ⁺¹
RCP is not clear
@datastagetutorials 11 years ago
Ohhh!! I'll try repeating it in some other video & try to make it clearer this time. Thanks for bringing it to my notice. :)
@amulraj4880 11 years ago
Tutorial, you're welcome.
When is the next release?
@swethav2744 6 years ago ⁺¹
The content was good, but the explanation needs more practice; many times it is annoying.
@krishnakireetithallam7463 5 years ago ⁺¹
Convey the concept clearly. You're always confusing.
@amulraj4880 11 years ago
wowwwwwwwwwwwwwwwwwwwwwwwww
@drawingwithrachel2847 4 years ago
Rhema ggarac