Optimizing your equals() methods with Pattern Matching - JEP Cafe #21

Java

Views: 16,163


Published on 29 Sep 2024

Comments • 62

@nuclearscissors 10 months ago ⁺²⁷
José, you're a beacon of light for the Java community! You (when it comes to Java) and Josh Long (when it comes to Spring Boot) never cease to amaze me with your interesting and in-depth explanations of all kinds of modern stuff in the Java world. Please never stop!
@JosePaumard 10 months ago ⁺⁶
Thank you for your kind words, I really appreciate them! And I'll pass the compliment to Josh, I'm sure he will appreciate it also.
@HenningPottker 10 months ago
Thanks for this great presentation! It's really amazing how it goes into small details while the overall structure remains absolutely clear. The consistent pacing, which keeps up the interest and makes it easy to follow for more than 30 minutes, is true craftsmanship.
It would have been interesting to also see the equals method that is generated by Lombok. It's of course impossible to measure in a reliable way, but I wouldn't be surprised if there are more Lombok-generated than IDE-generated equals methods out there in the world of Java business applications.
@reinhapa 10 months ago ⁺⁴
Hi José, you correctly point out that one has to measure and not guess. I know that a lot of mistakes can be made when writing JMH tests too. Could you also show the important parts of those tests in your examples? I think this could be very informative. Besides that, it is always a great source of information!
@JosePaumard 10 months ago ⁺⁴
Of course. There is no trick, the code is in the video at 15:18. All the classes I use are records with the equals() methods that are also shown. So there are 9 record classes with the different equals() methods. And then two runs, one with no glitch (4 data sets) and another one with glitches (6 data sets).
The JMH configuration is the following: 5 warmup iterations, 10 measurement iterations, fork is 3, and each iteration is 400 milliseconds. All this depends on your machine and the errors you have. If the error is too high, then you need to either have longer runs, or increase the measurement iterations.
What may change your result is the locality of your data. These are records, so rather small objects, and I use ArrayList to store them. So the locality shouldn't be too bad.
@svalyavasvalyava9867 10 months ago ⁺¹
Brilliant video, as always. Thank you!
@ZemenFidel 10 months ago ⁺²
The Apache Commons Builder-based code is hardly more "readable" and definitely seems about 20 times slower than the rest.
@NachtmahrNebenan 10 months ago ⁺¹
Any dependency is a potential source of bugs, attack vectors, or end of life problems like using deprecated or even removed API.
7 months ago ⁺²
Take care, you are posting a lot of videos, that is plenty of coffee... Anyway, thank you for the great content!
@prdoyle 10 months ago ⁺¹²
José I also wanted to thank you for the cafe theme here. I'm in a position where I still need to avoid actual cafes due to Covid, and I really miss real cafe chats with colleagues. I know it's just a gimmick, but the cafe setting really resonates with me.
@JosePaumard 10 months ago ⁺³
Thank you, and sorry to read that. I hope you'll be better soon enough!
@Talaria.School 10 months ago ⁺⁵
Thanks a lot for this material, really relevant. Hats off to Mister José Paumard.
@JosePaumard 10 months ago ⁺¹
Thank you Khaled! 👍
@khmarbaise 10 months ago ⁺⁵
Many thanks José for these informative explanations. Great video again.
@JosePaumard 10 months ago ⁺²
Thank you!
@UHecker 3 months ago
I like the format of the JEP Café, so first of all thank you for all the input you gave me for my daily work. In the last months the videos have become longer and longer, so now it is more of a lunch break than a coffee break for me... It is a bit difficult for me to spend half an hour during work watching a video, whereas I can often take ten minutes for a "coffee break". Dear José and team, do you think it might be possible to go "back to the roots" and make shorter JEP Café videos?
@brixomatic 10 months ago
Oh, dear... José, I find it mildly dangerous that you're hyping versions of the equals method that break its contract.
Example: if B extends A, and B just inherits the fields of A and only adds functionality, then with instanceof or pattern matching, objects of class B can be equal to objects of class A, but not the other way round, which is a breach of the equals contract and can cause hard-to-find bugs, for example in searching and sorting algorithms.
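The concern above can be reproduced with a small sketch. The classes A and B are hypothetical and only follow the naming of the comment. Note the assumption made here: the asymmetry shows up when B overrides equals() with its own, narrower instanceof B check. If B simply inherits equals() from A unchanged, the comparison stays symmetric.

```java
// Hypothetical classes, named after the A/B example in the comment above.
class A {
    protected final int x;

    A(int x) {
        this.x = x;
    }

    @Override
    public boolean equals(Object o) {
        // The type pattern matches A and every subclass of A.
        return o instanceof A other && x == other.x;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(x);
    }
}

class B extends A {
    B(int x) {
        super(x);
    }

    @Override
    public boolean equals(Object o) {
        // Assumption for this sketch: B narrows the check to B only.
        return o instanceof B other && x == other.x;
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        A a = new A(1);
        B b = new B(1);
        System.out.println(a.equals(b)); // true: b is an instance of A
        System.out.println(b.equals(a)); // false: a is not an instance of B
    }
}
```

Here a.equals(b) and b.equals(a) disagree, which is the symmetry violation the equals contract forbids.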
@SterileNeutrino 9 months ago
Remember when we had to implement the hashmap by hand in Modula-2 at uni? So long ago. No maps, no lists in the language. Only arrays. Now we have AI-assisted code generation... which I'm gonna try soon... I'm not trusting it but let's see what the little stochastic parrot tells me ...
@ZelenoJabko 9 months ago
It's a bad Java design that every object has hashCode. It should be an interface called Hashable, and then Set requiring an instance of hashable.
Right now, you have to implement hashCode defensively, just in case somebody is going to put it into a hash set.
@laurentjeanpierre3662 10 months ago ⁺¹
Hi José. Thanks for this valuable insight. I wonder if the same result would be achieved with a more complicated object to compare (with Strings maybe?). Then first checking for equality may be wise.
@dirkj.3234 10 months ago ⁺¹
You are comparing apples with oranges when you compare implementations of equals that check for same class and others that use instanceof. This will give different results when you have subclasses!
@JosePaumard 10 months ago ⁺²
We are using records here, so no subclass. And yes, when you subclass a class that has an equals() method, you should always carefully check it, and override it when needed. Replacing instanceof with a class check may look like you are solving your problem, but if you add state in your subclass, it will probably not.
@dirkj.3234 10 months ago ⁺¹
@JosePaumard You are of course right for the records 👍🏻
For ordinary classes, I learned to compare the classes themselves if there may be subclasses. If an instance of a subclass is equal to an instance of its superclass, it may break the transitivity of equals. So only use instanceof in final classes.
@JosePaumard 10 months ago ⁺¹
@dirkj.3234 I agree, this is the kind of thing you need to have in mind when you are designing your object model.
If you cannot have a final class for some reason, then you can also protect yourself by making your equals() method final (this is what is done with the JEP). But it's still a weak protection, as someone else can easily remove it. And you'll end up with objects that are equal when they are not of the same type.
What's important to keep in mind imho is that instanceof (and pattern matching) do not only check the exact type, they are also true for subtypes. Learning a solution is nice, but it's better to understand the root cause of the problem ;)
@dirkj.3234 10 months ago
@JosePaumard Thanks. I'm pretty sure that I know how it's working 😉
instanceof gives true for every instance of a subclass and that can be a problem if the subclass contains additional fields.
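For comparison, here is a minimal sketch of the class-comparison approach mentioned in this thread, where equals() uses getClass() instead of instanceof. BasePoint and ColoredPoint are hypothetical names chosen for illustration, and this is only one possible way to write it.

```java
import java.util.Objects;

// Hypothetical non-final class whose equals() compares runtime classes.
class BasePoint {
    private final int x;
    private final int y;

    BasePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        // getClass() is resolved on the runtime type, so a subclass
        // instance is never equal to a plain BasePoint.
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        BasePoint other = (BasePoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

// Hypothetical subclass that adds state.
class ColoredPoint extends BasePoint {
    private final String color;

    ColoredPoint(int x, int y, String color) {
        super(x, y);
        this.color = Objects.requireNonNull(color);
    }

    @Override
    public boolean equals(Object o) {
        // super.equals() already guarantees that o is a ColoredPoint.
        return super.equals(o) && color.equals(((ColoredPoint) o).color);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + color.hashCode();
    }
}

class ClassCheckDemo {
    public static void main(String[] args) {
        BasePoint p = new BasePoint(1, 2);
        ColoredPoint c = new ColoredPoint(1, 2, "red");
        System.out.println(p.equals(c)); // false: different runtime classes
        System.out.println(c.equals(p)); // false: symmetric with the line above
    }
}
```

The trade-off is that a subclass instance can then never be equal to a superclass instance, even when the subclass adds no state.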
@slr150 10 months ago ⁺¹
Is try-catch as expensive as if statements? What happens if you eliminate all the if statements?
public boolean equals(Object o) {
    try {
        Point p = (Point) o;
        return p.x == x && p.y == y;
    } catch (ClassCastException e) {
        return false;
    }
}
@JosePaumard 10 months ago ⁺²
It's usually more expensive. You can try to bench it though.
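Independently of cost, the cast-based version differs in behavior from an instanceof check: casting null succeeds, so equals(null) fails with a NullPointerException instead of returning false as the equals contract requires. The sketch below uses a hypothetical CastPoint class that mirrors the snippet in the question to show this.

```java
// Hypothetical class mirroring the try/catch equals() from the question.
final class CastPoint {
    private final int x;
    private final int y;

    CastPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        try {
            CastPoint p = (CastPoint) o;   // casting null does not throw
            return p.x == x && p.y == y;   // throws NullPointerException if o was null
        } catch (ClassCastException e) {
            return false;                  // reached only for non-null objects of another type
        }
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

class CastDemo {
    public static void main(String[] args) {
        CastPoint p = new CastPoint(1, 2);
        System.out.println(p.equals("not a point")); // false, via the catch block
        try {
            p.equals(null);
        } catch (NullPointerException e) {
            System.out.println("equals(null) threw NullPointerException");
        }
    }
}
```

An instanceof test or a type pattern handles null for free, because null is never an instance of any type.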
@prdoyle 10 months ago ⁺²
The object identity check is important for larger records, especially if they contain nested records. Checking the contents would be O(n) in the total number of fields, while checking identity is O(1). I suspect checking the identity is only slow with records composed entirely of primitive fields.
@christianschafer3724 10 months ago ⁺¹
That's a good point. It would be great to have the source code for this benchmark so it would be easy to extend it to more complex objects and see if it makes a difference.
@beckerdo 10 months ago ⁺¹
Great deep dive! Who would have thought that the equals method, visited by so many people, could have new insights? Thank you.
@privettoli 10 months ago ⁺²
Optimizing for what?
@corinnarust 10 months ago ⁺¹
Pattern matching is faster than ifs
@privettoli 10 months ago ⁺²
@corinnarust sounds like an opportunity for compiler optimization.
@SWinxyTheCat 10 months ago
My equals methods are always of the form of return this == obj || obj instanceof MyClass other && x == other.x && y.equals(other.y);
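Spelled out in a complete class, the one-line form above could look like the following sketch. MyClass comes from the comment. The field types, the final modifier, and the non-null requirement on y are assumptions added for illustration.

```java
import java.util.Objects;

// Sketch of the single-expression equals() described above.
// Assumptions: x is an int, y is a non-null String, and the class is final.
final class MyClass {
    private final int x;
    private final String y;

    MyClass(int x, String y) {
        this.x = x;
        this.y = Objects.requireNonNull(y);
    }

    @Override
    public boolean equals(Object obj) {
        // && binds tighter than ||, so this reads as:
        // identity OR (type pattern AND field comparisons).
        return this == obj
                || obj instanceof MyClass other && x == other.x && y.equals(other.y);
    }

    @Override
    public int hashCode() {
        return 31 * Integer.hashCode(x) + y.hashCode();
    }
}

class OneLinerDemo {
    public static void main(String[] args) {
        MyClass a = new MyClass(1, "one");
        System.out.println(a.equals(new MyClass(1, "one"))); // true
        System.out.println(a.equals(new MyClass(2, "one"))); // false
        System.out.println(a.equals(null));                  // false
    }
}
```

Whether the leading identity check pays off is exactly what the video measures, so it is worth benchmarking on your own data.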
@VuLinhAssassin 10 months ago ⁺¹
I hope there is a dedicated section on JPA entities too. I've been following JPA Buddy's advice when generating equals/hashCode, but would José deliver that for us too?
@JosePaumard 10 months ago ⁺¹
Sorry to disappoint you but no, this point is not covered.
@VuLinhAssassin 10 months ago ⁺¹
@JosePaumard Then would you be able to cover that one day?
@JosePaumard 10 months ago ⁺²
@VuLinhAssassin It's mostly an ill-posed problem, because of the life cycle of an entity, and because you can observe all the steps of this life cycle. So I'm not sure that there is any satisfying answer to that question.
For instance: you create an entity, its primary key is not set yet. At some point its primary key is set. For some reason you need to store this entity in a HashSet. If you add it before it has its primary key set, and check if it's there with a contains() when its primary key has been set, you'll be happy not to have taken into account the primary key in the equals / hashCode implementations. Is this what you would expect?
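The scenario can be reproduced without any JPA dependency. In this sketch, IdEntity is a hypothetical plain class standing in for an entity, and it deliberately includes the primary key in equals() and hashCode() to show what then happens with a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for an entity whose equals()/hashCode() use the primary key.
class IdEntity {
    private Long id; // null until the entity is "persisted"

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof IdEntity other && Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Set<IdEntity> entities = new HashSet<>();
        IdEntity entity = new IdEntity();
        entities.add(entity);   // stored under the hash computed while id == null
        entity.setId(42L);      // the hash code changes after insertion
        System.out.println(entities.contains(entity)); // false
        System.out.println(entities.size());           // 1: the element is still in the set
    }
}
```

The element is still in the set, but contains() looks it up under its new hash code and does not find it.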
@VuLinhAssassin 10 months ago
@JosePaumard The JPA Buddy plugin suggested that I use the hashCode of the class (it can be a normal entity class or a Hibernate proxy class, so a check instanceof HibernateProxy is needed), like getClass().hashCode(). What do you think of this implementation?
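As a rough illustration of the idea described in this question, the sketch below uses a class-based hash code on a hypothetical StableEntity class. It is not the code JPA Buddy generates: the HibernateProxy handling is left out entirely, and the shape of equals() is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity-like class with a hash code that never changes.
// Assumption: proxy handling (HibernateProxy) is omitted from this sketch.
class StableEntity {
    private Long id; // null until the entity is "persisted"

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public final boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        // Two distinct instances are equal only once both have the same non-null id.
        return o instanceof StableEntity other && id != null && id.equals(other.id);
    }

    @Override
    public final int hashCode() {
        // Same value for every instance of the class, independent of the id.
        return getClass().hashCode();
    }
}

class StableHashDemo {
    public static void main(String[] args) {
        Set<StableEntity> entities = new HashSet<>();
        StableEntity entity = new StableEntity();
        entities.add(entity);
        entity.setId(42L);
        System.out.println(entities.contains(entity)); // true: the hash code did not change
    }
}
```

The price is that every instance of the class lands in the same hash bucket, so large hash-based collections of such objects lose most of the benefit of hashing.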
@desoroxxx 10 months ago ⁺⁶
The problem is that branch predictors vary greatly across CPUs, so it would have been great to run all of this on different CPUs from different vendors.
@thiagohenriquehupner1164 10 months ago
With these numbers, would it be better to remove the instance equals check in the default record implementation?
Source: jdk/src/java.base/share/classes/java/lang/runtime/ObjectMethods.java:225
@loic.bertrand 10 months ago ⁺¹
2:58 In this example, we iterate on an empty set and add its own elements to it, isn't this weird?
@JosePaumard 10 months ago ⁺¹
😆 Indeed it is. Maybe a this somewhere could fix that?
@prdoyle 10 months ago
Skipping the instance check can also be valuable when memory is tight. If you use interning on your records, then identical records are represented by the same object (which can save memory) and then the instance check will not only succeed more often, but will also avoid loading the fields from memory, thereby reducing cache footprint.
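One way to picture the interning mentioned above is a canonicalizing factory. The sketch below is only an assumption about how that could be done: the Coord record, its of() factory, and the pool are hypothetical names, and the pool never evicts entries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interned record: of() returns one canonical instance per value.
record Coord(int x, int y) {

    // Pool of canonical instances. It never evicts, which a real
    // application would probably need to address.
    private static final Map<Coord, Coord> POOL = new ConcurrentHashMap<>();

    static Coord of(int x, int y) {
        Coord candidate = new Coord(x, y);
        Coord existing = POOL.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }
}

class InterningDemo {
    public static void main(String[] args) {
        Coord a = Coord.of(1, 2);
        Coord b = Coord.of(1, 2);
        System.out.println(a == b);      // true: both names refer to the pooled instance
        System.out.println(a.equals(b)); // true
    }
}
```

With instances obtained through of(), equal values share a single object, which is the situation where an identity check can succeed often.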
@lapissea1190 10 months ago
I appreciate the attention to the lower-level performance considerations of Java. A lot of people seem to neglect that and just leave it to "the JIT will fix it" without actually understanding what the JIT does.
@delanym 10 months ago
Does the number of instance fields matter for comparing these implementations?
@JosePaumard 10 months ago ⁺⁵
It could, it really depends on the complexity of the states you need to compare. Especially when you compare objects that are equal. In that case, you execute all the tests before knowing that the result is true.
@cosmowanda6460 10 months ago
How do you measure performance? Was that a dependency or?
@JosePaumard 10 months ago ⁺³
You want to use JMH for that -> github.com/openjdk/jmh
@prdoyle 10 months ago
I think the hash bucket becomes a tree only if the value type implements Comparable. Otherwise you're stuck with a list.
@JosePaumard 10 months ago ⁺²
No, it also works with keys that are not Comparable. It uses System.identityHashCode() in that case.
@prdoyle 6 months ago
@JosePaumard Thanks! I stand corrected.
@Speiger 10 months ago
Yeah, I have seen this example before, where checking if (this == o) is just slower.
My mentality is fairly simple in that regard.
Simply do the checks manually in your head, and if it seems slower, then you want to compare.
Though at some point these few nanoseconds do not make a difference anymore.
If you are comparing collections, it is a good idea to have this slower check in place, because it can save you milliseconds if the size is large enough or the comparison logic is just slower due to the type of implementation. (Sets are faster to compare than lists, for example.)
@JosePaumard 10 months ago ⁺²
The problem is that branch prediction is not the only issue. this == other also messes with your GC. So for large objects and in a real application, things get more complicated, and not in favor of the instance check.
@Speiger 10 months ago ⁺¹
@JosePaumard How does it mess with your GC? Could you elaborate on that?
Well yeah, in most real-world applications it doesn't help, but there are scenarios where the small cost of the check pays off if the comparison of the object itself is expensive enough.
@JosePaumard 10 months ago
@Speiger th-cam.com/video/tWonozjIE-s/w-d-xo.html There is a more elaborate answer in a talk we did in French.
@DrissOutraah 10 months ago
Jaava 🎉
@luisdanielmesa 10 months ago
Why not just return (this == other || (other instanceof Point(int x, int y) && this.x == x && this.y == y))? There's no branching, just short-circuiting.
@VuLinhAssassin 10 months ago ⁺²
You can just do the branching here, and trust me, the compiler WILL optimize it for you, that's why you will see your code and decompiled code different, for example, turning preemptive return into nested if-else.
@onchuner 10 months ago
Why not add support for operator overloading?
