Rethinking Java String Concatenation

  • Published on 7 Nov 2024

Comments • 27

  • @norbu_la
    @norbu_la 2 months ago +13

    Skimming through this, I hadn't realized how little I know about the simplest things.

  • @mjduigou
    @mjduigou 2 months ago +10

    I am curious whether you have a histogram of arity usage. If it were me, I would be fine with there being a performance cliff at some point, perhaps at two standard deviations above the mean arity. The cliff serves as a warning to the app developer that they probably should be using a different approach. There has to be a limit to how much effort is spent on making bad code perform well; spend that effort on making the good and average code even faster.

    • @ClaesRedestad
      @ClaesRedestad 2 months ago +4

      I don't have any histograms to share, but most apps I've looked at have a lot of low-arity concats then a long tail. But then there are those apps that generate massive concat expressions on the fly and skew the picture. 😅
      I think the now integrated implementation strikes a good balance. Small expressions inline and optimize *very* well, then as things get larger regular inlining heuristics in the JIT will gracefully degrade performance without ever really falling off a cliff. More a gentle slope.

  • @andmal8
    @andmal8 2 months ago +1

    Thank you!

  • @dispatch-indirect9206
    @dispatch-indirect9206 2 months ago +2

    Interesting talk, thanks. If I'm generating Java code that needs to target JVMs before 23, and might generate a complex concatenation, what's a reasonable threshold to swap out the concatenation operator in favor of emitting a StringBuilder expression?

    • @ClaesRedestad
      @ClaesRedestad 2 months ago +2

      20? 😊 I'd suggest measuring. Appropriate thresholds might be different on some older JDK versions and on non-HotSpot VMs, so making it configurable is probably good.

    • @dispatch-indirect9206
      @dispatch-indirect9206 2 months ago

      @ClaesRedestad Thanks, will definitely measure, but my guess was pretty close and it's nice to confirm you're seeing the blowup about where I thought it might be.
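
A minimal sketch of the configurable threshold discussed in this exchange, for a code generator that emits Java source. The class name `ConcatEmitter`, the method `emit`, and the default of 20 are illustrative assumptions (the 20 comes from the tentative suggestion above and should be measured, not trusted). The sketch also assumes each operand is already a self-contained source expression and that the first operand is String-typed, so the `+` chain and the `StringBuilder` chain produce the same result.

```java
import java.util.List;

// Hypothetical helper for a source generator: picks between a `+` chain
// and an explicit StringBuilder chain based on the number of operands.
final class ConcatEmitter {
    // Assumed starting point only; measure on the JDKs/VMs you target.
    static final int DEFAULT_THRESHOLD = 20;

    // operands: source snippets such as "name", "\", \"", "(a + b)".
    // Assumes the first operand is String-typed and each snippet is
    // parenthesized where operator precedence would otherwise matter.
    static String emit(List<String> operands, int threshold) {
        if (operands.isEmpty()) {
            return "\"\""; // an empty string literal
        }
        if (operands.size() <= threshold) {
            // Low arity: let javac and the runtime handle `+`.
            return String.join(" + ", operands);
        }
        // High arity: emit an explicit StringBuilder chain instead.
        StringBuilder src = new StringBuilder("new StringBuilder()");
        for (String operand : operands) {
            src.append(".append(").append(operand).append(')');
        }
        return src.append(".toString()").toString();
    }

    public static void main(String[] args) {
        List<String> ops = List.of("first", "\" \"", "last");
        System.out.println(emit(ops, DEFAULT_THRESHOLD));
        System.out.println(emit(ops, 2));
    }
}
```

Passing the threshold as a parameter keeps it configurable per target, which matches the advice that the right value may differ on older JDK versions and non-HotSpot VMs.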

  • @yaderanibal
    @yaderanibal 2 months ago +4

    I'm a fan of this language, it's my favorite.

  • @VuLinhAssassin
    @VuLinhAssassin 2 months ago

    After all these years, String is still a pain to work with in Java (not counting text blocks).

    • @zappini
      @zappini 2 months ago

      Which languages are better?

    • @VuLinhAssassin
      @VuLinhAssassin 2 months ago

      @zappini The interpolation in .NET looks amazing.

  • @hiEroneta
    @hiEroneta 2 months ago +4

    I hope something like string interpolation comes to Java soon.

    • @alecbg919
      @alecbg919 2 months ago +1

      Look up string templates in Java 21.

    • @_SG_1
      @_SG_1 2 months ago

      I assume that String interpolation would use this "low-level" String concatenation under the hood anyway.

    • @ClaesRedestad
      @ClaesRedestad 2 months ago +6

      @_SG_1 Yes, this was briefly mentioned near the end of this talk. While the templates feature has been pulled out for now, if/when we redo it, it will benefit from the work we presented here, increasing confidence in the runtime side of it.

  • @hkupty
    @hkupty 2 months ago +10

    One of the things I miss in Java is the ability to get immutable views of a string that don't incur an allocation. If I take a substring of an already immutable string, why does it have to allocate? In this sense, Rust's mutable and immutable string types might be a good inspiration, though arguably they would require a better API in Java.

    • @sebastianb7496
      @sebastianb7496 2 months ago +7

      Allocation by the JVM is surprisingly fast already and doing substring this way would probably prevent garbage collection of the original string, so idk...

    • @vasiliigulevich9202
      @vasiliigulevich9202 2 months ago +4

      The substring method used to point into the original String, which caused memory leaks because the substring kept the original String reachable. This was fixed in JDK 1.7. Basically, the behavior you describe is a bug to be fixed, not desired behaviour.
      Search for "substring memory leak".

    • @hkupty
      @hkupty 2 months ago +1

      @vasiliigulevich9202 Funny how what you're calling a bug is a language feature in other languages. This is a false equivalence: the bug in the previous implementation isn't necessarily the only solution to the problem, so you shouldn't treat the idea itself as "a bug to be fixed".
      I'm advocating for a string view, which could be a different class, instead of reusing the same class with an added responsibility.

    • @hkupty
      @hkupty 2 months ago +2

      @sebastianb7496 It really depends on what you're doing and the order of magnitude we're talking about. Operations that happen tens or hundreds of times per request can significantly impact request handling through many tiny allocations and increased GC pressure.

    • @vasiliigulevich9202
      @vasiliigulevich9202 2 months ago +2

      @hkupty A string view has the same problem if it holds a strong reference. Arguably, the language could intercept the garbage collection event and copy the substring whenever the original is collected, but that would degrade GC performance. Languages with strict lifetime control can afford non-owning references, at the cost of dangerous or complex lifetime management.
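
To make the trade-off in this thread concrete, here is a rough sketch of what a separate string-view class could look like in Java. `StringView` is a hypothetical type, not an existing JDK class. It copies no characters, but it holds a strong reference to the whole source string, which is exactly the retention concern raised above. The view object itself is still a small allocation. Requires Java 16+ for records.

```java
import java.util.Objects;

// Hypothetical view type: shares the source's characters instead of copying.
// Trade-off: the entire `source` stays reachable for as long as the view does.
record StringView(String source, int start, int end) implements CharSequence {

    StringView {
        // Validate the range once, at construction time.
        Objects.checkFromToIndex(start, end, source.length());
    }

    @Override
    public int length() {
        return end - start;
    }

    @Override
    public char charAt(int index) {
        Objects.checkIndex(index, length());
        return source.charAt(start + index);
    }

    @Override
    public StringView subSequence(int from, int to) {
        Objects.checkFromToIndex(from, to, length());
        // A view of a view still points at the same source string.
        return new StringView(source, start + from, start + to);
    }

    @Override
    public String toString() {
        // Copying happens only here, when a real String is requested.
        return source.substring(start, end);
    }

    public static void main(String[] args) {
        StringView view = new StringView("hello world", 6, 11);
        System.out.println(view.length() + " " + view);
    }
}
```

Because the copy is deferred to `toString()`, callers that only scan characters avoid it entirely, while callers that need to release the source can materialize a `String` explicitly.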

  • @mchiareli
    @mchiareli 2 months ago +5

    We want String interpolation like every other cool language

  • @Sir_Ray_LegStrong_Bongabong
    @Sir_Ray_LegStrong_Bongabong 2 months ago

    Hello

  • @cmyanmar13
    @cmyanmar13 2 months ago +8

    I hated JEP 280 from the moment I heard about it. And now I know who to blame for it. It's SO OBVIOUSLY an absolutely grotesque abuse of dynamic class generation, SO OBVIOUSLY doomed to bog down startup and waste memory and clog the code cache. It has no redeeming value whatsoever and should be rolled back as the bug it is. Invokedynamic is the hammer that makes everything look like a nail. It prompted me to MANUALLY use StringBuilder more often, knowing that trusting `+` for concatenation is going to suck. Oh what's that, they had to install a workaround in the VM to use StringBuilder after all? No surprise. PATHETIC.

    • @ClaesRedestad
      @ClaesRedestad 2 months ago +18

      Ouch! I had (almost) nothing to do with JEP 280 initially, but picked it up and worked on the implementation after it got integrated, reducing the footprint and runtime impact over the releases. I admit I've thought on a number of occasions, and probably argued internally, that we should roll this back entirely or make it opt-in. Alas. Still, a number of indirect benefits have come out of persevering, including a great number of optimizations in several related and unrelated areas.
      The recent StringBuilder workaround in JDK 23 is ugly, yes, and prompted the prototype rework that this talk concludes with (now integrated in the main OpenJDK repo and slated for JDK 24).
      On the flip side, we now also have a very straightforward path to link- and assembly-time static code generation that will allow pre-generating any string concats while safely getting the peak performance benefits.
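
For readers unfamiliar with what JEP 280 changed: since JDK 9, javac compiles `+` on strings to an `invokedynamic` call site bootstrapped by `java.lang.invoke.StringConcatFactory`. The sketch below calls that bootstrap method by hand to show the moving parts. The class name `ConcatBootstrapDemo` and the `greet` example are illustrative assumptions, and this is not how application code would normally concatenate strings.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatException;
import java.lang.invoke.StringConcatFactory;

// Roughly what a call site for
//   "Hello, " + name + "! You have " + count + " messages."
// links to, done manually for illustration.
final class ConcatBootstrapDemo {
    // Link once and reuse, as an invokedynamic call site would.
    private static final MethodHandle GREET = link();

    private static MethodHandle link() {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    // Shape of the concat: (String, int) -> String
                    MethodType.methodType(String.class, String.class, int.class),
                    // Recipe: \u0001 marks where each dynamic argument goes.
                    "Hello, \u0001! You have \u0001 messages.");
            return site.dynamicInvoker();
        } catch (StringConcatException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String greet(String name, int count) {
        try {
            // invokeExact requires the exact (String, int)String signature.
            return (String) GREET.invokeExact(name, count);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("Ada", 3));
    }
}
```

The linking step is where the startup and footprint costs debated above come from; once linked, the handle is reused on every call.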