When You Shouldn’t Remove Code Duplication (And How to Refactor the Right Way)

  • Published on 20 Jan 2025

Comments • 91

  • @ArjanCodes
    @ArjanCodes  2 months ago +1

    💡 Learn how to design great software in 7 steps: arjan.codes/designguide.

  • @OzFush
    @OzFush 2 months ago +16

    I’m a big fan of the Rule of Three - you don’t know what the abstraction should look like until you have three examples, so duplicating something once is good, then refactor instead of duplicating it again.

  • @evlezzz
    @evlezzz 2 months ago +23

    One more case where DRY does not apply is accidental duplication. It happens when two independent tasks can, at the moment, be implemented with identical code but have different reasons to change.

    • @ApprendreSansNecessite
      @ApprendreSansNecessite 2 months ago +4

      That's a huge point; the principle of bounded contexts helps here.
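
A minimal sketch of the accidental duplication described in this thread. The function names and the 5% figures are hypothetical and do not come from the video; the point is only that two rules can share a formula today while being owned by different parts of the business.

```python
def shipping_fee(order_total: float) -> float:
    """Shipping fee: currently 5% of the order total (a logistics rule)."""
    return round(order_total * 0.05, 2)


def loyalty_discount(order_total: float) -> float:
    """Loyalty discount: also 5% today, but this is a marketing rule."""
    return round(order_total * 0.05, 2)


# The bodies are identical, yet merging them into one helper would couple
# two rules that are expected to change independently.
fee = shipping_fee(200.0)
discount = loyalty_discount(200.0)
```

If marketing later raises the discount to 7%, only `loyalty_discount` changes, and no shared helper needs a new flag or parameter.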

  • @DBICTS_BV
    @DBICTS_BV 2 days ago

    Your example of the calculate_any_volume() function reminded me of what my old professor used to call "control coupling". It was part of his lesson about "low coupling, high cohesion". I still consider that to be the basic principle of code design.

  • @dalehagglund
    @dalehagglund 2 months ago +1

    Nice video, as always, Arjan. You're certainly right that it can be hard to figure out how to extract the commonality from similar-but-not-quite-identical blocks of code. One technique I sometimes use is to slowly *introduce* code to increase the similarity of the blocks until they're true duplicates, at which point extracting the duplication is straightforward. (I also like the example of trying too hard to eliminate duplication.)

  • @SubwayToSally90
    @SubwayToSally90 2 months ago +6

    At my company we started to avoid "DRY" quite a while ago, because less experienced developers took it to heart and prematurely abstracted EVERYTHING, to the point where it was not obvious whether a function was the right fit for a case or not.
    We now rather follow "WET" -> write everything twice. It's fine to have some code duplication. The cases might be similar but not similar enough to be worth an extraction, and it might not even be worth the time to properly abstract it (yet). Do you come by the same code a third time? Great, now is a good time to abstract it and reuse it a bit.

  • @FilipePauloMax
    @FilipePauloMax 2 months ago +7

    As a software engineer I can argue that the "calculate_any_volume" function is not a good generalisation. It's way harder to read/understand than the simple ones. In that specific example, I'd accept a bit of duplication in exchange for fully understanding what the function is doing.

  • @virtualraider
    @virtualraider 2 months ago +9

    Someone in the Golang community jokingly suggested the WET principle: Write Everything Thrice to understand the patterns before removing duplication. Not a bad idea 😁

    • @ArjanCodes
      @ArjanCodes  2 months ago +1

      I like that! 😊

    • @SubwayToSally90
      @SubwayToSally90 2 months ago +2

      Just saw your comment. I basically wrote the same thing in a different comment, and we do the same (even though we call it "write everything twice") and only abstract the third time it comes around. It makes things easier and helps especially with inexperienced developers who "see the possibility of reuse" of something, immediately abstract everything up two levels, and end up with large functions full of edge cases that never happen. :D
      It's a really good principle. I like WET over DRY every day now.

  • @dankprole7884
    @dankprole7884 2 months ago +4

    I have got myself into trouble removing duplicate code and doing abstractions way too early, before I really knew what they should look like. Being a neat freak has its issues. Code always runs though!

  • @tascsolutions6483
    @tascsolutions6483 2 months ago +1

    Always revisiting your code is key, especially when starting out and learning a language. As I got better with functions, I was able to de-dupe hundreds of lines of code. I found the quickest and easiest deduping related to plotting/matplotlib.

  • @resresres1
    @resresres1 2 months ago +2

    As someone who is not a Python novice, I appreciate the more complex example.

  • @yehoshualevine
    @yehoshualevine 1 month ago +1

    Nice! I hope you'll do another video about partial functions vs DRY principles: the `calculate_any_volume` of this video could be turned into the other volume functions as partials.
    In real-world code, what would be some techniques, advantages and drawbacks to approaching DRY with partial functions? (But calculating a volume is laughably too contrived to illustrate the point!)

  • @FridolinRath
    @FridolinRath 2 months ago

    Hi Arjan, you once mentioned using pathlib more :P. I was reminded of that while watching this video. Maybe there is even more refactoring possible. Greetings from Germany.

  • @shubhambindlish1124
    @shubhambindlish1124 2 months ago +8

    Duplication becomes a problem around deadlines: people generally copy-paste code to make the deadline, and the focus on clean code is really, really low.
    One trick that we use: after every release (generally every 3-4 weeks), the next 3 days are spent just reading the code and spotting these issues. I feel that's a good way to spot where things are complicated or repeated.

    • @radeksmola3422
      @radeksmola3422 2 months ago +2

      Deadlines kill creativity and lower cognitive ability. I do not like deadlines and stress during the development process.

    • @shubhambindlish1124
      @shubhambindlish1124 2 months ago

      @radeksmola3422 I hear you, but without deadlines I don't think I would ever end up shipping a release. There will always be something that I can improve or iterate on. :)

    • @muyou0107
      @muyou0107 2 months ago

      @radeksmola3422 But deadlines are necessary for productivity. That’s how things get done in time.

  • @DougRansom1
    @DougRansom1 2 months ago +2

    So many programmers write the same for loops over and over again, when those loops are already available as map, reduce, and filter.
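
As a small illustration of this point, here is a sketch using Python's built-in `filter` and `map` plus `functools.reduce`. The `dimensions` list and the variable names are made up for the example.

```python
import operator
from functools import reduce

dimensions = [2.0, -1.0, 3.0, 4.0]  # hypothetical input data

# filter: keep only the positive values (replaces a loop with an if).
positive = list(filter(lambda d: d > 0, dimensions))

# map: transform every element (replaces a loop that appends).
doubled = list(map(lambda d: d * 2, positive))

# reduce: fold the values into one result (replaces an accumulator loop).
product = reduce(operator.mul, doubled, 1.0)
```

In Python specifically, list comprehensions and generator expressions cover the `map` and `filter` cases and are often considered more readable.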

  • @deoradh
    @deoradh 2 months ago +1

    Many devs (people in general) have a hard time telling the difference between contract and coincidence. That’s driven too many PR complaints about duplicated code, even across boundaries that need to be maintained.

  • @philadams9254
    @philadams9254 1 month ago

    But now the *validate_dimensions()* call is duplicated many times. Is it not cleaner to validate the inputs before calling the function?

  • @ronmanders
    @ronmanders 2 months ago

    A different take on this: You could also argue that making code too generic often results in a function/module/whatever that simply gets too many responsibilities. Then it either gets too complex, or too abstract, or both. So then it violates the single responsibility principle. So you could say that a function that calculates any shape has multiple responsibilities. And in case of conflicting principles, it becomes a balancing act between them.

  • @giorgioripani8469
    @giorgioripani8469 2 months ago

    Another issue with the last example is the evaluation time of the function, since boolean ops are the worst in terms of big-O timings.

  • @carl2488
    @carl2488 2 months ago

    What is doing the online auto complete here for Arjan?

  • @pragdave
    @pragdave 2 months ago +1

    Just an observation. In the book, I wrote that every piece of _knowledge_ should have a single place of expression. Code is one representation of knowledge, but not the only one. It just happens to be the easiest one to spot :)

  • @rydmerlin
    @rydmerlin 2 months ago

    The video starts but I don’t see the IDE giving you any hints about code duplication. Why is that?

  • @hcubill
    @hcubill 2 months ago

    Great concepts, thanks Arjan

  • @_DRMR_
    @_DRMR_ 2 months ago +2

    And when you end up duplicating code between projects it's time to start your own library ;)

  • @owenlu8921
    @owenlu8921 2 months ago +1

    Thank you so much for this video!

    • @ArjanCodes
      @ArjanCodes  2 months ago

      You’re welcome!

  • @2broke2code
    @2broke2code 2 months ago

    Love your work

    • @ArjanCodes
      @ArjanCodes  2 months ago

      Thank you, glad it’s helpful!

  • @AndreaDalseno
    @AndreaDalseno 2 months ago

    It's a great video, as usual, but the audio has poor quality this time!

  • @flavio4923
    @flavio4923 2 months ago +1

    when you remove duplication, you can make things cleaner, but if you refactor things too much you increase coupling and decrease explainability

  • @alexivanov4157
    @alexivanov4157 2 months ago +15

    Don't use DRY KISS and stay SOLID!

    • @ArjanCodes
      @ArjanCodes  2 months ago +8

      It’s pretty clear that you truly GRASP design principles!

    • @SentinelaCosmica
      @SentinelaCosmica 2 months ago

      remove this flag there..come on

    • @tankmohit
      @tankmohit 2 months ago +1

      I use WET KISS

    • @josmitube
      @josmitube 2 months ago

      You can say that again!

  • @MMarcuzzo
    @MMarcuzzo 2 months ago +2

    Removing duplication and unifying things is great when all those things change together. Otherwise, be careful.

  • @ramimashalfontenla1312
    @ramimashalfontenla1312 2 months ago

    Great video!

    • @ArjanCodes
      @ArjanCodes  2 months ago

      Thanks!

  • @radeksmola3422
    @radeksmola3422 2 months ago

    I like the abstraction for calculating the area, but it is hard to tell from the function call what is happening because the function name is so general. And reading it from the parameters is not easy either.

    • @radeksmola3422
      @radeksmola3422 2 months ago

      Ha! You are explaining it...

    • @DrDeuteron
      @DrDeuteron 2 months ago

      timestamp?

    • @radeksmola3422
      @radeksmola3422 2 months ago

      @DrDeuteron It was the example of bad deduplication. I commented too early.

  • @andrzejostrowski5579
    @andrzejostrowski5579 2 months ago

    Don’t use f-strings with logging! String manipulation takes a lot of resources, and if you use an f-string, the result of that operation is often simply discarded because your logger is not configured or the given log level is not enabled. If you do this in a loop, it really adds up.
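
A minimal sketch of the difference, using only the standard `logging` module. The logger name and the `Expensive` helper class are hypothetical; the class exists only to count how often the value is converted to a string.

```python
import logging

logger = logging.getLogger("dry_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)        # DEBUG messages are disabled


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self) -> None:
        self.str_calls = 0

    def __str__(self) -> str:
        self.str_calls += 1
        return "expensive value"


value = Expensive()

# The f-string is built before logging even checks the level.
logger.debug(f"value={value}")
calls_after_fstring = value.str_calls

# With %-style arguments, formatting is deferred and skipped here
# because DEBUG is not enabled for this logger.
logger.debug("value=%s", value)
calls_after_lazy = value.str_calls
```

The counter increases only for the f-string call. For arguments that are expensive to compute (not just to format), `logger.isEnabledFor(logging.DEBUG)` can guard the whole call.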

  • @kiraleskirales
    @kiraleskirales 2 months ago +2

    You could remove the duplication in your example of calculating the volume / area without losing readability.
    1) Create a function that calculates the volume of a shape given its bounding box dimensions and the ratio of its volume to the volume of the bounding box. You can also insert checks here.
    2) Define the functions to calculate the area of known shapes by calling the function above, or by using functools.partial, and give them intuitive names.
    This also makes it very easy to extend to new shapes.
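
The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the video: `volume_from_bounding_box`, its parameter order, and the three shape functions are hypothetical names.

```python
import math
from functools import partial


def volume_from_bounding_box(
    fill_ratio: float, width: float, height: float, depth: float
) -> float:
    """Return fill_ratio times the volume of the bounding box."""
    if any(d < 0 for d in (width, height, depth)):
        raise ValueError("Dimensions must be non-negative")
    return fill_ratio * width * height * depth


# A box fills its bounding box completely.
box_volume = partial(volume_from_bounding_box, 1.0)

# A cylinder inscribed in its bounding box fills pi/4 of it:
# (pi * r^2 * h) / (2r * 2r * h) = pi / 4.
cylinder_volume = partial(volume_from_bounding_box, math.pi / 4)

# An ellipsoid fills pi/6 of its bounding box:
# (4/3 * pi * a * b * c) / (2a * 2b * 2c) = pi / 6.
ellipsoid_volume = partial(volume_from_bounding_box, math.pi / 6)
```

Callers then see an intention-revealing name such as `cylinder_volume(2, 2, 5)`, and adding a new shape only requires knowing its fill ratio.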

  • @Cleanblue1990
    @Cleanblue1990 2 months ago +1

    I think the validate_dimensions function is too unclear. If it crossed my way, I wouldn't know that it raises an exception. I would have read the explicit implementation quicker. In part, this might be solved with a different name like 'raise_if_any_negative'.
    And that suggests the shorter implementation
    if any(x > 0 for x in dimensions):
        raise ValueError(...)

    • @imyafamiliya2155
      @imyafamiliya2155 2 months ago

      Less than zero, I think. Also, a decorator would fit better here IMO.
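
A possible sketch of the decorator idea from this thread, with the comparison corrected to "less than zero". The names `require_non_negative` and `box_volume` are hypothetical and not taken from the video.

```python
import functools
from typing import Callable


def require_non_negative(func: Callable[..., float]) -> Callable[..., float]:
    """Raise ValueError if any positional argument is negative."""

    @functools.wraps(func)
    def wrapper(*dimensions: float) -> float:
        if any(x < 0 for x in dimensions):
            raise ValueError(f"Dimensions must be non-negative, got {dimensions}")
        return func(*dimensions)

    return wrapper


@require_non_negative
def box_volume(length: float, width: float, height: float) -> float:
    """Volume of a rectangular box."""
    return length * width * height
```

The validation is then declared once per function instead of being called inside every body, at the cost of hiding the check behind a decorator, which is the same readability trade-off raised in the parent comment.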

  • @jonathanpiaget5195
    @jonathanpiaget5195 2 months ago

    When I started web development with Django, I would often shoot myself in the foot by trying to create mixins to avoid duplication in views or forms. I always regretted it later 😂

  • @HanWechgelaer
    @HanWechgelaer 2 months ago +1

    I definitely copy multiple instances of the same code in my projects (copy, paste, alter), which results in 'quick and dirty' solutions and of course bugs. After that I look at my code from a 'what is it doing exactly' perspective and then optimize it by removing duplicate code.

  • @Henry-sv3wv
    @Henry-sv3wv 2 months ago

    Someone said: Abstraction is DRY, the "compression of code".

  • @marcotroster8247
    @marcotroster8247 2 months ago

    I think DRY is even more important for test code, to avoid duplication in the test setup logic. You should show that in another episode.
    For production code, IMO a vertical slice architecture with proper BDD tests is fine. Some duplication buys the ability to throw the feature away without any coupling to consider. In big systems, coupling is the true pain point anyway.

  • @johntamplin
    @johntamplin 2 months ago

    DRY, like many rules, should be subject to the quote from Douglas Bader "Rules are for the obedience of fools and the guidance of wise men."
    Unfortunately, everyone thinks they are wise.

  • @dwhall256
    @dwhall256 2 months ago

    I most commonly encounter code duplication between large functions. The two functions do something different, but have a common set of steps between them. And there's usually one small but difficult thing that prevents easy de-duplication. For this reason, don't let a function's size get out of hand during its creation. Once it reaches a screen height, time to factor something out. Smaller functions are easier to re-use than larger ones, preventing future duplicate code from appearing.

    • @yojou3695
      @yojou3695 2 months ago

      If the functions do something different, they should be kept separate. And maybe you can abstract the common steps into a different thing.
      People are way too scared of long code; having it all in the same place is okay as long as it is understandable.

  • @Fasyle
    @Fasyle 2 months ago

    The constant camera zooming is awful. I know someone is trying to draw me in or something, but it's waaaay overused.

  • @QueirozVini
    @QueirozVini 2 months ago

    So, DRY, unless it breaks SRP.

  • @BrianStDenis-pj1tq
    @BrianStDenis-pj1tq 2 months ago +1

    The example of refactoring the calculations of area was interesting. Not only did it make the code more complicated, you inserted a for loop which makes it much slower. If you think multiplying a few things together needs to be refactored, you are the type that may overuse the DRY principle. Let me say that again...

  • @SentinelaCosmica
    @SentinelaCosmica 2 months ago +1

    I am learning multi-agent AI to develop an earlier sepsis detection system using a Bayesian network. I have a good understanding of databases, distributed systems and network communication. What are my odds?

  • @jasonpease7831
    @jasonpease7831 2 months ago

    I'm stealing DAMP! 😀

    • @ArjanCodes
      @ArjanCodes  2 months ago

      Go right ahead 😁

  • @patrykforyszewski4655
    @patrykforyszewski4655 2 months ago

    DRR. Don't Repeat Repository ;)

  • @juanpyusun
    @juanpyusun 2 months ago

    🎉

  • @shawnedwards5369
    @shawnedwards5369 2 months ago

    "...not what you see in production code." Wanna bet?

    • @HansBezemer
      @HansBezemer 2 months ago +1

      True - "get it working and get it out" is often more important than optimizing the code. *Tip:* overestimate your maintenance jobs. It buys you some time to reevaluate your choices. And never violate the basic architecture. It'll save you a lot of time later on.

    • @shawnedwards5369
      @shawnedwards5369 2 months ago

      @HansBezemer During sprint planning, and _especially_ when assigning points, I make it clear that I'll be doing testing and cleanup as part of the ticket.
      An extra hour of thoughtful refactoring and cleanup can (and likely will) save you days of debugging and fixing.

  • @sami3592
    @sami3592 2 months ago

    I like the joke. 😊

  • @asagiai4965
    @asagiai4965 2 months ago +1

    There is a reason it is called DRY not DRC
    So there are exceptions to the guide.
    But if you find yourself repeating a lot.
    Either
    A.) There is no solution for that language.
    Or
    B.) The current code is much better than the solution.

    • @efovex
      @efovex 2 months ago +1

      Also because that would be the Democratic Republic of the Congo

    • @asagiai4965
      @asagiai4965 2 months ago

      @efovex haha true

    • @HansBezemer
      @HansBezemer 2 months ago

      If your "generalization" needs too many parameters - or if it requires suspect constructions like lots of references (instead of values), reevaluate your choices.

    • @asagiai4965
      @asagiai4965 2 months ago

      @HansBezemer Just enough parameters. And who uses too many references in a construction? I guess people who want problems.

  • @JeanMarieGalliot
    @JeanMarieGalliot 2 months ago +2

    The only problem is that your examples are too complex, often involving video processing and the kind of stuff that not everyone is familiar with. That makes it more difficult to grasp the design concept you want to put forward. I know it is certainly more difficult to craft simpler examples, but it would be more understandable for me, at least. I hope you will take this as a constructive comment, as I recognize and appreciate your expertise.

    • @radeksmola3422
      @radeksmola3422 2 months ago +5

      On the contrary, I appreciate more complex examples over easy ones for beginners.

    • @paulschockner1445
      @paulschockner1445 2 months ago +1

      I agree. Sometimes the details of these more complex examples do complicate the understanding of the concepts. Maybe the solution should be alternating between a simpler, straightforward example and a more realistic example.

    • @JeanMarieGalliot
      @JeanMarieGalliot 2 months ago

      @radeksmola3422 Not being familiar with video processing doesn't imply being a beginner in programming...

    • @radeksmola3422
      @radeksmola3422 2 months ago

      @JeanMarieGalliot I understand, but the example was processing some text files for subtitles.

    • @manuelstausberg8923
      @manuelstausberg8923 2 months ago +2

      I also really appreciate the slightly more complex example here, as it highlights the different kinds of duplication (especially the not immediately obvious duplication).
      IMO you don't really have to know anything about video processing to understand that the duplication is found in the way that files / directories are handled :)

  • @chrisw1462
    @chrisw1462 2 months ago +1

    Complex code using libraries not everyone would use plus the scrolling jerking around... not very helpful for someone trying to learn to code.

    • @efovex
      @efovex 2 months ago

      This is a video about the pitfalls of the "DRY" mantra in complicated real-life scenarios. If you're a beginner, you can watch beginner level stuff...

  • @DrDeuteron
    @DrDeuteron 2 months ago +1

    WET: Write Everything Twice. It can become
    WETTT: Write Everything Ten Thousand Times.
    Rule of Three notwithstanding.
