Why isn't Postgres using my index? | Postgres.FM 085

  • Published on Jan 28, 2025

Comments • 12

  • @kirkwolak6735
    @kirkwolak6735 11 months ago +2

    We noticed some of our most important queries in PG16 had improved performance!

  • @ilyaportnov181
    @ilyaportnov181 11 months ago +2

    A couple of reasons that I think deserve mentioning in this context:
    * There are different operator families, and they support different operators. For example, a default btree index on a text field does not support the `like` operator (unless the database uses the C locale); you have to use either `collate "C"` or `text_pattern_ops`. On the other hand, `text_pattern_ops` does not support inequality comparisons (`<`, `<=`, `>`, `>=`) or sorting.
    * There are different collations. If your index is, for example, `collate "C"`, but you compare for equality using the default collation, the index will not be used; you have to specify the collation explicitly in the query or rebuild the index with another collation.
    * And, talking about selectivity / cardinality, there is a harder class of problems, where PG cannot calculate cardinality correctly because of several joins. It can calculate the cardinality of a join result when you join two tables, but when you then join that result with a third table, it will probably not be able to calculate the cardinality correctly. In this case I don't know a simple way to fix the problem, apart from rewriting one query into several, using materialized views, or something like that. Because of the cardinality miscalculation, PG can choose a totally wrong join order, and because of that it will not use the index... You can try to force the join order by using CTEs with the `materialized` keyword to create an optimization barrier. Or even switch to Max Boguk's hardcore techniques with recursive CTEs :)
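
On the first point above, here is a rough Python sketch of why a prefix `like` needs byte-wise ordering: with keys sorted by code point (similar in spirit to `collate "C"`), every string that starts with a prefix sits in one contiguous range, so a sorted structure can answer the match with two lookups. The helper names `prefix_upper_bound` and `prefix_range_scan` are made up for illustration, and this is only an analogy, not how Postgres implements it.

```python
import bisect


def prefix_upper_bound(prefix: str) -> str:
    """Return a string that sorts after every string starting with `prefix`.

    Assumes code-point ordering and that the last character is not the
    maximum code point (good enough for a sketch).
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def prefix_range_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Find all keys starting with `prefix` using two binary searches."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix_upper_bound(prefix))
    return sorted_keys[lo:hi]


# Python compares strings by code point, so sorted() gives a byte-wise-like order.
keys = sorted(["abd", "abc", "Abc", "abcd", "abc1", "ab", "b"])
matches = prefix_range_scan(keys, "abc")
print(matches)
```

Under a linguistic collation the sort order may ignore or reweight case and punctuation, so the "prefix equals contiguous range" property is not guaranteed, which is roughly why the default operator class cannot serve `like` there.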

  • @agarbanzo360
    @agarbanzo360 11 months ago +1

    Why doesn’t Postgres do some detection of the disk type and do basic, deterministic self tuning?

    • @NikolaySamokhvalov
      @NikolaySamokhvalov 11 months ago

      Good question. It doesn't even know how many CPU cores and GiB of RAM are available. I think there is potential for a tuning module to be developed. TimescaleDB has one, for example (and much of what it does can be applied to non-Timescale setups).

    • @agarbanzo360
      @agarbanzo360 11 months ago

      Didn’t even think of that. Something super simple like work_mem = parameter * memory available, etc. would be a huge improvement.
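
The "parameter * memory available" idea could look something like the sketch below. The function name `suggest_settings` and the `work_mem_fraction` knob are hypothetical; the 25% figure for `shared_buffers` is the commonly cited starting point, and dividing the `work_mem` budget by `max_connections` is just one conservative assumption, not an official formula.

```python
def suggest_settings(total_ram_mib: int, max_connections: int,
                     work_mem_fraction: float = 0.25) -> dict[str, str]:
    """Derive rough starting values from the amount of RAM (in MiB).

    All ratios here are illustrative assumptions, not Postgres defaults.
    """
    # Common rule of thumb: about a quarter of RAM for shared_buffers.
    shared_buffers_mib = total_ram_mib // 4

    # Split a fraction of RAM across all possible connections,
    # never going below 4 MiB (the stock work_mem default).
    work_mem_mib = max(4, int(total_ram_mib * work_mem_fraction / max_connections))

    return {
        "shared_buffers": f"{shared_buffers_mib}MB",
        "work_mem": f"{work_mem_mib}MB",
    }


settings = suggest_settings(total_ram_mib=16384, max_connections=100)
print(settings)
```

A real tuner would also have to account for parallel workers and for queries that use several `work_mem` allocations at once, which this sketch ignores.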

  • @marcinbadtke
    @marcinbadtke 11 months ago +1

    Thank you for the conversation.
    It is hard for me to imagine that disk performance is the main reason the database engine chooses whether to use an index. As far as I know, during a sequential scan many database blocks are read in one I/O operation. A random read, on the other hand, reads only one database block.
    In my opinion, whether an index is used is decided primarily by statistics. An index is not used when the cost calculation based on statistics shows that skipping it is cheaper, e.g. when the statistics show that the query retrieves so much data that a sequential scan costs less.
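
To make the trade-off concrete, here is a deliberately simplified cost comparison in Python. It uses the stock values of `seq_page_cost` (1.0), `random_page_cost` (4.0), `cpu_tuple_cost` (0.01) and `cpu_index_tuple_cost` (0.005), but the formulas are a toy approximation (one random heap page per matching row, no caching, no index page costs), not the planner's real model. The function names are invented for this example.

```python
def seq_scan_cost(pages: int, rows: int,
                  seq_page_cost: float = 1.0,
                  cpu_tuple_cost: float = 0.01) -> float:
    """Read every page sequentially and process every row."""
    return pages * seq_page_cost + rows * cpu_tuple_cost


def index_scan_cost(pages: int, rows: int, selectivity: float,
                    random_page_cost: float = 4.0,
                    cpu_tuple_cost: float = 0.01,
                    cpu_index_tuple_cost: float = 0.005) -> float:
    """Pessimistic toy model: one random heap page fetch per matching row."""
    matching_rows = rows * selectivity
    heap_pages = min(matching_rows, pages)
    return (heap_pages * random_page_cost
            + matching_rows * (cpu_index_tuple_cost + cpu_tuple_cost))


def cheaper_plan(pages: int, rows: int, selectivity: float,
                 random_page_cost: float = 4.0) -> str:
    """Pick whichever of the two toy estimates is lower."""
    seq = seq_scan_cost(pages, rows)
    idx = index_scan_cost(pages, rows, selectivity, random_page_cost)
    return "index scan" if idx < seq else "seq scan"


# Same table (10,000 pages, 1,000,000 rows), three scenarios.
plan_default = cheaper_plan(10_000, 1_000_000, selectivity=0.01)
plan_ssd = cheaper_plan(10_000, 1_000_000, selectivity=0.01, random_page_cost=1.1)
plan_selective = cheaper_plan(10_000, 1_000_000, selectivity=0.001)
print(plan_default, plan_ssd, plan_selective, sep=" | ")
```

In this toy model both inputs matter: the statistics-driven selectivity estimate flips the decision, and so does the assumed cost ratio between random and sequential page reads.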

    • @PostgresTV
      @PostgresTV 11 months ago +1

      Thanks. Good question. I know, it might be counter-intuitive, but it is as it is - I see it quite often (ofc, not for trivial single-row PK lookups). Detailed answer: twitter.com/samokhvalov/status/1761082969001972050 // Nikolay

    • @marcinbadtke
      @marcinbadtke 11 months ago

      @PostgresTV thank you

  • @mbanck
    @mbanck 11 months ago +1

    Current versions of hypopg can also mask an index, for the "would it pick my index if the other one did not exist?" question.

  • @wstrzalka
    @wstrzalka 11 months ago +2

    It's still 4 on RDS. And when I raised it with their support, the answer was that it's not related to hardware and I should set it myself to whatever I want :)

    • @PostgresTV
      @PostgresTV 11 months ago

      🤷