Interpreting hydropathy plots and hydropathy indexes

  • Published on 20 Sep 2024
  • Don’t be scared off by a hydropathy plot! It can help you predict membrane-spanning regions of proteins and more - I kid you not!
    blog: bit.ly/hydropa...
    I talk about hydrophobicity a lot in other posts so am not going to go into the why behind it here. So if you need to review I will (virtually) wait: bit.ly/hydropho... & • the hydrophobic effect  
    But for now just know that we can classify molecules as hydrophobic (water-avoiding/excluded) and hydrophilic (water-loving). Well, it’s not actually an either/or - it’s a scale - so we can give molecules a score!
    This “score” of hydrophobicity is called the hydropathy index. Negative numbers correspond to hydrophilic molecules (and the more negative the more so) and positive numbers correspond to hydrophobic molecules (and the more positive the more so). To understand why, it helps to know where the heck they come from and what they represent.
    Basically, they’re experimentally-determined (i.e. empirical) measurements of how "happy" or "mad" the molecules were when researchers moved them from a greasy environment to a watery one. Often the values we are using for amino acids come from this thing called the Kyte-Doolittle scale which itself comes from multiple experiments. I talk a lot about the paper behind the scale in the video & have a link to it at the end of this post.
    How do you quantify whether a molecule is “happy” or “mad”? You look to the Gibbs free energy, G, which is a sort of measure of molecular comfort. The more comfy a molecule, the lower the free energy. If you take a comfortable molecule and make it less comfortable, you increase G, so the change in G (∆G) will be positive. So positive means you made it mad. Conversely, if you take an uncomfortable (high energy) molecule and make it more comfortable, you lower G, so your ∆G is negative. So negative means you made it happy.
    Applying this to our hydropathy index, if the number is negative, it's like a ball falling down a hill (favorable) and we say the molecule is hydrophilic. The larger the negative number, the more hydrophilic it is.
    If the number is positive, it's like having to push a ball up a hill (unfavorable) and we say the molecule is hydrophobic. The larger the positive number, the more hydrophobic it is.
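In symbols, the sign convention above can be sketched like this. It assumes the framing used here (moving a molecule from a greasy environment into a watery one); the subscript labels are mine, not from the Kyte-Doolittle paper:

```latex
\Delta G_{\text{transfer}} = G_{\text{watery}} - G_{\text{greasy}}
\qquad
\begin{cases}
\Delta G_{\text{transfer}} < 0 & \text{favorable (ball rolls downhill): hydrophilic} \\
\Delta G_{\text{transfer}} > 0 & \text{unfavorable (ball pushed uphill): hydrophobic}
\end{cases}
```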
    Amino acids with a negative hydropathy index include amino acids that are charged (such as arginine and glutamate) and/or polar (such as glutamine and asparagine), and they are likely found in watery environments (cytosol, extracellular environment, etc.) on the outside of water-soluble proteins - we commonly refer to these regions as “solvent exposed” - though note that there can also be solvent-exposed regions inside of proteins because of things like solvent tunnels. Proteins are hardly solid blobs!
    Amino acids with a positive hydropathy index, on the other hand, include “nonpolar” amino acids (like isoleucine, phenylalanine, leucine, valine, etc.) and are often found in membranes or on the inside of water-soluble proteins, where they are shielded from water (and the water doesn’t have to hang out with them).
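To make the positive/negative split concrete, here is a minimal Python sketch using the published Kyte-Doolittle values (one-letter amino acid codes). The names `KYTE_DOOLITTLE` and `classify` are just illustrative choices, and splitting on the sign alone is a simplification, since residues near zero (like glycine) are not strongly either.

```python
# Kyte-Doolittle hydropathy indexes (Kyte & Doolittle, 1982), one-letter codes.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def classify(residue: str) -> str:
    """Label a residue by the sign of its hydropathy index (a simplification)."""
    score = KYTE_DOOLITTLE[residue.upper()]
    return "hydrophobic" if score > 0 else "hydrophilic"


# The examples named in the text: Arg, Glu, Gln, Asn vs. Ile, Phe, Leu, Val.
for aa in "REQNIFLV":
    print(aa, KYTE_DOOLITTLE[aa], classify(aa))
```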
    But that’s all for individual amino acids in isolation. And that’s not what you get after translation!
    In the process of translation, individual amino acids are linked together into chains that then fold up to make pretty proteins where the hydrophobic parts are sheltered away from water and the hydrophilic bits are hanging out with it. Since amino acids in chains can’t just decide on their own where to go, we need to consider the context and whether there are sufficient similarly-inclined amino acid residues around them that they can go as a team to a watery or greasy environment. So when we look at the protein level, we turn to hydropathy plots.
    Hydropathy plots graph the hydropathy index over a protein's length to help predict transmembrane domains & surface regions of proteins. Rather than just plot sequentially the indexes of individual amino acids, which would be super noisy and not very helpful, they use a sliding window to average the hydropathies of the individual amino acids in the neighborhood. The size of the window you use depends on what you’re trying to predict.
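The sliding-window averaging can be sketched in a few lines of Python. This is a bare-bones illustration, not how ProtScale or any particular tool implements it: the function name `sliding_average` is made up, it takes per-residue indexes you have already looked up, and it simply skips the positions near the ends where a full window doesn't fit.

```python
def sliding_average(values, window):
    """Average `values` over a centered sliding window of odd size `window`.

    Returns (position, average) pairs with 1-based positions, only for
    positions where the full window fits inside the sequence.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    profile = []
    for center in range(half, len(values) - half):
        chunk = values[center - half : center + half + 1]
        profile.append((center + 1, sum(chunk) / window))
    return profile


# Made-up per-residue indexes that flip sign at every position (very noisy).
raw = [4.5, -3.5, 3.8, -3.9, 4.2, -4.5, 2.8]
for position, average in sliding_average(raw, 3):
    print(position, round(average, 2))
```

Plotting the second element of each pair against the first gives the hydropathy plot for that window size.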
    Since it takes ~20 amino acids to pass through (traverse) a membrane in an alpha helix form, scientists commonly use a large (19 or 21*) amino acid sliding window size to predict transmembrane segments, which show up as broad positive peaks. This can also be used as a sort of test as to whether a protein is likely an integral membrane protein based on its sequence alone, and was how scientists were able to estimate that about 1/4 of our protein-coding genes (genes with instructions for making proteins) are recipes for integral membrane proteins¹.
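Here is a rough Python sketch of how you might pull candidate transmembrane stretches out of an already-averaged profile by looking for runs above a cutoff. The function name `candidate_segments` and the toy numbers are hypothetical, and the default cutoff of 1.6 is an assumption (a value often quoted for a 19-residue Kyte-Doolittle window), so treat it as a knob to tune rather than a rule.

```python
def candidate_segments(profile, threshold=1.6):
    """Find runs where the windowed average stays above `threshold`.

    `profile` is a list of (position, average) pairs in sequence order.
    Returns (first_position, last_position) pairs, one per run. Positions
    are window centers, so they only approximate segment boundaries.
    """
    segments = []
    start = last = None
    for position, average in profile:
        if average > threshold:
            if start is None:
                start = position
            last = position
        elif start is not None:
            segments.append((start, last))
            start = None
    if start is not None:  # a run that continues to the end of the profile
        segments.append((start, last))
    return segments


# Toy profile (made-up numbers, not from a real protein).
toy_profile = [(10, 0.2), (11, 1.9), (12, 2.4), (13, 2.0),
               (14, 0.5), (15, 1.7), (16, 1.8)]
print(candidate_segments(toy_profile))  # [(11, 13), (15, 16)]
```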
    finished in comments

Comments • 8

  • @thebumblingbiochemist
    @thebumblingbiochemist  a year ago +1

    *you use odd numbers so that you have an equal number of residues on either side of the amino acid residue** you’re talking about. So, for example, if your window size is 21, when you look at the residue labeled “80” on the x-axis, the value you see plotted is the average hydropathy index of a window encompassing the 80th residue in the sequence and 10 amino acids on either side of it. A window size of 19 would be 9 on either side, etc.
    **since amino acids lose their acid part when they join together through peptide bonds, we can’t technically refer to them as amino acids anymore. Instead, we call them “amino acid residues” (referring to the residual stuff that’s left over) if we want to be precise
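The window arithmetic from the first footnote, as a tiny Python check (the helper name `window_bounds` is made up for illustration):

```python
def window_bounds(center, window):
    """Return the first and last residue numbers covered by an odd-sized
    window centered on residue number `center`."""
    if window % 2 == 0:
        raise ValueError("window must be odd so both sides are equal")
    half = (window - 1) // 2
    return center - half, center + half


print(window_bounds(80, 21))  # (70, 90): residue 80 plus 10 on either side
print(window_bounds(80, 19))  # (71, 89): residue 80 plus 9 on either side
```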
    Beware that if the solvent-exposed linker between transmembrane segments is short, the averaging can lead to the transmembrane segments blending together in the hydropathy plot, such as in the bacteriorhodopsin example. This protein has 7 transmembrane helices (a common occurrence), but the last 2 are hard to tell apart in the plot.
    As shown in the Kyte-Doolittle paper (Fig. 5, bottom panel) and pointed out in my video, these plots can also show parts of some proteins, often located near one of the termini, that anchor proteins to membranes.
    As long as the feature contains enough hydrophobic amino acids in a row, it can be found with a large window - which works well for membrane-associated regions but not as well for trying to figure out what regions are on the inside or outside of your typical water-soluble protein…
    In that case, because proteins are super fold-y, regions that are close together in sequence could be facing very different environments - for example, one side of an alpha helix might be solvent-exposed and therefore contain hydrophilic amino acids, whereas the other side of the alpha helix could be shielded inside the protein and thus “need” to have hydrophobic ones. So you’d see hydrophobic & hydrophilic amino acids interspersed in the primary structure (sequence of amino acids). And that large window would average them all out. Rather than just smoothing out the noise, we’re erasing our signal! So we need to use a smaller window for our averaging. Often, a sliding window size of ~7 amino acids is chosen to predict regions likely to be on the surface of (water-soluble) proteins.
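A toy Python example of a large window erasing a signal that a small one keeps. The sequence is invented (a short lysine stretch flanked by leucines, not a real protein), and `window_average` is a hypothetical helper; only the two Kyte-Doolittle values it needs are included.

```python
# Kyte-Doolittle values for leucine and lysine only (all this toy needs).
HYDROPATHY = {"L": 3.8, "K": -3.9}


def window_average(sequence, center, window):
    """Mean hydropathy of the odd-sized window centered at 0-based `center`."""
    half = window // 2
    chunk = sequence[center - half : center + half + 1]
    return sum(HYDROPATHY[aa] for aa in chunk) / len(chunk)


# Invented sequence: 10 Leu, then a 7-residue Lys stretch, then 10 Leu.
toy = "L" * 10 + "K" * 7 + "L" * 10
middle = 13  # 0-based index of the central lysine

print(round(window_average(toy, middle, 7), 2))   # -3.9  (clearly hydrophilic)
print(round(window_average(toy, middle, 21), 2))  # 1.23  (looks hydrophobic)
```

With the 7-residue window, the lysine stretch shows up as a deep negative dip; with the 21-residue window, the same position reads as mildly hydrophobic because the flanking leucines dominate the average.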
    In practice, it can be helpful to plot with different sliding window sizes to try to maximize the signal/noise ratio. How, you might ask? There’s free software that lets you create & customize hydropathy plots for any protein - either from the protein’s unique UniProt accession number (a sort of social security number for proteins) or based on its sequence alone. There are actually multiple tools, but I like Expasy ProtScale (if the name Expasy sounds familiar, they have a ton of free tools, including ProtParam which I’ve discussed in the past - that one calculates isoelectric point (pI), extinction coefficient, and more). As I demo in the video, you can get directly to the ProtScale (and also ProtParam) page for a protein from that protein’s UniProt page - if you click on “Tools” under the sequence heading.
    Although Kyte-Doolittle is the classic hydropathy scale that’s usually used, there are other hydropathy scales that do things like take the residue context into account (as opposed to simply considering values based on free-floating amino acids). Here’s a link to one such strategy from the Stephen White lab at UC Irvine, which has a tool called Membrane Protein Explorer (MPEx): blanco.biomol.uci.edu/hydrophobicity_scales.html
    I encourage you to play around with hydropathy plots for your favorite proteins. But know that seeing a positive peak doesn’t necessarily confirm where a region is located. For more evidence, you need experimental data, such as x-ray or cryo-EM structures. Or, if those aren’t available, some sort of enzymatic test, such as checking whether a part of the protein suspected to be embedded in a membrane is protected from protease cleavage.
    If there are structures available, you can find and view them in the PDB (Protein Data Bank), where you can also see a hydropathy plot for them - as I show in the video, you can cross-reference between the plot peaks and the structure to see how well they correlate.
    I’ve left you with lots to think about and try out, so now I will leave you with lots of links as well! As always, hope this post helped!
    Here’s that classic Kyte-Doolittle paper: Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. Journal of molecular biology, 157(1), 105-132. doi.org/10.1016/0022-2836(82)90515-0
    Here are PDB entries for a couple of the structures referenced in that paper:
    Lactate dehydrogenase (LDH), PDB 6ldh: www.rcsb.org/structure/6ldh
    * from Abad-Zapatero, C., Griffith, J. P., Sussman, J. L., & Rossmann, M. G. (1987). Refined crystal structure of dogfish M4 apo-lactate dehydrogenase. Journal of molecular biology, 198(3), 445-467. doi.org/10.1016/0022-2836(87)90293-2
    * Corresponding UniProt entry: P00341, LDHA_SQUAC www.uniprot.org/uniprotkb/P00341/entry
    Bovine chymotrypsinogen, PDB 1chg: www.rcsb.org/structure/1chg
    * from: Freer, S. T., Kraut, J., Robertus, J. D., Wright, H. T., & Xuong, N. H. (1970). Chymotrypsinogen: 2.5-angstrom crystal structure, comparison with alpha-chymotrypsin, and implications for zymogen activation. Biochemistry, 9(9), 1997-2009. doi.org/10.1021/bi00811a022
    * Corresponding UniProt entry, P00766, CTRA_BOVIN: www.uniprot.org/uniprotkb/P00766/entry
    And here’s one for a membrane protein I talk about in the video, bacteriorhodopsin, PDB 1fbb: www.rcsb.org/structure/1fbb
    * from: Subramaniam, S., & Henderson, R. (2000). Molecular mechanism of vectorial proton translocation by bacteriorhodopsin. Nature, 406, 653-657. doi.org/10.1038/35020614
    * Corresponding UniProt entry, P02945, BACR_HALSA www.uniprot.org/uniprotkb/P02945/entry
    Direct link to Expasy ProtScale which allows you to make hydropathy plots (as well as a bunch of other types of plots): web.expasy.org/protscale/
    reference for membrane protein stat: ¹Arinaminpathy, Y., Khurana, E., Engelman, D. M., & Gerstein, M. B. (2009). Computational analysis of membrane proteins: the largest class of drug targets. Drug discovery today, 14(23-24), 1130-1135. doi.org/10.1016/j.drudis.2009.08.006  
    Here are links to some relevant posts & videos of mine for background and/or further information:
    * more about the PDB and structures: blog: bit.ly/pdbstructures ; full video: th-cam.com/video/NgXwP7gGPyA/w-d-xo.html short video: th-cam.com/video/1uKC08Z_lYQ/w-d-xo.html
    * more on UniProt, ProtParam, etc. blog: bit.ly/uniprotprotparam ; YouTube: th-cam.com/video/6oBsTykEeGI/w-d-xo.html
    * more about hydrophobicity and the hydrophobic effect: bit.ly/hydrophobiceffectPSA & th-cam.com/video/CJWEWrwUXI4/w-d-xo.html  
    * more about membrane proteins: Blog: bit.ly/membraneproteinbiochemistry ; YouTube: th-cam.com/video/uoXu1EF6atc/w-d-xo.html
    * more about amino acids and proteins: bit.ly/aminoacidsposts & th-cam.com/play/PLUWsCDtjESrFQoCEsEmZX6NxnwlHzjHZ6.html
        
    more about all sorts of things: #365DaysOfScience All (with topics listed) 👉 bit.ly/2OllAB0 or search blog: thebumblingbiochemist.com

  • @nargeszare1109
    @nargeszare1109 a year ago +1

    The most comprehensive explanation I've ever found on this subject. Thank you!

  • @staycold3130
    @staycold3130 several months ago

    How do you determine if the Amino or carboxyl is inside or outside the cell based on the hydropathy index?

  • @kwameokrah7662
    @kwameokrah7662 a year ago +1

    Thank you so much!

  • @brightafterrain9115
    @brightafterrain9115 6 months ago

    thank you. you would make an amazing prof