How to read and normalize microarray data in R - RMA normalization | Bioinformatics 101

  • Published on 6 Jan 2025

Comments •

  • @saswatsatapathy658
    @saswatsatapathy658 2 years ago +2

    Oh my god! I was breaking my head on the internet to extract these tar files. You made it so simple. Thank you life-saver !!

  • @marilyngomes6684
    @marilyngomes6684 5 months ago +1

    I LOVE YOU!!!! I've been struggling with this for a few days and your video is such a lifesaver!

  • @maert95
    @maert95 6 months ago

    This is so useful, literally saving whole days of googling and trial and error. Thank you so much!

  • @divz2646
    @divz2646 1 year ago +1

    Dude, you have no idea how helpful you are!! I want to do a PhD, and before that I wanted to learn these bioinformatics and coding skills so that they will help me in the future. I did all the normal Python and R classes but never understood the real-life use of these in research; today I got some clarity at least. SO THANKSSS. Also, can you make more such videos and tutorials from a food technology perspective? I would be grateful.

  • @ProfessorSanoar
    @ProfessorSanoar 10 months ago

    This is very helpful. Please upload more videos about .CEL file normalization and integration of multiple datasets.

  • @harshulkapoor6672
    @harshulkapoor6672 10 months ago +1

    Can you please make a demo tutorial on using limma for differential expression of microarray data?

  • @enriquep4857
    @enriquep4857 2 years ago +1

    Nice video! Could you please make one doing the next steps for Microarray analysis? (I mean, filtering data, DE analysis, VolcanoPlot and so on...). Thank you so much. New follower here!

    • @Bioinformagician
      @Bioinformagician 2 years ago +2

      Thank you for the suggestion. I shall surely plan on making a video on it soon :)

  • @Hiro_Kobayakawa
    @Hiro_Kobayakawa 4 months ago

    Thank you, great video!
    Could you please help me with an issue I'm having? I'm trying to annotate (map genes for) a custom microarray, but with the raw data. That is, I want to make a table with gene names in rows and samples (CEL files) in columns. Is there a way to do this annotation without using the normalization functions (like rma or mas5)? Thank you for your attention!

  • @DoctorLoganPhD
    @DoctorLoganPhD 5 months ago

    Thank you for this video. How can I perform this kind of analysis on several datasets at once, to boost the number of samples being analyzed? For example, in this video, there were 4 samples in the breast cancer dataset. How would you add more breast cancer datasets to get a larger sample size and a stronger P value? Thank you!

  • @Bricks874
    @Bricks874 2 years ago +1

    it was so clear and you are amazing) thank you!

  • @hunny1215
    @hunny1215 9 months ago

    Can we run this R script on multiple GEO datasets at once, or do we have to process them individually? (All the datasets are Affymetrix.)

  • @user-calexe
    @user-calexe 2 years ago +3

    Thank you for this video! May I ask if you were able to account for probes that target similar genes? I noticed in my analysis that there are genes that appeared more than once which could mean that I still need to probably summarize them (i.e., means). Do you perhaps know how to do that?

    • @Bioinformagician
      @Bioinformagician 2 years ago +1

      Usually I group by gene, compute the row sum for each probe, and keep the row with the maximum row sum.

    • @user-calexe
      @user-calexe 2 years ago

      @Bioinformagician How exactly do you do this? I'm sorry, I'm kinda new to R.
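A minimal sketch of the "keep the probe with the largest row sum" idea from the reply above, written in base R on a toy table. The column names `ID`, `SYMBOL`, `sample1` and `sample2` are hypothetical; in a real annotated table the gene column may be named differently (for example `Gene Symbol`).

```r
# Toy annotated expression table: one row per probe (hypothetical values).
annotated <- data.frame(
  ID      = c("p1", "p2", "p3", "p4"),
  SYMBOL  = c("TP53", "TP53", "BRCA1", "EGFR"),
  sample1 = c(5.1, 7.3, 6.0, 8.2),
  sample2 = c(5.4, 7.0, 6.2, 8.0)
)
sample_cols <- c("sample1", "sample2")

# Sum the expression values across samples for every probe.
annotated$row_sum <- rowSums(annotated[, sample_cols])

# Sort probes from highest to lowest row sum, then keep the first
# (i.e. highest) probe for each gene symbol.
ordered   <- annotated[order(annotated$row_sum, decreasing = TRUE), ]
collapsed <- ordered[!duplicated(ordered$SYMBOL), ]

print(collapsed)
```

Taking the per-gene mean instead (as suggested in the question) is an equally common choice; which one is appropriate depends on the analysis.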

  • @chriskuo
    @chriskuo 1 year ago

    @bioinformagician thank you so much for these wonderful tutorials. I was wondering if you could please make another video showing downstream analysis of affymetrix data?

  • @aewe4239
    @aewe4239 2 years ago +1

    This is so wonderful... I enjoyed your video clips... so informative. Just wondering if you have video clips on methylation data analysis and ChIP-seq. Thank you!

    • @Bioinformagician
      @Bioinformagician 2 years ago

      I am glad you find my videos informative. Yes, I shall be making videos covering epigenetic data analysis in the future :)

  • @amtd3808
    @amtd3808 10 months ago

    How do I build models for limma analysis from the normalized data above?

  • @sonialamba2767
    @sonialamba2767 2 years ago +2

    Error in getGEOSuppFiles("GSE148537") : Failed to download C:/Users/hp/Documents/GSE148537/GSE148537_RAW.tar!
    I am getting this error while accessing the GEO dataset. Kindly guide me on how to fix this issue.

    • @rnepepperoni
      @rnepepperoni 9 months ago

      For others who get the same issue:
      library(curl)
      options(timeout = max(300, getOption("timeout")))
      options(download.file.method.GEOquery = "curl")
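The settings in the reply above can be combined with the download, extraction and RMA steps roughly as follows. This is only a sketch: it assumes the Bioconductor packages GEOquery and affy are installed, that the dataset is an Affymetrix array with CEL files, and that network access is available. The function name `fetch_and_normalize` and the `data` subfolder are made up for illustration.

```r
# Sketch only: needs GEOquery, affy and network access when it is called.
fetch_and_normalize <- function(gse_id, base_dir = getwd()) {
  # Raise the download timeout (in seconds) and let GEOquery use curl,
  # as suggested in the reply above.
  options(timeout = max(300, getOption("timeout")))
  options(download.file.method.GEOquery = "curl")

  # Downloads <gse_id>_RAW.tar into <base_dir>/<gse_id>/
  GEOquery::getGEOSuppFiles(gse_id, baseDir = base_dir)

  # Unpack the archive into a "data" subfolder (hypothetical folder name).
  tar_path <- file.path(base_dir, gse_id, paste0(gse_id, "_RAW.tar"))
  cel_dir  <- file.path(base_dir, gse_id, "data")
  utils::untar(tar_path, exdir = cel_dir)

  # Read all CEL files in that folder and run RMA normalization.
  raw  <- affy::ReadAffy(celfile.path = cel_dir)
  eset <- affy::rma(raw)

  # Return the normalized expression values as a data frame
  # (probes in rows, samples in columns).
  as.data.frame(Biobase::exprs(eset))
}
```

A call would then look like `normalized.expr <- fetch_and_normalize("GSE148537")`. If the download still fails, the error may have another cause (for example folder permissions), which this sketch does not address.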

  • @ahmedal-mammari9639
    @ahmedal-mammari9639 2 years ago +1

    It's really helpful, thank you so much

  • @dkvlogs8200
    @dkvlogs8200 10 months ago

    Hey, the last step with the inner join is not working. Can you suggest a fix?

  • @aakanshasingh7154
    @aakanshasingh7154 2 years ago +1

    Thanks a lot for this video. It's really very helpful. Can you please make a video on differential expression analysis of microarray data using the limma package? Thank you, and looking forward to more such wonderful videos :)

    • @Bioinformagician
      @Bioinformagician 2 years ago +1

      Thank you for the suggestion, I will surely plan on making a video on it :)

  • @Dimitra589
    @Dimitra589 1 year ago +1

    Thank you for this helpful tutorial! Can you explain why some cells contain more than one gene separated by "///", and what we should do about them?
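In Affymetrix annotation tables, " /// " separates multiple entries when a probe set maps to more than one gene. There is no single correct way to handle these; two common options are keeping only the first listed symbol or dropping the ambiguous probes. A small base R sketch of both, using a toy vector of symbols:

```r
# Toy gene symbol column (illustrative values only).
symbols <- c("DDR1 /// MIR4640", "RFC2", "HSPA6")

# Option 1: keep only the first symbol listed for each probe.
first_symbol <- vapply(
  strsplit(symbols, " /// ", fixed = TRUE),
  function(parts) parts[1],
  character(1)
)

# Option 2: flag probes that map to more than one gene so they can be dropped.
is_ambiguous  <- grepl("///", symbols, fixed = TRUE)
unique_probes <- symbols[!is_ambiguous]

print(first_symbol)
print(unique_probes)
```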

  • @socy8232
    @socy8232 1 year ago

    Is there any way to sort the samples in the x axis into "cancer" and "normal tissue"? Thanks !

  • @ErinaSharmin
    @ErinaSharmin 1 year ago

    How do I retrieve the normalized.expr object from the environment?
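One possible way to get the table out of the R environment is to write it to a CSV file. The sketch below uses a small toy data frame in place of the real `normalized.expr`, and the file name `normalized_expr.csv` and the temporary folder are placeholders.

```r
# Toy stand-in for the normalized expression table (probes in rows).
normalized.expr <- data.frame(
  GSM1 = c(7.1, 5.3),
  GSM2 = c(7.4, 5.0),
  row.names = c("1007_s_at", "1053_at")
)

# Write the table to disk, keeping the probe IDs (row names) as the first column.
out_file <- file.path(tempdir(), "normalized_expr.csv")
write.csv(normalized.expr, out_file, row.names = TRUE)

# Read it back later, restoring the probe IDs as row names.
reloaded <- read.csv(out_file, row.names = 1, check.names = FALSE)
print(reloaded)
```

To keep the file, replace `tempdir()` with a folder of your choice.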

  • @ahmedal-mammari9639
    @ahmedal-mammari9639 2 years ago

    Please can you answer me: what if the data is in txt format? I mean 10 files named GSM......txt; how can I open them?

  • @uzoamakaotutu7450
    @uzoamakaotutu7450 8 months ago

    Really helpful video! Thanks. I'm trying to do gene expression analysis and I'm new to R. What do I do next? Please help.

  • @thirupugal
    @thirupugal 1 year ago

    can you show me how to fix this error feature.data

  • @jyoti9426
    @jyoti9426 2 years ago

    Thanks for the video :) Helped a lot!

  • @Asas11Shourov
    @Asas11Shourov 2 years ago +1

    Excellent video. Could you please make a video on the downstream processes in DGE analysis after annotating the microarray data?

    • @Bioinformagician
      @Bioinformagician 2 years ago

      You mean gene set enrichment analysis or pathway analysis using differentially expressed genes?

    • @Asas11Shourov
      @Asas11Shourov 2 years ago +1

      @Bioinformagician Thanks for your reply. I was wondering if you could guide us on how to run DGE analysis on the annotated data, for example using the limma package. 😊

    • @Bioinformagician
      @Bioinformagician 2 years ago +1

      I will surely plan on making a video covering that. Thanks for the suggestion :)

    • @jithus89
      @jithus89 2 years ago

      @Bioinformagician Is a DGE analysis of microarray data available?

    • @faizaasghar9757
      @faizaasghar9757 1 year ago

      @Bioinformagician Please use the same dataset for differential gene expression. Thanks.

  • @OmaymaAlSaei
    @OmaymaAlSaei 10 months ago

    Thanks a lot, @Bioinformagician. Could you please show us how to use limma for the differential expression of microarray data?

  • @Nadia-db6nb
    @Nadia-db6nb 2 years ago +1

    Hi, I'm using data from GEO (GSE70947), but the files in my raw.tar are .txt instead of .CEL. How can I read the .txt files using the ReadAffy function? I also want to run normalization; how can I do so?

    • @Bioinformagician
      @Bioinformagician 2 years ago

      If you click on any one sample and read under 'Data Processing', the values in the text files are already background-corrected and quantile normalized. They should be good to use as is for any downstream analysis.

  • @setarehsohail5422
    @setarehsohail5422 2 years ago

    It is an amazing way of teaching! A thousand thanks!

  • @ningsong4475
    @ningsong4475 1 year ago

    Nice video! I have some *.txt files but no .CEL files. What should I do to normalize these data?

  • @RupeshKumar-mq4eu
    @RupeshKumar-mq4eu 2 years ago

    Thanks a lot for this video. Can you please help with this error I am facing with my data when mapping probe IDs to gene names?
    Using locally cached version of GPL570 found here:
    C:\Users\RUPESH~1\AppData\Local\Temp\RtmpM3ouSD/GPL570.soft
    Error in `.rowNamesDF

  • @keshavraghuwanshi1242
    @keshavraghuwanshi1242 1 year ago

    Ma'am, is all of this possible in Python?

  • @isabelacristinadesouza
    @isabelacristinadesouza 1 year ago

    Hi, I'm working with some datasets from GEO (GSE686), and I'm wondering whether it's correct to perform RMA normalization and then filter for the samples I want (e.g., tumor samples from the oral cavity only), or whether I need to select the samples first and then perform the normalization. Thank you and congrats on your work; the tutorials are very clear and are helping me a lot!

    • @Bioinformagician
      @Bioinformagician 1 year ago

      As per my understanding, it makes sense to perform RMA normalization on the entire dataset. This will adjust for variations in probe intensities and background noise across samples. Filtering samples before normalization will not account for the technical differences between samples, potentially leading to biased or incorrect results.

  • @KhanapeenaorGupShup
    @KhanapeenaorGupShup 2 years ago

    Hello dear, hope you are doing well. I am practicing this and following you step by step on the Windows operating system. Unfortunately, I'm not able to execute the last step, the one that mentions rows_to_columns. Can you please guide me?

    • @KhanapeenaorGupShup
      @KhanapeenaorGupShup 2 years ago

      Also, what are the next steps after differential expression? Kindly share the next steps as well. I'm a PhD scholar and I really want guidance.

    • @KhanapeenaorGupShup
      @KhanapeenaorGupShup 2 years ago

      normalized.expr %>%
      rownames_to_column(var ='ID') %>%
      inner_join(., feature.data, by = 'ID')
      Error in normalized.expr %>% rownames_to_column(var = "ID") %>% inner_join(., :
      could not find function "%>%"

    • @Bioinformagician
      @Bioinformagician 2 years ago

      The errors relating to rownames_to_column and "could not find function %>%" occur because the tidyverse library is not loaded. Running library(tidyverse) before these commands will prevent these errors.
      The steps following differential expression depend on the goal of your analysis. Now that you have differentially expressed genes, what do you want to know about them? What biological question are you trying to answer?
      Typically, differential expression is used when comparing samples from two conditions or when trying to identify the effect of a treatment on a sample. One could use the differentially expressed genes to get an idea of which pathways are enriched between two conditions or between a treatment group and a control group. Analyses like gene ontology enrichment or gene set enrichment analysis are usually performed to find out which GO terms or pathways are enriched.
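For reference, a small self-contained sketch of the annotation step with the required packages loaded first. It loads dplyr and tibble (both part of the tidyverse) rather than the whole tidyverse, and uses toy stand-ins for `normalized.expr` and `feature.data`; the values and the `Gene Symbol` column are illustrative only.

```r
# Loading these packages makes %>%, rownames_to_column() and inner_join() available.
library(dplyr)
library(tibble)

# Toy stand-ins for the objects used in the video (illustrative values).
normalized.expr <- data.frame(
  GSM1 = c(7.1, 5.3),
  GSM2 = c(7.4, 5.0),
  row.names = c("1007_s_at", "1053_at")
)
feature.data <- data.frame(
  ID = c("1007_s_at", "1053_at"),
  `Gene Symbol` = c("DDR1 /// MIR4640", "RFC2"),
  check.names = FALSE
)

# Move the probe IDs from row names into an ID column, then join on it.
annotated <- normalized.expr %>%
  rownames_to_column(var = "ID") %>%
  inner_join(feature.data, by = "ID")

print(annotated)
```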

    • @KhanapeenaorGupShup
      @KhanapeenaorGupShup 2 years ago

      @Bioinformagician Dear, I want to know the genes that are upregulated under abiotic stress in a wild vs. a domestic plant species. The microarray data contains wild control, wild stress, domestic control and domestic stress samples. Can you please guide me in this case? I want to know the set of genes that are more upregulated in the wild species than in the domestic one under abiotic stress conditions.

    • @Bioinformagician
      @Bioinformagician 2 years ago

      @Bioinformagician reply to @KhanapeenaorGupShup: You need to perform a differential gene expression analysis to find genes upregulated in one group compared to another.
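As a rough illustration of how such a four-group comparison could be set up, the sketch below builds a design matrix and an interaction contrast in base R and, if the limma package is installed, fits the model on simulated data. The group labels, sample counts, contrast name and the 0.05 cutoff are assumptions made for this example, not part of the video.

```r
# Simulated data: 100 genes x 12 samples, 3 replicates per group (hypothetical).
set.seed(1)
group_levels <- c("wild_control", "wild_stress", "domestic_control", "domestic_stress")
group <- factor(rep(group_levels, each = 3), levels = group_levels)
expr <- matrix(
  rnorm(100 * 12, mean = 8), nrow = 100,
  dimnames = list(paste0("gene", 1:100), paste0("s", 1:12))
)

# Design matrix with one column per group and no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Interaction contrast:
# (wild_stress - wild_control) - (domestic_stress - domestic_control)
contrast <- matrix(
  c(-1, 1, 1, -1), ncol = 1,
  dimnames = list(colnames(design), "wild_vs_domestic_stress")
)

# Fit the linear model only if limma is available.
if (requireNamespace("limma", quietly = TRUE)) {
  fit  <- limma::lmFit(expr, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))
  top  <- limma::topTable(fit2, coef = "wild_vs_domestic_stress", number = Inf)
  # Positive logFC: larger stress-induced increase in wild than in domestic.
  stronger_in_wild <- top[top$logFC > 0 & top$adj.P.Val < 0.05, ]
}
```

With real data, `expr` would be the RMA-normalized matrix and `group` would come from the sample annotation. With random data like this, few or no genes are expected to pass the cutoff.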

  • @kantnesragnarok6128
    @kantnesragnarok6128 4 months ago

    thanks mam it is very effective

  • @IzarlishAttique
    @IzarlishAttique 1 year ago

    I want to get raw count data (whole numbers) from profiling by array. Is that possible?

    • @Bioinformagician
      @Bioinformagician 1 year ago

      Array (Microarray) values can never be whole numbers because those are probe intensities. You get raw counts data (raw counts = read counts) from RNA-Seq. They are fundamentally different technologies.

    • @IzarlishAttique
      @IzarlishAttique 1 year ago

      @Bioinformagician So can we perform loess normalization on raw counts (from RNA-Seq)? I was trying, but the raw counts were not normalized.

  • @kajalpanchal8239
    @kajalpanchal8239 2 years ago

    you are the best!

  • @ThePharmdb
    @ThePharmdb 2 years ago

    What can you use for microarray expression data if the affy package no longer exists?

    • @Bioinformagician
      @Bioinformagician 2 years ago

      You could try the oligo package, provided your array is Affymetrix or NimbleGen.

  • @AyrodsGamgam
    @AyrodsGamgam 1 year ago

    Where can I find the affy package, and how do I install it? Thanks.

    • @Bioinformagician
      @Bioinformagician 1 year ago

      bioconductor.org/packages/release/bioc/html/affy.html

  • @ramarajadnya1393
    @ramarajadnya1393 9 months ago

    Thank you, this helped me a lot!!!
    I have some files in TXT format. How do I read and normalize data from txt files? For example, GSE74685.

  • @setarehsohail5422
    @setarehsohail5422 2 years ago

    Thanks very much, it is a very informative video. Is it possible to continue with analysis of this data and visualize it with a heatmap?

  • @ifeoluwaemmanuel5093
    @ifeoluwaemmanuel5093 2 years ago

    Excellent video, but how can I normalize Agilent data in .txt.gz file format?

    • @Bioinformagician
      @Bioinformagician 2 years ago

      Check out this thread: www.biostars.org/p/388949/

  • @SangheePark-m7l
    @SangheePark-m7l 1 year ago

    Thank you very much.

  • @abdur-rehmandar8959
    @abdur-rehmandar8959 1 year ago

    Error in getGEOSuppFiles("GSE49072") :
    Failed to download D:/R- Results/R-Results/GSE49072/GSE49072_RAW.tar!
    In addition: Warning message:
    In file.remove(destfile) :
    cannot remove file 'D:/R- Results/R-Results/GSE49072/GSE49072_RAW.tar', reason 'Permission denied'
    This appears when I run this command. Kindly guide me.

  • @yashangchekar402
    @yashangchekar402 2 years ago

    Ma'am, I am working with a different file type instead of CEL... the data is not being generated. Help me please...

    • @Bioinformagician
      @Bioinformagician 2 years ago

      Can you confirm it is an Affymetrix array that you are looking at?

    • @yashangchekar402
      @yashangchekar402 2 years ago

      @Bioinformagician It's Agilent.

    • @Bioinformagician
      @Bioinformagician 2 years ago

      @yashangchekar402 Check out this thread: www.biostars.org/p/388949/

  • @khushboobhagat3460
    @khushboobhagat3460 2 years ago

    Ma'am, I am not able to download the data. getGEOSuppFiles is not working for me. It shows me this:
    Error in getGEOSuppFiles("GSE93731") :
    Failed to download C:/Users/DELL/Desktop/R_practice/GSE93731/GSE93731_RAW.tar!
    In addition: Warning message:
    In file.remove(destfile) :
    cannot remove file 'C:/Users/DELL/Desktop/R_practice/GSE93731/GSE93731_RAW.tar', reason 'Permission denied'
    Please suggest how I can resolve this problem.

    • @Bioinformagician
      @Bioinformagician 2 years ago

      Can you try running "chmod -R 755 C:/Users/DELL/Desktop/R_practice/" in the terminal (give write permissions to that folder) and try downloading it again?

    • @gsb1208
      @gsb1208 2 years ago

      @Bioinformagician I've got a similar error running R on Windows. I have given full permission to read and write files, but the error persists. Any suggestions? Also, some GEO supplementary files are huge (2 GB or so); can you suggest a workaround for getting such data using getGEOSuppFiles?

  • @tushardhyani3931
    @tushardhyani3931 2 years ago

    Thank you for this video !!

  • @dwitiroy2700
    @dwitiroy2700 1 year ago

    I have a question di .. kindly help me

  • @tulikabhardwaj484
    @tulikabhardwaj484 2 years ago

    Thank you!!

  • @hajarbouamout8105
    @hajarbouamout8105 2 years ago

    Hi, thank you so much for this video. I don't know why I can't install these packages even though I'm using the same R version. I have to normalize a raw.tar file and I found your interesting video. Thanks again.

    • @Bioinformagician
      @Bioinformagician 2 years ago

      What system are you using? Do you see any specific error?

  • @nickgannon7466
    @nickgannon7466 2 years ago

    Bravo

  • @Maryashahere
    @Maryashahere 2 years ago +1

    Ma'am, the video was really helpful. How can we combine GEO RNA microarray data with RNA-seq data? I'd like to ask you a few questions. Can you share your email ID? Thank you.

    • @Bioinformagician
      @Bioinformagician 2 years ago

      You cannot combine data that have different gene quantification units, like microarray and RNA-Seq. You can find my email ID in the description section of every video.

  • @akshatamandloi5521
    @akshatamandloi5521 1 year ago +1

    Hey, I am getting an error while downloading the supplementary file. It shows: Error in getGEOSuppFiles("GSE107462") :
    Failed to download /Users/hp/Desktop/R_DGE/GSE107462/GSE107462_RAW.tar! Can you please help me with this?