- 189
- 135 561
AnalytiCode
United States
Joined Feb 20, 2014
Welcome to the Ultimate Hub for Analytical Data Science & Measurement Science!
Howdy! I’m Chris Pulliam, a PhD Measurement Scientist with a passion for innovating at the intersection of measurement and data science. On this channel, I combine my 10+ years in the lab with expertise in data science and teaching to bring you:
- Signal processing tutorials
- Machine learning techniques
- Scientific data visualization
- Near IR and Mass Spectrometry data interpretation
- Analytical chemistry, and more!
Whether you're diving into data or just data curious, there’s a video or playlist here for you!
Disclaimer: Opinions are my own, not my employer's.
Visit my Medium Blog to see many of these ideas written out: medium.com/@chrisjpulliam
If you want to chat, find me on LinkedIn: www.linkedin.com/in/chrispulliam3/
The Truth About Standard Scaler: Can It Correct Skewed Data?
A viewer asked if z-scaling data can remove skew. In this video, we’ll explore whether that’s possible and dive into the details to find out. Let’s put it to the test!
0:00 Intro | Can we use Standard Scaler?
0:27 Getting into notebook
0:50 Scaling only
1:31 Combining log transform and Standard Scaler
2:42 Closing
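The comparison described in this video can be sketched roughly as follows. This is a minimal illustration, not the notebook from the video: the log-normal data is synthetic and the variable names are made up for the example. It measures skewness on the raw data, after z-scaling alone, and after a log transform followed by z-scaling.

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Assumption: synthetic right-skewed (log-normal) data stands in for the video's dataset.
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000).reshape(-1, 1)

# Z-scaling alone only shifts and rescales the values.
x_scaled = StandardScaler().fit_transform(x)

# Log transform first, then z-scale.
x_log_scaled = StandardScaler().fit_transform(np.log(x))

skew_raw = skew(x.ravel())
skew_scaled = skew(x_scaled.ravel())
skew_log_scaled = skew(x_log_scaled.ravel())

print(f"raw skew:             {skew_raw:.3f}")
print(f"z-scaled skew:        {skew_scaled:.3f}")
print(f"log + z-scaled skew:  {skew_log_scaled:.3f}")
```

In this sketch the first two numbers match, because z-scaling does not change the shape of the distribution, while the log transform pulls the skew close to zero for log-normal data.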
Views: 51
Videos
Boost Performance with Python Feature Selection with NIRS data
202 views, a day ago
Struggling with too many features in your dataset? In this video, I’ll show you how to tackle feature selection using the sklearn library's SelectKBest method with f_classif for statistical testing. I’ll walk you through the process step by step, using Near IR plastic data to demonstrate how you can effectively reduce your feature set and improve your model's performance. Whether you're new to ...
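As a rough sketch of the approach named in this description, the example below runs `SelectKBest` with `f_classif` on synthetic data. The data, the number of features, and `k=3` are assumptions for illustration only; they are not the Near IR plastic data or the settings used in the video.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)

# Assumption: 120 synthetic "spectra" with 50 features and 3 classes.
n_samples, n_features = 120, 50
X = rng.normal(size=(n_samples, n_features))
y = np.repeat([0, 1, 2], n_samples // 3)

# Make features 5, 6 and 7 informative by shifting them per class.
X[:, 5:8] += y[:, None] * 2.0

# Keep the k features with the highest ANOVA F-statistic.
selector = SelectKBest(score_func=f_classif, k=3)
X_reduced = selector.fit_transform(X, y)
selected = selector.get_support(indices=True)

print("selected feature indices:", selected)
print("reduced shape:", X_reduced.shape)
```

To avoid leaking information from the test set, the selector is usually fit on training data only, for example by placing it inside a scikit-learn `Pipeline`.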
How to Boost Plastic Detection Accuracy with NIR Spectroscopy!
100 views, 14 days ago
In this video, I’ll walk you through how to optimize a machine learning model to detect different types of plastics using Near-Infrared (NIR) spectroscopy data. By using GridSearchCV, we’ll fine-tune hyperparameters for the best model performance, removing the guesswork from model building. You'll learn step-by-step how this powerful tool can improve classification accuracy for NIR spectral dat...
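A hyperparameter search of the kind described here might look like the sketch below. The classifier (`SVC`), the parameter grid, and the synthetic data from `make_classification` are all assumptions chosen for illustration; the description does not say which model or grid the video uses.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic 3-class data stands in for NIR spectra of plastics.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled independently.
pipe = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid; step names follow make_pipeline's lowercase class names.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.3f}")
```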
Build Better Models with PCA | NIRS Plastic Prediction!
124 views, 21 days ago
Did you know that Principal Component Analysis (PCA) can be a powerful preprocessing step in machine learning? PCA simplifies your data by reducing the number of variables, making it especially useful when you have a small sample size compared to the number of features. In this video, we'll demonstrate how to apply PCA to Near IR plastic data, followed by using a Support Vector Classifier (SVC)...
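The PCA-then-SVC idea can be sketched as a single scikit-learn pipeline. Everything below is an assumption for illustration: the spectra are synthetic, and the number of components and the linear kernel are arbitrary choices rather than the video's settings. The point is the small-sample, many-features shape that the description mentions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumption: 40 synthetic spectra with 200 "wavelengths" (far more features than samples).
n_per_class, n_wavelengths = 20, 200
axis = np.linspace(0.0, 1.0, n_wavelengths)
band = np.exp(-((axis - 0.5) ** 2) / 0.01)  # broad absorption-like band

X = rng.normal(scale=0.1, size=(2 * n_per_class, n_wavelengths))
y = np.repeat([0, 1], n_per_class)
X[y == 1] += band  # class 1 carries the extra band

# PCA compresses 200 correlated features into 5 components before the classifier.
model = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="linear"))
scores = cross_val_score(model, X, y, cv=5)

print("cross-validated accuracy per fold:", scores)
```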
Let's explore total dissolved solids in the LA River!
39 views, 21 days ago
Science can be performed anywhere if you have the right equipment. In this video, I spend time at the LA River measuring the total dissolved solids using a "TDS" probe! Check it out!
How to Filter Near IR Data!
120 views, a month ago
Removing data is always a touchy subject, but sometimes there are technical reasons, such as data quality, why it may be beneficial. In this video, I demonstrate how to use data quality to filter Near IR data using various Python data analysis tools.
We are back!
58 views, a month ago
Welcome back from summer hours! I'll resume my regular posting schedule. This week we'll continue on our journey of signal processing but we'll be applying it to a real NIRS data set!
Evaluate Spectroscopy Signal with Percent RSD!
136 views, a month ago
In this video, I demonstrate how to use Python to evaluate an analytical signal using relative standard deviation.
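Percent relative standard deviation is the sample standard deviation divided by the mean, times 100. A minimal sketch with NumPy follows; the `percent_rsd` helper and the tiny replicate array are hypothetical and only illustrate the arithmetic.

```python
import numpy as np


def percent_rsd(replicates, axis=0):
    """Percent RSD across replicates: 100 * sample std / mean."""
    replicates = np.asarray(replicates, dtype=float)
    mean = replicates.mean(axis=axis)
    std = replicates.std(axis=axis, ddof=1)  # ddof=1 gives the sample standard deviation
    return 100.0 * std / mean


# Hypothetical data: 3 replicate scans (rows) at 3 wavelengths (columns).
spectra = np.array(
    [
        [1.00, 2.00, 4.00],
        [1.10, 2.00, 4.20],
        [0.90, 2.00, 3.80],
    ]
)

rsd = percent_rsd(spectra)
print(rsd)  # one %RSD value per wavelength
```

Note that %RSD becomes unstable when the mean signal is close to zero, so it is usually evaluated where the signal is well above baseline.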
Evaluate Sample Replicates with Pearson Correlation and Python!
160 views, 2 months ago
Discover how to use Pearson correlation and Python to evaluate sample replicates in this tutorial! Using a portable near IR spectrometer from TrinamiX, I'll walk you through the entire process, making complex data analysis simple and accessible. Perfect for scientists, data analysts, and anyone interested in spectroscopy and data science. Don't miss out on this in-depth guide!
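One way to use Pearson correlation on replicates is to correlate every replicate spectrum with every other one and flag those that disagree with the rest. The sketch below uses synthetic spectra and a hypothetical 0.95 threshold; it is not the data or the cutoff from the video.

```python
import numpy as np

rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 100)

# Assumption: four synthetic replicate scans of the same sample.
base = np.exp(-((axis - 0.4) ** 2) / 0.02)
replicates = np.vstack(
    [base + rng.normal(scale=0.01, size=axis.size) for _ in range(4)]
)

# Simulate one bad scan (for example, poor sample contact) with a different shape.
replicates[3] = np.exp(-((axis - 0.8) ** 2) / 0.02) + rng.normal(
    scale=0.01, size=axis.size
)

# Rows are treated as variables, so this is a 4 x 4 replicate-by-replicate matrix.
corr = np.corrcoef(replicates)

# Median correlation of each replicate with the others (diagonal excluded).
off_diag = corr.copy()
np.fill_diagonal(off_diag, np.nan)
median_corr = np.nanmedian(off_diag, axis=1)

threshold = 0.95  # hypothetical cutoff
flagged = np.flatnonzero(median_corr < threshold)
print("replicates flagged for review:", flagged)
```

The median is used instead of the mean so that a single bad scan does not drag down the score of the good replicates.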
Boosting Analytical Data with Derivative Signal Processing!
167 views, 2 months ago
Analyzing Vacuum Soil and Dryer Lint with Near IR Spectroscopy!
180 views, 3 months ago
Boosting Job Performance with Peloton: My Success Story
63 views, 3 months ago
Catch Lemons BEFORE THEY ROT with Near IR!
131 views, 3 months ago
Chemical Analysis at Home: Analyzing Plastic Containers!
184 views, 3 months ago
NIRS Analysis and Python Data Insights of Dryer Sheets!
104 views, 4 months ago
TrinamiX Near IR Spectroscopy Demonstration! | Anomaly Detection
290 views, 4 months ago
Advanced Python Signal Processing for Pharmaceutical Data!
431 views, 4 months ago
Cleaning a River with Science | Data Vlog!
108 views, 4 months ago
VLOG! My Journey to develop range! USC, LA River, and Chemical Analysis! 🧪
74 views, 4 months ago
Exploring the TrinamiX Near IR Spectrometer: Software, Features, and Data!
464 views, 5 months ago
We Are Moving Towards Fieldable Measurements! 🥳
117 views, 5 months ago
Create Jointplots EASILY with Seaborn!
248 views, 5 months ago
Refactoring Data with Pandas: Mastering GroupBy and QCut for Efficient Analysis
350 views, 6 months ago
Nippy Signal Processing Tutorial for Near IR Spectroscopy
177 views, 6 months ago
Accelerate your NIRS data processing with Nippy! | PT1
140 views, 6 months ago
Pandas Plotting Bootstrapping Method | Tutorial
152 views, 6 months ago
Pandas Plotting Scatter Matrix! | Tutorial
187 views, 6 months ago
Schedule your data processing! | Jupyter Extension Tutorial
253 views, 7 months ago
Lets Solve Physical Chemistry Questions with Scipy Constants! | Tutorial
166 views, 7 months ago
Thanks 👍
My pleasure!
great video but please turn down the music
No problem!! Thanks for the feedback
You're doing great, please avoid bg music while explaining. Thank you.
Thanks for the feedback! Glad you enjoyed the video
I have a question about picking the best features in a huge dataset. If I use a method like F.S, it's going to take forever. What other ways can I use besides PCA for datasets with over a million rows? Could I just pick a random part of the data and use normal methods? Thanks for the videos, they've been a big help.
Howdy, can you say more about the source of the data? If the data is randomly distributed then you definitely can randomly sample from the larger set.
Random add, you should consider polars instead of a pandas dataframe. Likely gonna be much more efficient.
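The random-sampling suggestion in this thread could be tried out along these lines. This is only a sketch on synthetic pandas data (the column names and sizes are invented): draw a reproducible random subset, then check that its summary statistics track the full table before running a slower method on it.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Assumption: a synthetic table with one million rows stands in for the large dataset.
df = pd.DataFrame(
    rng.normal(size=(1_000_000, 5)), columns=[f"f{i}" for i in range(5)]
)

# Reproducible random subset of 10,000 rows.
sample = df.sample(n=10_000, random_state=0)

# Quick sanity check: largest gap between full-data and sample column means.
mean_gap = (df.mean() - sample.mean()).abs().max()
print(f"largest difference in column means: {mean_gap:.4f}")
```

If the rows are ordered or grouped in some way (by time, batch, or class), a plain random sample may not be representative, which is why the reply asks about the source of the data.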
Love this! Agreed, she is going to be a great scientist, and now Kofi wants to help too!
Let’s see what we can do!!
Where are you from bro?
lol I live in California but was raised in the Midwest. Is that what you mean?
Awesome!
Thanks boss!!
Way more exciting videos coming soon!! Nerd 🚨
thank you!
My pleasure!!
Very helpful, thank you! I'm new to data analysis and multiindex is really tough for me, but this really helped!
Thank you! Enjoy the journey!!
Hi, thanks for uploading these gems when literally no one else is doing these on YouTube. Btw, where can I find the data you used here?
Howdy I’ll upload the data to GitHub! When I do I’ll add the link to this comment :)
Enjoy your day!!
Thank you very much
My pleasure
How can I modify my ipynb to look exactly like yours? I don't see a tab like yours while constructing classes. Help!!
Howdy, do you mean dark mode in Jupyter?
If I understand you are using correlation calculations to understand variability in measurement repeatability? What might cause a difference? The equipment or measurement process?
Turns out it was due to multiple factors. Some were multi-colored and they gave rise to detectable differences (as I sample multiple parts of a container), sometimes I didn’t make full contact (human error), and a few other trends. The instrument itself was fantastic. I’ll make a video going over the samples with poor reproducibility soon!
Very informative and helpful video, thank you for sharing it. Chris, if one wanted to also link the raw data of a well in addition to the sample name, how would you approach this? Let's say column one in the Excel sheet contains sample information as in your example and column 2 contains the raw data values for each well act... In your video, when you executed "map_df" and it created a dataframe, I would like it to also contain the raw values located in column 2 of the Excel sheet. I tried to alter your script but was not successful in making this minor edit. Any advice would be greatly appreciated.
Great question and definitely relevant. Let me see how I can modify the code. When I rewatched the video to refresh myself, I noticed the use of a restrictive dictionary really limits the flexibility of the script. Let me refactor to generalize better! Thanks for watching and commenting :)
Sweet! I just found your channel a couple weeks ago. Your content is amazingly clear and useful. Either nobody's doing NIR, or they're missing this goldmine! Keep it up :) It's very much appreciated!! (I'm actually applying all this stuff to Raman spectra, which there are even fewer tutorials for.... regardless, your videos have been literally perfectly exactly what I need anyway!!)
That’s awesome to hear! If there’s anything you’d like to see don’t hesitate to let me know! Dude, Raman is such a great technique
Bro, I'm so proud of you! Thank you for this content :D
Thanks boss! 🙏🏽
Thanks for this! I did a video on this several years ago on my channel, but I loved how quickly you demonstrated the power of the cv, within the first few seconds of your video.
Thank you!! You’re a creator too!?
I just checked your channel, we should collab!
Hi Christopher, thank you for this video. I need your email please; I have some issues with my data. Thank you.
Interesting, what kind of data?
@@CJP3 Hyperspectral data, we can cooperate
Really good one!
Thanks boss!
Very good video. Thank you.
Thank you very much! 🙏
"First" Love your channel, we should chat one day!
Thank you!
Love your videos!
Thanks boss!
the audio is so low, man
Sorry, the audio in newer videos is better!
I have just found your signal processing series. Excellent. I have a question that I hope you can shed some light on. I use SciPy fft, fftfreq, and find_peaks, and have found that the amplitudes of the peaks found are hugely different (i.e. fft amplitude of a peak is 1200 and find_peaks has an amplitude of 0.65). Any insight would be appreciated. Thanks.
Howdy! Thanks for the comment and great question! These differences stem from how each function processes and scales the data. You can achieve the same scale by normalizing the data.
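One common reading of "normalizing" here is rescaling the raw FFT magnitude by the number of samples, since an unnormalized FFT of a sine wave with amplitude A peaks at roughly A·N/2. The sketch below assumes that is the source of the mismatch, which the thread does not confirm; the sample rate, signal, and 50 Hz tone are invented for illustration.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq
from scipy.signal import find_peaks

# Assumption: a 2-second, 1 kHz recording of a single 50 Hz tone with amplitude 0.65.
fs = 1000.0
n = 2000
t = np.arange(n) / fs
signal = 0.65 * np.sin(2 * np.pi * 50.0 * t)

spectrum = rfft(signal)
freqs = rfftfreq(n, d=1.0 / fs)

# Raw magnitude grows with the number of samples.
raw_magnitude = np.abs(spectrum)

# One-sided amplitude scaling (valid away from the DC and Nyquist bins).
amplitude = 2.0 * raw_magnitude / n

peaks, _ = find_peaks(amplitude, height=0.1)
peak_freq = freqs[peaks[0]]
peak_amp = amplitude[peaks[0]]

print(f"raw FFT magnitude at peak: {raw_magnitude[peaks[0]]:.1f}")
print(f"scaled amplitude at peak:  {peak_amp:.3f} at {peak_freq:.1f} Hz")
```

With real data the tone rarely falls exactly on a frequency bin, so spectral leakage and any window function will also change the apparent peak height.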
Hahaha go for it buddy😅
One day!!
Thank you
Hello Christopher, thank you for teaching me Python. I am a PhD student in food science and mainly work with FTIR and GC-MS data. I am three months old in Python :) and struggling to manage my FTIR/GC-MS data (I work with a lot of data). It would be very helpful for me (and hopefully for others too) if you made in-depth guides: 1. How to manage/organize data (FTIR, GC-MS: which format is suitable for FTIR/GC-MS data storage) 2. Maybe a GC-MS data analysis project where you demonstrate how to import data, how to clean it, how to visualize it, how to find peaks (which you already did, though), and so on. I mean a complete real case study. 3. Covering chemometrics would be amazing, which is hard to find on YouTube. 4. Random forest/PLS/PLS-D/ASCA? 5. Plots/scatter plots for publication? I am just giving ideas; it doesn't mean you should do everything in this list. You have already created content that is helping me in my PhD journey! Good luck!
All great suggestions! The hardest part is USUALLY finding open GC-MS data. Once I find some I’ll begin putting some content. If you know of any, don’t hesitate to let me know :) all the best on your PhD journey!! Enjoy the ride
Do you know the poker test or the runs test? These tests are used for controlling randomization.
I typically just seed my randomizer. I’ll look into poker test and run test! Thanks for letting me know
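Seeding, as mentioned in this reply, makes a random sequence repeatable; it does not test whether the sequence is random. A minimal sketch with NumPy's `default_rng` follows; the seed value and the run-order use case are assumptions for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Hypothetical use: a randomized run order for 10 samples.
order_a = rng_a.permutation(10)
order_b = rng_b.permutation(10)

print("run order:", order_a)
print("reproducible:", np.array_equal(order_a, order_b))
```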
Subscribed👍 succinct, informative.
Thanks boss!
Finally some animals! I'm stunned.
Legit, me too!! I had to go out a little later in the afternoon/early evening
I stumbled onto your videos. Feels like I stumbled into a new world. I was searching for training data for the purpose of training up a large language model on organic chemistry. Started watching your videos and realised how limited my thinking is/was. Thanx a zillion.
That’s awesome! I’d love to hear more about your project endeavors at some point. Have a great day!
First time I have seen your videos. This is genuinely a very good video. Very well explained and clear. I am subscribing. The music wasn’t off putting either!
Thank you so much!!! I really appreciate it. If there’s anything you’d like to see just let me know!
@@CJP3 Sent you an invite on LinkedIn!
So what about if we were to standardize using z-scoring? It seems like that would get largely the same impact, wouldn't it?
Howdy, Z-scaling won’t improve the skew. The data will be mean-centered and scaled to unit variance, but it will keep the same skewed shape.
@@CJP3 that explains it. Thanks!
@@undertaker7523 I think I’ll make a video that graphically illustrates this point. Thanks for asking :)
@@CJP3 yes that would be amazing! Thank you!
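The point made in this thread can also be written out as a short derivation. It assumes the usual moment-based definition of skewness and a positive standard deviation; the symbols are introduced here for illustration.

```latex
\begin{align}
z &= \frac{x - \mu}{\sigma}, \qquad \sigma > 0,
  \qquad \mathbb{E}[z] = 0, \quad \operatorname{Var}(z) = 1 \\
\gamma_1(x) &= \frac{\mathbb{E}\!\left[(x - \mu)^3\right]}{\sigma^3} \\
\gamma_1(z) &= \frac{\mathbb{E}\!\left[(z - 0)^3\right]}{1^3}
  = \mathbb{E}\!\left[\left(\frac{x - \mu}{\sigma}\right)^{3}\right]
  = \frac{\mathbb{E}\!\left[(x - \mu)^3\right]}{\sigma^3}
  = \gamma_1(x)
\end{align}
```

Because z-scoring is a shift followed by a positive rescaling, the skewness is unchanged; a nonlinear transform such as a logarithm is needed to change the shape.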
Chris😂, some of your videos are awesome, I am still looking at the derivatives you did recently. As far as garbage by the water: police department has to let bums through to spend time somewhere. Bums usually bring through away to these kinds of places, the shopping cart means they walked it from the parking lot of Target 🎯 with their bags.
🤦🏽‍♂️
I appreciate the answer, despite really disliking it!
The story is people and people don't deserve the ground they walk on...
It’s kind of sad to see so much debris that doesn’t belong!
@@CJP3 again... it's a people thing
Great video man, it’s hard to come by something so easy to follow, keep it up!
Thanks bro!!
Have a day with a ☕️
TRUMP 2024
We’ll see what happens in November 😂
Now take a walk until you find wildlife. Is that a thing in the city?
We have small amounts of wildlife. That’s a great idea!💡
I should start counting the steps! 1,2, Welp…
Great video! I found that savgol as a second derivative does not calculate the boundaries of the wavelength range, so I got one wavelength point fewer. But when I applied np.gradient as the 2nd derivative, I got the desired full wavelength range. From your experience, can you explain the difference between savgol and np.gradient? And a dumb question: does it make sense to calculate a third-order derivative? 😅
Great question! I’ll dig into np.gradient! Thanks for pointing that out!
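For anyone comparing the two approaches from this thread, here is a small sketch on a synthetic Gaussian band. The wavelength grid, window length, and polynomial order are assumptions. In SciPy, `savgol_filter` returns an array the same length as its input (edges are handled by its `mode` parameter), as does `np.gradient`; a shorter output would more typically come from something like `np.diff`, though that is only a guess about the commenter's setup.

```python
import numpy as np
from scipy.signal import savgol_filter

# Assumption: a synthetic band centered at 1500 nm on a 5 nm grid.
wavelength = np.linspace(1000.0, 2000.0, 201)
step = wavelength[1] - wavelength[0]
spectrum = np.exp(-((wavelength - 1500.0) ** 2) / (2 * 50.0**2))

# Savitzky-Golay: fits a local polynomial, so it smooths and differentiates in one step.
d2_savgol = savgol_filter(
    spectrum, window_length=11, polyorder=3, deriv=2, delta=step
)

# np.gradient applied twice: plain finite differences, no smoothing.
d2_gradient = np.gradient(np.gradient(spectrum, wavelength), wavelength)

print("input length:       ", spectrum.size)
print("savgol length:      ", d2_savgol.size)
print("np.gradient length: ", d2_gradient.size)
```

The practical difference is noise handling: finite differences amplify noise with each derivative order, while Savitzky-Golay trades some peak sharpness for smoothing, which matters more and more for third and higher derivatives.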
Howdy!! Here is the data discussed in the video :) github.com/chrisp33/Analytical_YT_Tutorials/blob/main/nir_coffee.csv
Very helpful video! :) I'm a beginner in Python and learning pandas by doing a personal project. For sure, this method is excellent to keep the code organized. Thank you so much for explaining how it works!
My pleasure! 🙏🏽
We got lots of that in Colorado
I would love to visit! In LA it can be hard to find 😂
First of all, I would like to thank you for your explanation. I have some questions about this method. My experimental data has 1600 data points from a modal analysis (frequencies vs. magnitudes). So how can I identify the best peaks with this PPM? (I'm not good at Python.)
Hi Somaya! A couple of questions to guide the answer: how are you defining “good peaks”? Tallest peaks? Best resolved? What is the nature of the data? SciPy has a lot of tools for what you’re generally describing, but which specific combination of methods to use will depend on the nature of the data and the questions above. Happy to help more!
Explaining complex subjects with your calm, composed, deep voice paired with the slow jazz music background of your choosing makes me feel like I am in a private tutoring session with you in an empty bar. I am all for it! Thank you!
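As one possible starting point for the question in this thread, the sketch below ranks peaks by prominence with `scipy.signal.find_peaks`. The synthetic frequency response, the three resonances, and the prominence threshold are all assumptions; real modal data would need its own threshold and possibly smoothing first.

```python
import numpy as np
from scipy.signal import find_peaks


def lorentzian(f, center, width, height):
    """Simple resonance-like peak shape used only to build test data."""
    return height / (1.0 + ((f - center) / width) ** 2)


# Assumption: 1600 frequency points with three resonances plus a little noise.
freq = np.linspace(0.0, 100.0, 1600)
rng = np.random.default_rng(0)
magnitude = (
    lorentzian(freq, 20.0, 0.5, 1.0)
    + lorentzian(freq, 45.0, 0.5, 0.6)
    + lorentzian(freq, 80.0, 0.5, 0.3)
    + rng.normal(scale=0.005, size=freq.size)
)

# Prominence measures how much a peak stands out from its surroundings,
# which filters out small noise wiggles. The 0.1 threshold is hypothetical.
peaks, props = find_peaks(magnitude, prominence=0.1)

# Rank the detected peaks from most to least prominent.
order = np.argsort(props["prominences"])[::-1]
top_freqs = freq[peaks[order]]

print("peak frequencies, most prominent first:", np.round(top_freqs, 2))
```

If "good" means best resolved rather than tallest, the `width` and `distance` arguments of `find_peaks` are worth a look as well.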
My pleasure! Thank you for the kind words!