Hi everyone, I have a problem in segmentation: I want to segment a zoning map image, but in our case the output classes are unknown. Can anyone suggest a segmentation model that works well for zoning map images?
Since any iterative algorithm can be transformed into a recursive one and vice versa, yes! In fact, any system that is Turing complete (en.wikipedia.org/wiki/Turing_completeness) can be transformed into any other.
Not really; calculating it is an iterative process which doesn't need recursion, although if you want to write a recursive function anyway, you can.
Trivially, you can rewrite any iterative algorithm as a recursive one, e.g.:
>def iterative(data):
>    while not should_halt(data):
>        update_data(data)
>    return data
becomes:
>def recursive(data):
>    if should_halt(data):
>        return data
>    update_data(data)
>    return recursive(data)
Note the halting check comes first, so the two versions agree even on input that is already in a halted state. This works for any iterative algorithm, but it is a naive translation and generally much less efficient (each call consumes stack memory) if the problem isn't well suited to recursion. Converting the other way, from recursion to iteration, is a bit more involved, but it's what your computer does every time it compiles (or interprets) a recursive program, and it involves using stack frames. After all, a computer really only follows one iterative algorithm: 1. read the instruction located at P; 2. perform the instruction; 3. change P to point at the next instruction; 4. repeat from step 1.
He is saying we compress the image with k-means by using a limited number of colors, but there are still 3 integer channels, so how does the image actually get smaller?
First of all, thanks for the video. How can we tell that a given RGB image is a k-means clustered image? I mean, given an RGB image, what are the possible values of k; is there a maximum and minimum limit? Please explain.
There are only 16 colors you could ever need: Black, White, Red, Cyan, Violet, Green, Blue, Yellow, Orange, Brown, Lightred, Grey 1, Grey 2, Lightgreen, Lightblue and Grey 3! =P
If you do that, you'll have to compare the different solutions to one another, and naturally, the one with 200 clusters will have less average error than the one with 2 clusters. How to compare those? It's not trivial. It's inherently a trade-off between a precise classification (in the sense that the pixels don't change color much) and a simple classification (where you have few colors).
In general, in any kind of statistical modelling or machine learning task, one has to balance more complex models against less complex ones. The more complex you make your model, the more prone it is to overfitting, which means it captures not only relevant information but also random noise, which tells you nothing. On the other hand, if you make your model too simple, it is prone to underfitting, meaning you are losing patterns that would otherwise tell you more about the structure of the data.

There are many ways to measure model complexity, but one of the simplest is to count the number of parameters (this only works if all models are part of the same class). If you increase the number of parameters (in this case, the number of clusters), your model can capture more patterns in the data, but it also becomes more likely to overfit.

There are some methods to protect against overfitting. One is to partition the data into a training set and a validation set, run your algorithm only on the training set, and then test whether the patterns you found are also meaningful on the validation set. You can repeat this with different numbers of parameters and keep the largest number that still generalises to the validation set. Another approach is to penalise the algorithm for having more parameters: a model with more parameters not only has to fit better than one with fewer, it has to fit enough better to make up for its larger capacity to capture noise. That way you can increase the number of parameters until the improvement no longer justifies the extra complexity.

There are further approaches to the problem of choosing the number of parameters, but in general, this is where the analysis really starts to become hard.
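The penalisation idea above can be sketched in a few lines; everything here (the error values, the penalty weight, the helper name `choose_k`) is invented purely for illustration:

```python
def choose_k(errors, penalty):
    """Pick the number of clusters minimising (fit error + penalty * k),
    a toy version of the penalised-complexity idea described above."""
    return min(errors, key=lambda k: errors[k] + penalty * k)

# Hypothetical within-cluster errors: they always shrink as k grows,
# but level off once extra clusters only chase noise.
errors = {1: 100.0, 2: 40.0, 3: 20.0, 4: 18.0, 5: 17.5}
best = choose_k(errors, penalty=5.0)   # k=3 wins: 20 + 3*5 = 35 is the minimum
```

The raw error alone would always prefer the largest k; the penalty term is what stops that.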
Because for really large data sets (e.g. multiplying one n-dimensional matrix of around 10 GB by another), plain Python will compute for weeks or months, and you have to know Python. For MATLAB you need to know only the maths and a few specific functions, so you won't have to write as much code...
As someone who doesn't use MATLAB or R but was wanting to pick up R (RStudio looks nice): R and MATLAB should be similar in terms of performance, right? R seems more programmer-oriented.
R has much slower native matrix libraries so can be much slower depending on use. You can upgrade them but it's non-trivial. There is also a lot of bad R code out there.
You're not quite making it past a book. On the page, you can actually pick any point on it, making it a continuous space. However, you can't pick library 2.55, bookcase pi, book sqrt(2). This would become quite an annoyance when an optimal position is in between books/libraries. Yet another case when books fail to impress me. #NOCtrlF
blacklistnr1 But book 2.5 = halfway into the third book. Bookshelf 2.5 = the middle of the third bookshelf. You would have to compress it all down into a continuous space, which would be a nightmare for humans to attempt to browse, but it gets the point across. To look at it as a multidimensional array named library: library[][][][][]
It doesn't have to be continuous (complete) though... and you can do discrete optimization. I guess it depends on your definition of "dimension"... Vector spaces are built on top of fields, so "of course" they are complete, but somehow 2 dimensional matrices and arrays are also of 2 dimensions, but they are very discrete and very finite...
Nice explanation but found the title a bit misleading because segmentation by itself commonly refers to spatial segmentation. A better title would have said posterisation of images using k-means.
There are many many algorithms, K-means is just one, there's a free plugin for image editors called XiQuantizer that has many algorithms like neuquant, which is neural-network based. There are also simpler algorithms than K-means, after all people had to make gifs a couple of decades ago lol
Probably because in this way, the AI can learn and classify the objects and what the image is and important details. By just doing reduction, it wouldn't know as much.
In theory, it should produce the exact same results. HSB, RGB, CMYK, etc. are just ways of describing colours with numbers. Changing the colourspace doesn't effect the way the colours relate to one another, or at least it ideally shouldn't.
You may describe colors in many ways but RGB and HSL work very differently in the same operations. Interpolation between red and blue in RGB for example will give you a nice pink, however interpolation between red and blue in HSL will give you yellow, green, or cyan depending on the percentage between both extremes. Maybe you could use K-means with a color divided into RGB and HSL values to get an even closer approximate than simply 1 color model.
Edit, sorry I call fuchsia pink. Pink is kind of a horrible color because it can mean very different things depending on culture. I guess "light purple" might be a better name. Because the colors I mentioned before are between the other hues in the Hue component of the HSL model. This is why the color model you use matters when applying interpolation. Different color models are concerned with different aspects of color. So using different components to define color gives very different results when applying algorithms designed for RGB color space. If you look at hue being a rainbow circle you'll see red at 0 degrees and blue at 240 degrees. The middle 120 degrees is a green hue, while 60 degrees is yellow and 180 is cyan. Just look up the model it's easier to understand. My point is, you can't apply the algorithms designed for one color space to the components of another color space and expect them to produce the desired results. They'll be mathematically correct for the color space given but may not be correct to your perception of color mixing.
With K-Means, the representation of the data actually plays an important role. As a textbook example, just imagine two concentric rings with a large difference in diameter. While the two clusters can easily be separated by humans, k-means will fail horribly, because the means of both clusters are exactly the same. Now imagine you were to recode the data in terms of each point's distance from the centre point and its angle. As with HSL and RGB, both datasets describe the same data, just in different representations. However, now the two clusters are easy to separate using k-means; it actually only needs the distance variable, which differs between the clusters. So while HSL, RGB and CMYK are all different ways of describing or encoding the same colors, the encoding matters a lot when K-Means is used. This is actually true for most data analysis methods. A bad encoding of the data can turn a simple problem into an impossible one.
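A quick sketch of that recoding, assuming plain 2-D points; the ring radii and sample angles are arbitrary:

```python
import math

def to_polar(points):
    """Re-encode 2-D points as (radius, angle): ring-shaped clusters
    then separate along the radius axis alone."""
    return [(math.hypot(x, y), math.atan2(y, x)) for x, y in points]

# Two concentric rings, radius 1 and radius 5, as in the example above.
inner = [(math.cos(t), math.sin(t)) for t in (0.0, 1.5, 3.0, 4.5)]
outer = [(5 * math.cos(t), 5 * math.sin(t)) for t in (0.5, 2.0, 3.5, 5.0)]
radii = [r for r, _ in to_polar(inner + outer)]
```

In (x, y) both rings have the same mean (the centre); in (radius, angle) a single threshold on the radius splits them.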
Mike is seriously one of my favourites. He's not old-timey so he doesn't clash with Brailsford nor Mr Heartbleed but he's still very technical and it seems like his knowledge is extremely diverse. From imagery to coding to hacking to password stuff (etc. etc.).
He's my favorite simply because the topics he talks about are stuff I'm more interested in and/or stuff I can understand better - and also I think he's very good at talking.
For example I'm interested and maybe philosophically armed to understand AIs too, but I find that the guy that talks about that stuff is not as good and entertaining for some reason.
I like all of the Computerphile presenters, but he's my favorite also.
*****
He's been in the Arctic lately and still is right now so for Tom to appear would be hard if they didn't pre-record it.
Agreed, his videos are so in-depth and hands-on. Which is refreshing when most other stuff is extremely high-level and abstract.
his neural-net and learning explanations, as a collective, blow my mind.
I love the video of Dr Mike pound, he explains well and he's entertaining :)
mike pounding our brains with knowledge
If every Prof would be like him I could have gotten my degree in half the time :D
just found this, a week of lecture done in just over 8 minutes. absolutely great explanations from Dr Mike Pound, clear and sound
I could listen to Pound for hours
bro, ive seen so many videos of smart people try to explain concepts like this and none have come close to how simple and clear you explain it. you've clearly mastered what you are interested in. mucho respecto de los angeles
I just want to thank Mike Pound and Sean for putting this whole thing together. This has fostered creativity in computers for me and encouraged me to explore more about computers.
I wouldn't really come here for a classroom, but this sort of short educational video format is really cool!
By far the best contributor to Computerphile imho. Not that the others are bad, but every video with this dude is gold.
Was looking for the explanation only but stayed around because things kept on getting interesting. Nice video.
This finally helped me understand why K-means is relevant in ML and image recognition. Thank you!
I'm still confused, but I love listening to Dr Pound! :D
You are the best. I don't know if you already have, but I really hope you go into teaching, your explanations are always crystal clear.
I haven't ever seen such an explanation about anything on the internet!
Something very similar is used in remote sensing for vegetation classification and analysis with multi-band imagery. Train the software on the color and IR signature returns of known vegetation and surface types in a portion of an image and it merrily classifies all the rest.
Dr. Mike is amazing at explaining and seems to know a whole lot. Many thanks for the code! Would be great if it was given more often so that we can easily try things at home! :)
you explained it better than my professor ever can
This was really clear, and I can see lots of uses for learning this. It makes me want to have a go at writing the k-means method myself.
Dr. Mike Pound is the best :D
You should do a separate channel for this guy
what channel do you suggest? R, G or B?
Yes
or perhaps A, I keep forgetting the poor alpha channel :(
I did watch it again because im loving his voice and i didnt understand at first because again im loving his voice.
Yes my mornings consist of scrolling through the long list of computerphile videos looking for the next one of his
Wow, this totally helps me understand the photoshop magic wand tool, the reason why Star Craft Brood Wars looks the way it does when it gets compressed, and so much more. The implications of this algorithm go so far in common application. Man I love computer science @_@
I love these image processing videos.
My god, I've been like "There isn't any computer science channels on TH-cam" for so much time, and today I just discover your channel (thanks to e-penser, a French youtuber, who presented your channel in his last video) with this video that speaks of kmeans, a clustering algorithm I used in a recent research internship for my studies. :o
Have fun binge-watching all the videos!
Hi, I wonder if you can do a video on polygon collision algorithm? I was working on a project lately, and I was finding this issue extremely troublesome. Basically you know the vertices' coordinates of two polygons on a plane and need to check whether they collide and if so, determine their new speed and direction based on their original momentum.
I like the video style/angle. Kind of like we are some third person in an interview or sth
Great Video, make more videos like this. Mike you are great.
where was this video 6 months ago when i needed to understand this properly!!!
One of the best channel
Just wow! Cool stuff, thank you. This helped me to understand it much better. Real life examples helps a lot a well!
More of this guy please! :)
Would the biggest n peaks in the fft of an image also be a good initial guess for the means?
Bump. It seems like a good way of doing it, interested to know the answer.
very interesting question !!
Gotta respect the fact that he's got a copy of Elements of Statistical Learning on the bookshelf.
Yep - super pixel segmentation would be AWESOME! Please do it! :)
Is this technique used to automatically select objects in programs like Photoshop (magic wand or something like that)?
You might want to look at a previous video he did about edge detection ;)
They typically use a zero crossing algorithm to detect edges.
If I remember correctly, k-means uses Euclidean distance to work out which group each data point should belong to.
Does this mean that if you are not careful, the dimension with the largest values will influence the clusters the most? How do you correct for this? Do you scale the values of each dimension depending on its range, or are there better methods?
This is a point of debate and there are several approaches. You could scale the dimensions but there are also distance measures that take scales into account, such as Mahalanobis distance.
Yes you can scale all dimensions to the same range, which is a popular approach I believe. For stuff like physical data which was measured, errors can occur which are called outliers, those are usually removed before clustering as well. There are also mechanisms to automatically detect outliers.
+Peacemaker957 In that case wouldn't it be better to scale using the standard deviation? It dampens the effect of outliers.
Would single-linkage clustering also avoid this problem?
+Chiel Single-linkage clustering wouldn't avoid this problem, as it is also dependent on Euclidean distance. If this distance isn't calibrated correctly, it could still be that certain variables have too big an impact due to their large differences in values.
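One way to do the rescaling discussed in this thread, as a rough pure-Python sketch (the data and the `min_max_scale` helper are invented for illustration; scaling by standard deviation would be the same shape of code):

```python
def min_max_scale(columns):
    """Rescale each dimension (column) of a dataset to the range [0, 1],
    so no single dimension dominates the Euclidean distance."""
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1  # avoid division by zero for constant columns
        scaled.append([(v - lo) / span for v in col])
    return scaled

# A dimension measured in metres and one in millimetres end up on
# equal footing after scaling.
heights_m = [1.5, 1.8, 1.6]
widths_mm = [400.0, 900.0, 650.0]
scaled = min_max_scale([heights_m, widths_mm])
```

Before scaling, the millimetre column would swamp the distance calculation; afterwards both columns contribute comparably.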
This guy always reminds me of one of my high school computer science teachers, but then the voice is just so different
A bit of randomness with natural selection. I think I've heard that before somewhere.
What does K mean ?
n
I see what you did there :P
Konstant
k is the lazy persons approach to "okay"
+
Best explanation ever !! Thanks a lot really
That was a great concept, by the way... The same question arose in my mind, and the explanation is quite good.
Great video as always! Then what happens when you don't know the number of groups you may have in the image/dataset and you need to find the most different ones between each other? What algorithm should be used in this case? I would really appreciate a video about it. Thank you!
I actually did this once because.... well... I thought it was fun. I also used the XKCD color names so it would tell me the colors, e.g. salmon and baby puke :)
Love the comments in the matlab script
03:13
Is it possible they could get it completely wrong? Good question, yes.
I'm not sure when you started doing 50fps but thank you so so so much.
I'm having trouble understanding (in the image case), what are exactly the variables and what are the samples?
I think pixels are seen as discrete points in a 3-dimensional space with axes R, G, B; I don't know if this answers your question.
So there's a map that links every pixel of the image to a 3D point. The k-means calculation of "distances" as he drew them is performed in the space of colors, and then the result is applied back to the pixels in the image.
x,y,R,G,B gives you 5 dimensions
Every pixel is a data point (one of the Xs on his chart), plotted on 3 axes: Red, Green, and Blue. The K-value is the number of groups (the little sliding cards on his charts) the algorithm clusters those data points into. No "spatial" information is used; a pixel in the top left of the image can be clustered with a pixel in the bottom right. It only cares about colour similarity.
I don't think I would have understood this video without having earlier familiarity with k-means and color quantization. Switzer explained it well, but to try to give a bit of extra help regardless: What we do with the image example is that we place each pixel's color in a three-dimensional color cube and thus get a point cloud, find the mean colors of k clusters in that cloud, and recolor the image by replacing each pixel's color with the mean color of its cluster.
Well, multiple pixels can have the same color, so what we're actually doing is making a histogram rather than a point cloud, or maybe making a point cloud where each point is weighed according to how many pixels have that color. Not sure exactly what nomenclature they use in the most popular practical implementations of this. The important thing for understanding what's going on is that we throw out all spatial information when we go from pixels on the image to points in the color cube.
Also, I mentioned that the color cube is three-dimensional, which it most likely is, but of course it might be in some different color space than RGB. A color space with better perceptual uniformity might give better results (or might not, I'm not too familiar with this field).
I understood the video after reading your comment.
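A minimal pure-Python sketch of the point-cloud recolouring described above; the pixel values and starting centres are invented, and a real implementation would pick initial centres more carefully:

```python
def kmeans(points, centers, iters=10):
    """Plain k-means on colour triples: assign every point to its nearest
    centre (squared Euclidean distance in RGB space), then move each
    centre to the mean of the points assigned to it."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        # Empty clusters keep their old centre; others move to the mean.
        centers = [tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

# Four pixels in two obvious colour groups, near-red and near-green.
# No x, y positions appear anywhere: only colour similarity matters.
pixels = [(250, 10, 10), (240, 0, 5), (10, 250, 20), (0, 240, 10)]
centres = kmeans(pixels, [(255, 0, 0), (0, 255, 0)])
```

Recolouring the image is then just replacing each pixel with its cluster's mean colour, exactly as the comment describes.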
Amazing Explanation. Thanks a lot
What is the point in those shelves?
Champagne Stegosaur To encourage TH-camrs to comment on said useless shelves; this increases the channel's user engagement ratings. It's all calculated, you cannot trust these guys.
I have a question. Why do the quantized images (the ones with fewer colours) take less space? Is it because the pixels are saved as pointers (fewer colours => fewer centroids to save) and a pointer takes almost no space compared to a colour/centroid? In the case I described, a non-quantized image would require a colour/centroid for every pixel, so it would take a lot of space.
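That intuition is broadly right: store each centroid colour once in a palette, then only a small index per pixel. A back-of-envelope sketch (the image size and k are arbitrary, and real file formats add headers and further compression on top):

```python
width, height, k = 640, 480, 16

raw_bytes     = width * height * 3              # 3 bytes (R, G, B) per pixel
palette_bytes = k * 3                           # each centroid stored once
index_bits    = (k - 1).bit_length()            # bits to name one of k colours
index_bytes   = (width * height * index_bits + 7) // 8
quantised_bytes = palette_bytes + index_bytes   # palette + per-pixel indices
```

For k = 16 each pixel needs only 4 bits instead of 24, so the indexed form is roughly a sixth of the raw size.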
is supersegmentation what you can use to ‚select‘ regions of an image like quick select in photoshop?
T'would be quite nice if the GNU site (or even just the Octave part) were available. Hopefully they can get the site up soon.
Why did the problem occur with the light green pixels on the tree trunk etc at 7:08?
I think it's because of light reflected by the green grass nearby: compared to the rest of the trunk, those pixels are "greener", hence the inclusion in the big "green" cluster.
So the original pixels were like beige or dull yellow or something, but the closest colour in the palette was the light green.
Daniel Grace yes, maybe... we should take a screenshot where Dr. Pound shows the original picture and compare the two. Or maybe, because of the dispersion of the points in the RGB cube, those pixels were kind of in a border zone: just barely in the "green" category, so they were classified as that.
+Russell Teapot thanks I understand it now.
The original image is an n x m x 3 uint8 array and the compressed k-means solution is an n x m uint8 index image.
How would I extract the separate R, G, and B channels of the k-means solution image?
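If the n x m result is an index image with an accompanying colormap, MATLAB's `ind2rgb` should rebuild the n x m x 3 array, after which `rgb(:,:,1)` and friends give the channels. The same lookup, sketched in Python with a hypothetical palette:

```python
# An indexed image: each entry names a cluster, and the palette maps
# cluster numbers to RGB centroids (values here are hypothetical).
palette = [(34, 120, 28), (200, 180, 150)]
index_image = [[0, 1],
               [1, 0]]

# Look each index up in the palette, once per channel.
red   = [[palette[i][0] for i in row] for row in index_image]
green = [[palette[i][1] for i in row] for row in index_image]
blue  = [[palette[i][2] for i in row] for row in index_image]
```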
lol, the matlab script is so simple. Dr Mike ftw.
So if I'm understanding this right, it's an algorithm that takes two points among the data, takes the mean of all the data points within a certain range of each point, moves the point to that mean, and then keeps doing this over and over until... what, until there are no new means for the points to move to? And then that is the "ideal mean"?
Hi, it was indeed a nice explanation. I have one question though: does this algorithm work on numbers only? What if, say, I have thousands of images and I want to segregate them into different types; is k-means suitable here?
This is very interesting. Could you do a video on dithering? That seems like it would fit this topic, and it's a very interesting subject.
So for spatial clustering you'd just go to 5D: R, G, B, x, y? I don't know how you'd construct a metric on that space though.
Wouldn't you just weight the various dimensions differently, i.e. have a diagonal matrix A and use ||x|| = sqrt(x'Ax), with two values in A, one for space and the other for colour?
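That diagonal-metric idea in one small helper; the 0.1/1.0 weight split is an arbitrary illustration of the space-versus-colour trade-off:

```python
import math

def weighted_dist(p, q, weights):
    """sqrt((p-q)' A (p-q)) for a diagonal A: each dimension's squared
    difference is scaled by its own weight before summing."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))

# 5-D pixels (x, y, R, G, B): spatial terms weighted less than colour terms.
w = (0.1, 0.1, 1.0, 1.0, 1.0)
d = weighted_dist((0, 0, 255, 0, 0), (10, 10, 250, 5, 5), w)
```

With all weights equal to 1 this reduces to ordinary Euclidean distance, so it drops straight into the k-means assignment step.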
I have a question: if we have random data and we want to split it into 2 sections, can we tell why the algorithm put a given data point p_i into section 1 rather than section 2?
Can you also talk about other clustering algorithms like density based ones? Mean Shift and DBSCAN?
Can you do something about Image Registration and warp matrices, please?
Is there anyone in a 50hz region of the world willing to tell me the run time of this video? I noticed this is in 720p50. I am in a fully 60hz region and the length is 8:26 [with both the TH-cam counter and a local timer] and it feels slightly fast.
mmmh... I'm from Italy (which I believe it's in the PAL region, 50 hz) and it's 8:26 for me as well
Ah, I wonder how the youtube player frame rate adjustment is made. Maybe showing every fifth frame one extra time. I don't think it is just simply displaying at 50hz because my monitor is set to a 60hz refresh rate and I don't see any harmonic/oscillation type of effects.
I'm learning that for Social Web analytics for UNI.
Do you shoot in raw format and then not do any color correction on your footage?
In the future can you provide links to research papers or entry-level resources for these algorithms? That would be awesome.
Have you looked in the video description?
Yeah. Could you tell me what I am missing there? I didn't see anything, but then again I could need new glasses. Does he do minimal editing as far as color correction goes?
Thanks for the help.
I love this guy.
His bookshelf is way too empty though.
Are the pixels clustered in 3D? I would like to try this in HSL colour space, in order to find dominant colors and split up an image into arbitrary channels. Has this been done before? Any links with guides on how to get started?
yes, from what I understood the pixels are indeed clustered in 3D, one dimension per channel (R, G, B), so it should work the same for HSL... About links, he provided a link to a MATLAB implementation; maybe you should read that and try to figure out the process.
Thank you! I'll have a look at that link.
Hi everyone, I have a problem with segmentation: I want to segment a zoning-map image, but in our case the output classes are unknown. Can anyone suggest a segmentation model that works well for zoning-map images?
Next time, consider putting the code in a separate link. It makes the description nicer and easier to read!
So would this be done through recursion?
Love me some Mike
Since any iterative algorithm can be transformed into a recursive one and vice versa, yes! In fact, any system that is Turing complete (en.wikipedia.org/wiki/Turing_completeness) can be transformed into any other.
Not really; calculating it is an iterative process which doesn't need recursion. Although if you want to write a recursive function anyway, you can do so.
Thanks for the link man! This is a very interesting read!
Trivially, you can rewrite any iterative algorithm
with a recursive one, e.g:
>def iterative(input):
>    DATA = input
>    while not should_halt(DATA):
>        update_data(DATA)
>    return DATA
becomes:
>def recursive(input):
>    DATA = input
>    if should_halt(DATA):
>        return DATA
>    update_data(DATA)
>    return recursive(DATA)
Which should work for any iterative algorithm (barring any mistakes I've made) but is a naive solution and generally terribly inefficient (will take up lots more memory) if the problem isn't well suited for it.
Converting the other way (from recursion to iteration) is a bit more involved, but it's what your computer does every time it compiles (or interprets) a recursive program and it involves using stack frames.
After all, a computer really only follows one iterative algorithm:
1: read an instruction located at P
2: perform the instruction
3: change P to be the next instruction
4: repeat from step 1.
EDIT: Indentation
Hi, do you have a video on K-means implementation in MATLAB? I need help on it please
He says we compress the image with k-means by using a limited number of colours, but there are still 3 channels and they are all integers, so how does the image actually get smaller?
Just wonderful.
I love this channel.. =)
Video suggestion: SIFT Algorithm (for dummies)
This would make a good transition into SVMs.
First of all thanks for the video.
How can we tell that an RGB image is a k-means-clustered image? I mean, given an RGB image, what are the possible k-means clustering values; is there a maximum and minimum limit? Please explain...
Quite entertaining. thanks a lot.
Is this how the histogram is displayed on my camera or on photoshop?
Good video, well explained.
Can you do this with OpenCV as easily as in Matlab?
There are only 16 colors you could ever need: Black, White, Red, Cyan, Violet, Green, Blue, Yellow, Orange, Brown, Lightred, Grey 1, Grey 2, Lightgreen, Lightblue and Grey 3! =P
is there a way to not specify the number of clusters? Like just letting the computer work out that two groups is the most optimal state.
If you do that, you'll have to compare the different solutions to one another, and naturally the one with 200 clusters will have less average error than the one with 2 clusters. How do you compare those? It's not trivial. It's inherently a trade-off between a precise classification (in the sense that the pixels don't change colour much) and a simple classification (where you have few colours).
In general, in any kind of statistical modelling or machine learning task, one has to balance between more complex models and less complex ones. The more complex you make your model, the more prone it is to overfitting, which means it is capturing not only relevant information but also random noise, which doesn't tell you much. On the other hand, if you make your model too simple it is prone to underfitting, which means you are losing patterns that would otherwise tell you more about the structure of the data.
There are many ways to measure model complexity, but one of the simplest is to count the number of parameters (which only works if all models are part of the same class). If you increase the number of parameters (or, in this case, the number of clusters), your model is able to capture more patterns in the data, but also becomes more likely to overfit.
There are some methods to protect a bit against overfitting. One method is to partition the data into a training set and a validation set, run your algorithm on the training set, and then test whether the patterns you found are still meaningful on the validation set. You can also run your algorithm multiple times with different numbers of parameters and keep the largest number of parameters that still gives meaningful results on the validation set.
Another approach is to penalize the algorithm for having more parameters. In that case, the model produced by an algorithm with a larger number of parameters not only has to be better than one with a lower number, it has to be better by at least enough to make up for its larger potential to fit noise. That way you can increase the number of parameters until the improvement in the solution is no longer large enough.
There are some more approaches to the problem of choosing the number of parameters, but in general this is where the analysis really starts to become a hard task.
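To make the penalty idea concrete, here is a toy sketch: the within-cluster error always shrinks as k grows, so an added per-cluster cost (an arbitrary one here, in the spirit of AIC/BIC-style criteria) is what lets a smaller k win. The centre positions are hand-picked for illustration rather than fitted:

```python
# Compare hand-picked clusterings of 1-D data at k = 1, 2, 3. The raw
# within-cluster error keeps falling as k grows, but a made-up penalty
# of 2 per cluster makes k = 2 the winner.

def sse(points, centres):
    # total squared distance from each point to its nearest centre
    return sum(min((p - c) ** 2 for c in centres) for p in points)

points = [0.0, 1.0, 2.0, 9.0, 10.0, 11.0]
for k, centres in [(1, [5.5]), (2, [1.0, 10.0]), (3, [0.5, 2.0, 10.0])]:
    fit = sse(points, centres)
    print(k, fit, fit + 2.0 * k)   # raw error vs penalized score
```

Raw error falls (125.5, 4.0, 2.5) but the penalized score (127.5, 8.0, 8.5) is lowest at k = 2, matching the obvious two-group structure.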
Can you give an explanation of grayscale pixel segmentation? 2D?
You should open a GitHub repository with the code from all the videos; it would make it a lot easier to get it.
or you could just try it yourself
Great video!
is this what PowerPoint uses when removing the background of an image?
a year later though, GrabCut
why use matlab for anything when Python and R exist?
because for really large data sets (e.g. multiplying one n-dimensional matrix of about 10 GB with another) plain Python will compute for weeks or months, and you have to know Python. For MATLAB you need to know only the math and a few specific functions, so you won't have to write as much code...
As someone who doesn't do MATLAB or R but was wanting to pick up R (RStudio looks nice): R and MATLAB should be similar in terms of performance, right? R seems more programmer-oriented.
R has much slower native matrix libraries so can be much slower depending on use. You can upgrade them but it's non-trivial. There is also a lot of bad R code out there.
I wish "abandon matlab" was still on the internet. was an entertaining blog. R does lose when your data is larger than your ram though.
Matlab is absolutely great.
Full disclosure: I am an engineer.
Please more from him
2D image = A page.
3D image = A book.
4D image = A bookcase.
5D image = A library.
If you NEED a visualization.
Libraries are 5D confirmed.
You're not quite making it past a book. On the page, you can actually pick any point, making it a continuous space. However, you can't pick library 2.55, bookcase pi, book sqrt(2). This would become quite an annoyance when an optimal position is in between books/libraries.
Yet another case when books fail to impress me. #NOCtrlF
blacklistnr1 But book 2.5 = halfway into the third book. Bookshelf 2.5 = the middle of the third bookshelf. You would have to compress it all down into a continuous space, which would be a nightmare for humans to browse, but it gets the point across. To look at it as a multidimensional array named library:
library[][][][][]
It doesn't have to be continuous (complete) though... and you can do discrete optimization. I guess it depends on your definition of "dimension"... Vector spaces are built on top of fields, so "of course" they are complete, but somehow 2 dimensional matrices and arrays are also of 2 dimensions, but they are very discrete and very finite...
Nice explanation but found the title a bit misleading because segmentation by itself commonly refers to spatial segmentation. A better title would have said posterisation of images using k-means.
Yay, Mike!
could you use this to detect stego
Could one train an AI to calculate the timesignature and tempo of a music track?
How is this different than me just going into photoshop and limiting my image to n colors?
i think the result is the same, but in the matlab script you can look into how it works.
thanks. this could be used in my cross stitching projects.
There are many many algorithms, K-means is just one, there's a free plugin for image editors called XiQuantizer that has many algorithms like neuquant, which is neural-network based.
There are also simpler algorithms than K-means, after all people had to make gifs a couple of decades ago lol
Probably because in this way, the AI can learn and classify the objects and what the image is and important details. By just doing reduction, it wouldn't know as much.
thanks for the code!
Dr. M. N. Balakrishna here: I want to learn MATLAB as well as neural networks. Please advise me on this.
Waiting for the day some naughty researcher tech names a filter a 'Cluster FCK' ☺
He needs some more books D:
The Rubik's cube is solved \o/
Can you do video on mapreduce?
wonderful
I bet it works even better using HSL color model.
In theory, it should produce exactly the same results. HSB, RGB, CMYK, etc. are just ways of describing colours with numbers. Changing the colour space doesn't affect the way the colours relate to one another, or at least it ideally shouldn't.
You may describe colours in many ways, but RGB and HSL behave very differently under the same operations. Interpolation between red and blue in RGB, for example, will give you a nice pink; interpolation between red and blue in HSL will instead give you yellow, green, or cyan depending on how far you are between the two extremes. Maybe you could use k-means with a colour split into both RGB and HSL values to get an even closer approximation than a single colour model.
Edit: sorry, I call fuchsia pink. Pink is kind of a horrible colour name because it can mean very different things depending on culture. I guess "light purple" might be a better name.
Because the colors I mentioned before are between the other hues in the Hue component of the HSL model. This is why the color model you use matters when applying interpolation. Different color models are concerned with different aspects of color. So using different components to define color gives very different results when applying algorithms designed for RGB color space. If you look at hue being a rainbow circle you'll see red at 0 degrees and blue at 240 degrees. The middle 120 degrees is a green hue, while 60 degrees is yellow and 180 is cyan. Just look up the model it's easier to understand. My point is, you can't apply the algorithms designed for one color space to the components of another color space and expect them to produce the desired results. They'll be mathematically correct for the color space given but may not be correct to your perception of color mixing.
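The red-to-blue example above is easy to check with Python's standard colorsys module: averaging the components in RGB lands on a purple, while averaging the hue in HLS passes through green.

```python
import colorsys

# Midpoint between red and blue in RGB coordinates.
red_rgb, blue_rgb = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
mid_rgb = tuple((a + b) / 2 for a, b in zip(red_rgb, blue_rgb))
print(mid_rgb)                                 # (0.5, 0.0, 0.5): purple

# Midpoint between the same colours in hue (0-1 fraction of the circle):
# red is hue 0, blue is hue 2/3, so halfway is 1/3 of the way round.
red_hue, blue_hue = 0.0, 2 / 3
mid_hue = (red_hue + blue_hue) / 2
print(colorsys.hls_to_rgb(mid_hue, 0.5, 1.0))  # (0.0, 1.0, 0.0): green
```

Same two endpoint colours, different midpoints, which is exactly why the choice of colour space changes what an algorithm like k-means does.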
With k-means, the representation of the data actually plays an important role. As a textbook example, imagine two concentric rings with a large difference in diameter. While humans can easily separate the two clusters, k-means will fail horribly, because the means of both clusters are exactly the same. Now imagine you recode the data as the distance from the centre point and the angle of each point. As with HSL and RGB, both datasets describe the same data, just in different representations. However, the two clusters are now easy to separate using k-means; it actually only needs the distance variable, which differs between the clusters.
So while HSL, RGB and CMYK are all different ways of describing or encoding the same colours, the encoding matters a lot when k-means is used. This is actually true for most data analysis methods. A bad encoding of the data can turn a simple problem into an impossible one.
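The concentric-rings recoding described above fits in a few lines; with (x, y) points the ring means coincide, but after converting to (radius, angle) the radius alone separates the two rings (toy data, my own illustration):

```python
import math

def recode(points):
    # (x, y) -> (radius, angle): same data, different representation
    return [(math.hypot(x, y), math.atan2(y, x)) for x, y in points]

# A ring of radius 1 and a ring of radius 10, both centred at the origin.
inner = [(math.cos(t), math.sin(t)) for t in (0.0, 1.5, 3.0, 4.5)]
outer = [(10 * math.cos(t), 10 * math.sin(t)) for t in (0.5, 2.0, 3.5, 5.0)]

radii = [r for r, _ in recode(inner + outer)]
print(radii)   # four values near 1.0, then four near 10.0
```

In the recoded space a 1-D k-means on the radius column alone would split the rings cleanly.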