Structure from Motion Octocopter - Computerphile

Computerphile

Views: 56,445

Published on 4 Feb 2016
Thanks to Audible for supporting our channel. Get a free 30 day trial at www.audible.com/Computerphile
Creating 3D models with an Octocopter, a camera and some custom software. Christian Mostegel, Research Assistant at TU Graz in Austria explains some of the technology behind the 3D Pitoti Project.
More about 3D Pitoti: bit.ly/computerphile_rockscanner
Finite State Automata: • Computers Without Memo...
CPU vs GPU: • CPU vs GPU (What's the...
3D Rock Art Scanner: • 3D Rock Art Scanner - ...
Brain Scanner: • Brain Scanner - Comput...
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 137

@IDontDoDrumCovers 8 years ago ⁺³⁶
the next google earth is going to be crazy haha, just imagine like 10,000 of these flying around cities taking pictures of everything in swarms
@GtaRockt 8 years ago ⁺⁹
+Social Experiment and when they play the Ride of the Valkyries all the time
@zoranhacker 8 years ago
that could happen lol
@BoJaN4464 8 years ago ⁺¹
+Social Experiment Google already does this with satellite images, no drones required!
@jmac217x 8 years ago
+Social Experiment That's a scary cyberpunk future you just made me imagine. I'm kind of wishing it to happen...
@frollard 8 years ago ⁺¹
+Social Experiment Not sure if you've seen this with current gen google maps/earth; in many cities it's all scanned in 3d already. It's creepy since it can look below obstructed side views of houses (ie trees)
@ZadakLeader 8 years ago ⁺²³
The sounds of forks and metal things in the background...
@seanski44 8 years ago ⁺¹²
Actually a lot of the noise was people doing demonstrations of rock carving...
@linkVIII 8 years ago
+Sean Riley sounds very restaurant like
@seanski44 8 years ago ⁺²
+linkviii yeah people were clearing lunch but the tap tap tap is someone chipping away at a piece of rock
@NevaranUniverse 8 years ago ⁺¹
+Vlad Ţepeş hungry developers are hungry!
@mogami4869 8 years ago
I like that you are actually recommending a book in the end each time, as compared to many other channels who are sponsored by Audible and just tell me to follow their link..thank you!
@hcblue 8 years ago
Such a cool device / technology. Thanks for covering both the theoretical / low-level subjects, e.g., algorithms, and more practical applications / projects, Sean!
@JohnDoe-pn9ml 8 years ago ⁺²
Videos like this make me wish there was an engineerphile.
@titaniumdiveknife 8 years ago ⁺¹
Genius! Wish my German and programming were half as good as his English.
@rockosigafredi 8 years ago ⁺²⁵
Who here is also from Austria? :-D
@WolframHofmeister 8 years ago
AUT rules! 😁
@Anvilshock 8 years ago
+rockosigafredi Schluchtenscheißer ...
@MayhemUniverse 8 years ago
Servus!
@malaysiaszsz.hiphop_repres6278 6 years ago
yuck ugly country shitty people
@GroovingPict 8 years ago ⁺²⁹
could you maybe film in a place with even more background noise next time? Thanks. I was almost able to focus on what he was saying in this one, and we can't have that now can we.
@Computerphile 8 years ago ⁺⁵²
Yeah if the world was perfect I'd film all interviews on a sound stage with a full camera crew and separate sound crew and pay the contributors so they have the chance to set up their equipment...
@GroovingPict 8 years ago ⁺⁴
+Computerphile Yep, cause perfect studio conditions is definitely what I meant.
@unverifiedbiotic 8 years ago ⁺¹¹
+GroovingPict Have you any idea how hard it is to find a quiet spot in a public space when doing an interview? I think that the microphone they gave the guy was doing a pretty good job all things considered. You can edit out some of the noise that stands out a lot and doesn't overlap with the audio you want to keep, but this is almost impossible if the entire audio is polluted and takes A LOT of time in each case and the end result may be even more distracting than the noise itself (audio artifacts). In the end, recording an interview and editing it is often more difficult than the viewer can imagine and you can really screw up your upload schedule if you don't teach yourself to ignore such minor issues.
@mindfulmike8612 8 years ago ⁺⁶
+GroovingPict People have to work and if they're going to interview in the place where work is actually being done there is going to be background noise. Quit being such a whiny little brat and appreciate the FREE CONTENT they're giving you.
@seanski44 8 years ago ⁺⁶
+GroovingPict ah so you weren't talking about a perfect world? Do you think I chose this location? Get real!
@TheVladBlog 8 years ago
This is so good! We are slowly coming towards computer visual understanding. These guys have already developed a "basic" system for the AI to differentiate between different kinds of objects in its sight.
@jmac217x 8 years ago ⁺⁷
Hey Sean would you consider using a monopod stand for your videos? I know it's like the trademark to have that shaky cam, but it's a bit unsettling at times, especially when the camera is pointing a single direction most of the time anyway. I don't want to interrupt your seemingly quick workflow, but something like that could be easily maneuvered to adjust for those close up shots you get, and something with only a single leg wouldn't be too cumbersome to rotate or quickly move. Just a suggestion cause your videos rock.
@Computerphile 8 years ago ⁺⁷
Will consider, I try to use tripod most of time these days, only go handheld when situation calls for it or if I haven't got access to my tripod (it happens) >Sean
@jmac217x 8 years ago ⁺³
Awesome. I knew that you must have used something for those videos with Professor Brailsford, but thought I'd mention it anyway. I really love those videos, for more than just their video quality :3
@JamesJansson 8 years ago
You should totally join computerphile and numberphile to cover 1) computer algebra systems and then 2) computational theorem provers.
@jopaki 8 years ago
effin exciting stuff here man! wow.
@NikiDaDude 8 years ago ⁺⁴
On a related topic I'd really like to see a video on the software Google use to generate the detailed topography in Maps and Earth.
If you look at most major cities you'll see that all the buildings, trees and even small objects like cars have been turned into a polygon mesh and textured.
@klaxoncow 8 years ago ⁺¹
+Nick Yes, I'd love to know exactly how they've managed to do that.
It covers the entire planet (correction: planets), so it had to be an automated process.
But the textures show us the sides of objects - so it can't be coming from the satellite data alone, as satellites can only see things top down, they don't see the sides of things.
My secondary thought was that maybe they've cross-referenced it with their Street View images. The satellites see from above and Street View sees it from the ground.
But the thing is that Street View doesn't have full coverage of everywhere. Most of it is obtained by driving a camera car around, but that means it only covers what can be seen from the road. And not all roads are covered.
As you can see when you try to drop the little yellow man on the map to trigger Street View, which highlights which roads have Street View coverage, there's plenty of places where the Google car never goes - because there are no roads, or the roads are private roads, or whatever.
I thought "ah, they've cross-referenced it all with Street View and altitude data to automate a 3D map" - although, fair play, that's still one hell of a computational challenge unto itself not to be sniffed at, isn't it? - but it seems to have coverage well beyond what a satellite could ever see or Street View has coverage of.
So, how the hell did they actually do it?
And how did they do it so well? As I've not yet spotted a single mistake going around looking at this all over the planet. So it must be very good quality source data, as the automated process just ain't getting it wrong anywhere.
So, yeah, if Computerphile could find a Google Maps engineer to explain that one to us all, I'd definitely be interested to know, as they have seemingly done the impossible there!
@manhaxor 8 years ago
+Nick Yasutaka Furukawa is a Google programmer who created an MVS (multi-view stereo) algorithm that's used to make the reconstructions in Maps and Earth. I would like to see a video that explains a few of the different methods of 3d reconstruction.
@unaliveeveryonenow 8 years ago
+KlaxonCow Wrong, satellites can see things at a slight angle, but they have fixed orbits. So at least you would need images from multiple satellites. But how would they solve the problem of tall buildings covering small objects at their bases?
@stheil 8 years ago ⁺¹
+cyberconsumer I seem to recall that they don't only use satellite images (both straight down and at an angle) but also aerial photography from planes (and/or helicopters). And those can cover a much shallower angle, obviously.
@MrSabba81 7 years ago
Hi thanks for sharing this. I am wondering if changing the perspective it would also be possible to use it especially for vegetation instead of excluding it: do you think we could estimate biomass (volume) of trees, shrubs, etc. with some adaptations? Thanks
Simone
@yomaze2009 8 years ago
It would be interesting to see different forks in the learning algorithm that focus on detecting properties of "challenging objects" and see how quickly it "decides" to use alternate techniques to get full coverage of the object. I imagine crowdsourcing the determination of what areas of a 3d model "need more work" by offering the 3d model alongside the video taken by the device online. Also include the ability for the "crowd" to slice into the 3d model to isolate the "least defined" component of the feature. The system could then attempt a solution and provide it back to the crowd for further analysis. Could be made to be fun as a game of sorts!
@Adamantium9001 8 years ago ⁺²
How does the vehicle keep track of its own position in order to report where each picture was taken from?
@xell2k 8 years ago
+Adamantium9001 it uses GPS basically. The 3D reconstruction and camera motion from the images are then automatically estimated using the structure-from-motion pipeline (without GPS), and then simply "moved" to the real-world coordinates using the rough GPS coordinates or ground control points. While flying, the vehicle localizes itself using GPS only.
@aritakalo8011 8 years ago
+Manuel Hofer GPS or reference markers. You can see the markers on the table or in the field video. GPS is probably a little bit inaccurate for this job. GPS-only UAVs have a tendency to drift around somewhat while trying to hover precisely, since GPS is not really pinpoint accurate.
Also since they fly under overhangs etc. it might block GPS, so they probably use those geo reference markers to get a local reliable reference.
Some UAVs do this by themselves to some extent. They have downward-looking optics and/or laser scanners to scan the ground below to get an instant local reference for hover holding. Of course it is relative and not absolute, hence the marker plates. From those you get an absolute reference with an optical scanner (aka camera).
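Editor's note: the step described above, where a reconstruction is "moved" to real-world coordinates using ground control points, can be sketched as fitting a similarity transform (scale, rotation, translation). The sketch below uses Umeyama's closed-form least-squares method, which is a standard choice but is an assumption here; the thread does not say which method the TU Graz pipeline uses. The function name `fit_similarity` and all the numbers are hypothetical.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t so that dst ~ s * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    # Cross-covariance between the centred point sets
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Hypothetical marker positions in the arbitrary SfM model frame
rng = np.random.default_rng(0)
model_pts = rng.normal(size=(6, 3))

# Hypothetical ground truth: scale 2.5, 30 degree rotation about z, an offset
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([100.0, 200.0, 50.0])
world_pts = 2.5 * model_pts @ R_true.T + t_true  # surveyed marker positions

s_est, R_est, t_est = fit_similarity(model_pts, world_pts)
print(s_est)
print(t_est)
```

With noise-free markers the fit recovers the scale, rotation and offset that were used to generate the data. With real GPS or surveyed markers the result is only a best fit, and at least three non-collinear markers are needed.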
@ozdergekko 8 years ago ⁺¹
Yeah, fellow Austrian *and* Austrian institute of technology (technische Universität Graz)
@Aragorn450 8 years ago
This sounds a lot like what senseFly does, but at higher resolution and with more autonomy. Their system is used for mapping cities some, plus keeping track of agriculture growth and all sorts of other things. I could see them approaching these guys to buy their technology for sure.
@yomaze2009 8 years ago
Also, hardware/software side this could benefit greatly from the work being done on the analysis of the polarity of reflected light to determine the absolute color, texture, reflectivity, etc. of the individual objects.
@kensmith5694 8 years ago
Add some really good magnetic sensors and in a lot of cases you could image what is hidden by the outer surface.
@EtrielDevyt 8 years ago ⁺¹
This is gonna be great for location building for games!
@manhaxor 8 years ago ⁺¹
+EtrielDevyt there's more efficient ways. Like a 3d scanner that takes color data as well as position data. Structure from motion is the best method for making a somewhat decently detailed reconstruction of a large area, but with flaws and distortions included.
@yourfilmindustry 8 years ago
They're not rocks, they're minerals!
@Systox25 8 years ago
TU Graz? nice!
@Encypruon 8 years ago
What about moving things? Like leaves, trees moving in the wind, doors, cars, animals, humans, wind turbines in the background, water and changing light conditions (clouds, the sun moving during the process, flickering synthetic light...). And what about reflective surfaces and refraction? Some surfaces don't look the same from different angles...
Can any of these things be handled reliably? I imagine it to be very hard to construct meaningful models with things like these in the scene.
@LastofAvari 8 years ago
Cool stuff :)
@NeilRoy 8 years ago
Fascinating idea. I wonder if such a system could be used on planetary exploration, like say Mars. Or even underwater etc... kewl stuff anyhow.
@devjock 8 years ago
So this octocopter is basically doing what people with complete blindness in one eye are doing? Rocking back and forth to see what background areas of a picture are being exposed / obscured behind objects in the foreground? Yeah that takes a lot of computing power. I'd imagine the algorithms used to reconstruct 3d geometry are modeled on the way human brains work to accomplish the same task (in the case of humans, mostly based on mapping out walking surface, obstacle avoidance, and impact threat assessment). Is it done with dense neural networks? How would those be trained? Or is it a self-learning network? Something completely different?
Also, given the fact that it would be trivial to have that octocopter carry one more camera (for stereoscopic image acquisition), what was the reasoning for that not getting implemented? I'd imagine the acquisition phase would be way more streamlined if the octocopter had 3d imaging in place to begin with..
So many questions!
@hanniffydinn6019 8 years ago ⁺¹
How does this compare to a laser scanner attached to a drone ??????
@Durakken 8 years ago
Is the reason it has to be a static object due to processing power or some algorithmic problem? I don't see why a motion prediction algorithm couldn't be incorporated into that considering that a lot of animation is based on prediction algorithms now other than the render time for those things, but usually that is due to the quality of the render and not the animation I think.
@xell2k 8 years ago
+Durakken basic Structure-from-motion only works for static scenes. You can only solve the optimization problem behind it when the 3D points "do not move". Otherwise it would get much more complicated. However, if most of the scene is static and a few objects are moving it usually works too. Then the moving things are usually just cancelled out automatically. There is of course software to reconstruct a dynamic scene, but here you usually have the assumption of a static camera, and if not, the algorithms are not ready to be used outside of the safe lab environment (as far as I know)
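Editor's note: the static-scene assumption in the reply above can be illustrated with two views. The sketch triangulates a 3D point from its two image observations and measures the reprojection error: a point that stayed put reprojects almost exactly, while a point that moved between the two shots cannot be explained by any single 3D position, so it leaves a large residual and can be rejected as an outlier. The camera matrices, the point and the helper names (`project`, `triangulate`, `reprojection_error`) are assumptions made for illustration. The linear (DLT) triangulation used here is a textbook method, not necessarily what the pipeline in the video uses.

```python
import numpy as np

def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]  # null-space direction = homogeneous 3D point
    return X[:3] / X[3]

def reprojection_error(P1, P2, x1, x2):
    """Largest pixel distance between the observations and the reprojected point."""
    X = triangulate(P1, P2, x1, x2)
    return max(np.linalg.norm(project(P1, X) - x1),
               np.linalg.norm(project(P2, X) - x2))

# Hypothetical intrinsics and two cameras one unit apart along x
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point = np.array([0.5, 0.2, 5.0])

# Static point: both cameras observe the same 3D position
static_err = reprojection_error(P1, P2, project(P1, point), project(P2, point))

# Moving point: it shifted upwards before the second picture was taken
moved = point + np.array([0.0, 0.3, 0.0])
moving_err = reprojection_error(P1, P2, project(P1, point), project(P2, moved))

print(static_err, moving_err)
```

One caveat: with only two views, motion that happens to lie along the epipolar line still triangulates cleanly (to a wrong depth), which is one reason practical pipelines rely on many overlapping views.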
@RAHUDAS 4 years ago
can anyone tell me what ML algorithm is used along with the photogrammetry in this demonstration ??
@NizarElZarif 8 years ago
I was wondering, does anybody know the name of the algorithm used? Like, is there a paper to read or a tutorial to watch?
@leopoldarkham7017 8 years ago ⁺¹
+Nizar El-Zarif Searching for Pointclouds and Poisson surface reconstruction will get you going in the right direction
@NizarElZarif 8 years ago
Leopold Arkham Thanks
@illusivec 8 years ago ⁺²
So photogrammetry and Agisoft PhotoScan?
@manhaxor 8 years ago ⁺⁹
+flanker It seems like he's more selling the semi-autonomous process of the UAV deciding to take more pictures of harder to reconstruct areas.
@brummii 8 years ago
+flanker I think that the important function would be the UAVs sharing information between each other and "crowdsourcing" the creation of a 3d map, which all of them can use to navigate simultaneously.
@manhaxor 8 years ago
brummii That's an interesting idea. I've only ever seen realtime 3d reconstruction for single devices, like google's self driving car, and a few pieces of mining equipment. I'm sure there's more, but I have yet to see multiple devices use the same reconstruction.
@unvergebeneid 8 years ago
Automatic photogrammetry ... this might be great for indie game developers! Of course it might also save lives by predicting avalanches and landslides and boring stuff like that.
@0MVR_0 8 years ago
Who were the first to introduce photogrammetry to machine learning?
@TheKirkster96 8 years ago
What if the intelligence could identify the position and dimensions of some vegetation (like a tree) and then just generate a model to fill in that space and give the viewers a representation of "hey there is some tree there, but we can't scan each branch and every leaf into an accurate model."
@THEMATT222 8 months ago
Noice 👌
@frigeragmady9625 8 years ago ⁺¹
blaming background noise, distorting the vocals of an intellectual while he speaks intelligent-stuff (i am dumbing myself down to level with some of these commenters) is kinda like not wanting to blame yourself for not understanding the intelligent-stuff
@unaimb 8 years ago
That’s photogrammetry with dense pointcloud Poisson mesh reconstruction, it’s been used for matchmoving in the film and TV industry for about 6-7 years now. It seems interesting that they chose to make the whole software from scratch instead of just the drone driving bit… I guess most matchmoving software is not really open when trying to expand its capabilities in such a way.
@xell2k 8 years ago ⁺¹
+Unai Martínez Barredo Exactly, there are a lot of SfM pipelines out there (freeware and non-freeware) that work basically in the same way. However, many of them are closed-source which makes it uncomfortable to extend them. since we are a research institution, we aim at developing new tools, such as the view planning you see in the video. Our basic 3D reconstruction software has been developed since about 2006 or 2007, and we are basically maintaining it and extending it with further apps and functionalities. And since we have the full source code, this is ideal for research
@unaimb 8 years ago
Nice :)
@espalorp3286 8 years ago ⁺²
enemy uav spotted
@iseslc 8 years ago
+Proteus Battlefield player spotted
@tisimo123 8 years ago ⁺¹
+iseslc or any call of duty game after 4
@iseslc 8 years ago
tisimo123
That's right! I haven't played those, though... only BF games.
@chris24hdez 8 years ago
It's called 3D Photogrammetry
@serkantan2951 4 years ago
3D reconstruction I believe would be a broader definition. Aside from mere photographs, there are other ways to measure the depth in an image; even though they don't really discuss those methods, they're probably taking them into account, such as shading, illumination, defocus, texture.
8 years ago ⁺¹
Photogrammetry is great for game development, too.
@gen157 8 years ago
Early, but not first. Not that I care, just wanted to tell others about it.
About the video: Non-English speaker speaking English makes it a little hard when he isn't fluent enough. I understood enough, but needs to understand sentence structure a little bit.
@cavalrycome 8 years ago ⁺⁷
+Gen15
I can see why some viewers might have trouble with the speaker's accent, but his grammar is very close to perfect.
@JapTut 8 years ago ⁺²
+Gen15 and the background noise doesn't help either.
@NeatNit 8 years ago ⁺⁵
Is it really impossible for you to drop the Audible sponsoring? It's getting really annoying.
@quakquak6141 8 years ago ⁺²⁰
+NeatNit it's at the end of the video, less annoying than this is impossible, I don't see the problem
@NeatNit 8 years ago ⁺²
+quak quak I guess you're right... It just kinda sickens me that they have to pretend to be excited and genuinely interested in audible when it's obvious that they're paid to say that.
@trucid2 8 years ago ⁺¹¹
+NeatNit It bothers you when others make money from their work? Should they be working for free for your sake?
@NeatNit 8 years ago ⁺²
+trucid2 not what I meant, see my previous reply
@AustrianAnarchy 8 years ago ⁺³
+NeatNit Maybe they are genuinely excited about Audible anyway and the sponsorship made them positively giddy?
@Hwyadylaw 8 years ago ⁺¹
This is just me being a language geek, and not relevant to the video, but I feel like the ones with four rotors should be called Tetracopter instead of Quadrocopter.
@Ptolemusa 8 years ago
+McDucky Indeed. ^^
@TheAllardP 8 years ago ⁺²
+McDucky
It's the same thing. Quad is Latin and Tetrad is Greek.
@Hwyadylaw 8 years ago ⁺¹
Philippe Allard
I wouldn't call myself a "language geek" if I didn't know that.
It's Téttares (τέτταρες) or Téssares (τέσσαρες) in Ancient Greek. It's Quattuor in Latin.
@jazzpi 8 years ago
+McDucky Why?
@antivanti 8 years ago ⁺³
+McDucky So your preference for Tetracopter stems not from knowledge of languages but from an arbitrary preference for Greek over Latin? =)
@TomMinnick 8 years ago
I've heard it called "Photogrammetry" before, but this is the first time I've heard it called "3d reconstruction"
@GroovingPict 8 years ago ⁺¹
"take this images"...
@IstasPumaNevada 8 years ago ⁺¹
+GroovingPict This comment coupled with your other one is pinging my troll-detection algorithms.
@Anvilshock 8 years ago
+GroovingPict If you do the equivalent of squinting with your ears really hard, you can almost hear he actually says "these" really fast ...
@Anvilshock 8 years ago ⁺¹
GroovingPict Yep, it's the dumbest thing, and - Congratulations - YOU got it! Here's your Golden Dunce Hat prize!
@ivesennightfall6779 8 years ago
I saw Ubuntu /o/
@Diggnuts 8 years ago ⁺⁹
I hate the term "structure from motion". It makes no sense.
@manhaxor 8 years ago
+Diggnuts Well it's usually used to reconstruct a scene from video. I agree that it's strange to hear it used when the source material is still images.
@antivanti 8 years ago
+Diggnuts "Parallax 3D reconstruction" might be a more correct term? Or just photogrammetry.
@Diggnuts 8 years ago ⁺³
Anders Öhlund I prefer photogrammetry as that is quite simply precisely what it is!
@jmac217x 8 years ago
As soon as he said _intelligence_ i cringed a little. I don't want that to become the next buzz word. An algorithm like that does not equate to intelligence in my opinion. Everything about this guy seems off to me.
