AR Sudoku Solver in Your Browser: TensorFlow & Image Processing Magic

atomic14

14,186 views

Published on 14 Jun 2024
Augmented reality running in your browser - no app required! By combining some simple image processing algorithms and machine learning we can create something pretty cool.
If you found this video interesting please hit the subscribe button on the channel - there will be follow up videos on more machine learning topics.
A long time ago I wrote an app for the iPhone that let you grab a sudoku puzzle using your iPhone's camera.
Recently, while working on my self-organising Christmas lights project, I realised that the browser APIs and ecosystem had advanced to the point where it was probably possible to recreate the system purely in the browser.
Self-organising lights: • Magic LEDs: Self-Organ...
Things like TensorFlow and TensorFlow.js make building the digit recogniser straightforward.
As you can see it works pretty well - you can try it out for yourself here: sudoku.cmgresearch.com/
And of course, all the code is in GitHub: github.com/atomic14/ar-browse...
Hopefully, this video will give you a good idea of how the system works and the thinking behind what I've done.
We're taking a feed from the camera on the device. This arrives as an RGB image. We're not really interested in colour, as we're working with printed puzzles, which are typically black and white.
So our first step is to convert from RGB to greyscale.
Convert to greyscale: 01:58
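As a rough illustration of this step (the project itself runs in the browser, so this Python is only a sketch), the conversion can be written as a weighted sum of the three channels. The function name `to_greyscale` and the Rec. 601 luma weights are assumptions for the example; the video may use a different weighting.

```python
def to_greyscale(rgb_rows):
    """Convert rows of (r, g, b) tuples (0-255) into rows of greyscale ints (0-255)."""
    # Assumed weights: the common Rec. 601 luma coefficients.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]
```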
We're using morphological operations for locating the puzzle - typically these work on black and white binary images, so our next step is to binarise our image.
Thresholding: 02:33
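One way to binarise a greyscale image is to compare each pixel with the mean of its neighbourhood, which copes better with uneven lighting than a single global cut-off. The sketch below assumes that approach; `adaptive_threshold`, `radius` and `offset` are hypothetical names and values chosen for the example, not taken from the project.

```python
def adaptive_threshold(grey, radius=2, offset=10):
    """Return rows of booleans: True where a pixel is darker than its local mean minus offset."""
    height, width = len(grey), len(grey[0])
    # Integral image with an extra zero row and column:
    # integral[y][x] holds the sum of grey[0:y][0:x].
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += grey[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    binary = []
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        out_row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            # Sum of the window, clipped to the image, in constant time.
            total = integral[y1][x1] - integral[y0][x1] - integral[y1][x0] + integral[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            out_row.append(grey[y][x] < mean - offset)
        binary.append(out_row)
    return binary
```

Here dark ink becomes True (foreground), so the printed grid and digits end up as the "blobs" used in the later steps.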
Next, we need to identify the blob that is the puzzle and work out the coordinates of each corner of the puzzle grid.
Locating the puzzle: 03:42
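A minimal sketch of this step, assuming the puzzle grid is the largest connected region of foreground pixels in the binary image. The corner heuristic (taking the pixels that are extreme along the two diagonals) is an assumption for the example and may differ from what the video does; `largest_blob` and `find_corners` are hypothetical names.

```python
from collections import deque

def largest_blob(binary):
    """Return the pixels (x, y) of the largest 4-connected region of True values."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for sy in range(height):
        for sx in range(width):
            if not binary[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[sy][sx] = True
            queue, blob = deque([(sx, sy)]), []
            while queue:
                x, y = queue.popleft()
                blob.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < width and 0 <= ny < height and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    return best

def find_corners(blob):
    """Estimate corners as the blob pixels that are extreme along the two diagonals."""
    top_left = min(blob, key=lambda p: p[0] + p[1])
    bottom_right = max(blob, key=lambda p: p[0] + p[1])
    top_right = max(blob, key=lambda p: p[0] - p[1])
    bottom_left = min(blob, key=lambda p: p[0] - p[1])
    return top_left, top_right, bottom_right, bottom_left
```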
Using these four corners we can compute a homography between our camera image and an "ideal" image of the puzzle.
Puzzle Extraction: 05:17
You can see more details on the algorithm used for this here: www.cse.psu.edu/~rtc12/CSE486/...
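To make the idea concrete, here is a sketch of computing a homography from four point correspondences by fixing the last matrix entry to 1 and solving the resulting 8x8 linear system. It is an illustration under those assumptions rather than the project's code; `solve_linear`, `find_homography` and `apply_homography` are hypothetical names.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def find_homography(src, dst):
    """Return the 9 entries (row-major, last entry 1) mapping four src points to four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and similarly for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs) + [1.0]

def apply_homography(h, x, y):
    """Map the point (x, y) through the homography h."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)
```

For extraction it is usually convenient to map from the ideal image to the camera image (`src` as the ideal square's corners, `dst` as the detected corners), so that every pixel of the ideal image can look up the camera pixel it should copy.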
Once we've got the square puzzle image we need to extract the contents of each individual cell. We examine the connected region inside the box and use the bounds of this to extract an image of the digit.
Digit Extraction: 06:34
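The bounding-box idea can be sketched as follows, assuming each cell has already been cut out of the binarised puzzle image. Looking for a seed pixel in a window around the cell centre (so leftover grid lines at the edges are ignored) is an assumption for the example; `digit_bounds` and `search` are hypothetical names.

```python
from collections import deque

def digit_bounds(cell, search=0.25):
    """Return (left, top, right, bottom) of the connected ink region near the cell centre, or None."""
    height, width = len(cell), len(cell[0])
    cx, cy = width // 2, height // 2
    rx, ry = max(1, int(width * search)), max(1, int(height * search))
    # Look for a seed pixel of ink in a window around the centre of the cell.
    seed = next(
        ((x, y)
         for y in range(max(0, cy - ry), min(height, cy + ry + 1))
         for x in range(max(0, cx - rx), min(width, cx + rx + 1))
         if cell[y][x]),
        None,
    )
    if seed is None:
        return None  # no ink near the centre: treat the cell as empty
    seen = {seed}
    queue = deque([seed])
    left = right = seed[0]
    top = bottom = seed[1]
    while queue:
        x, y = queue.popleft()
        left, right = min(left, x), max(right, x)
        top, bottom = min(top, y), max(bottom, y)
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and cell[ny][nx] and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append((nx, ny))
    return left, top, right, bottom
```

The returned bounds can then be used to crop the digit and scale it to whatever input size the neural network expects.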
We now run a neural network in the browser using TensorFlow.js. The network was trained with TensorFlow in an interactive Jupyter notebook, available at the GitHub link.
Training the neural network: 07:12
To solve the puzzle we use Donald Knuth's Dancing Links and Algorithm X - en.wikipedia.org/wiki/Knuth%2...
To do this we encode the puzzle as an exact cover problem.
Solving the puzzle: 11:44
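To show what the exact cover encoding looks like, here is a compact sketch. Each candidate "digit d in row r, column c" satisfies four constraints: the cell is filled, the row has d, the column has d and the box has d. Note that this example runs Algorithm X over plain dictionaries and sets rather than the linked-list Dancing Links structure, so it illustrates the same search but is not the project's implementation; all function names are hypothetical.

```python
from itertools import product

def exact_cover_rows():
    """Map each candidate (row, col, digit) to the four constraints it satisfies."""
    rows = {}
    for r, c, d in product(range(9), range(9), range(1, 10)):
        b = (r // 3) * 3 + c // 3
        rows[(r, c, d)] = [("cell", r, c), ("row", r, d), ("col", c, d), ("box", b, d)]
    return rows

def select(columns, rows, key):
    """Choose a candidate: remove the constraints it satisfies and all conflicting candidates."""
    removed = []
    for constraint in rows[key]:
        for other in columns[constraint]:
            for other_constraint in rows[other]:
                if other_constraint != constraint:
                    columns[other_constraint].remove(other)
        removed.append(columns.pop(constraint))
    return removed

def deselect(columns, rows, key, removed):
    """Undo select() in reverse order."""
    for constraint in reversed(rows[key]):
        columns[constraint] = removed.pop()
        for other in columns[constraint]:
            for other_constraint in rows[other]:
                if other_constraint != constraint:
                    columns[other_constraint].add(other)

def search(columns, rows, solution):
    """Algorithm X: always branch on the constraint with the fewest remaining candidates."""
    if not columns:
        return True
    constraint = min(columns, key=lambda k: len(columns[k]))
    for key in list(columns[constraint]):
        solution.append(key)
        removed = select(columns, rows, key)
        if search(columns, rows, solution):
            return True
        deselect(columns, rows, key, removed)
        solution.pop()
    return False

def solve(puzzle):
    """puzzle: 81-character string, digits for givens and '0' or '.' for blanks."""
    rows = exact_cover_rows()
    columns = {}
    for key, constraints in rows.items():
        for constraint in constraints:
            columns.setdefault(constraint, set()).add(key)
    solution = []
    for i, ch in enumerate(puzzle):
        if ch in "123456789":
            key = (i // 9, i % 9, int(ch))
            if any(c not in columns or key not in columns[c] for c in rows[key]):
                return None  # the givens contradict each other
            solution.append(key)
            select(columns, rows, key)
    if not search(columns, rows, solution):
        return None
    grid = [["0"] * 9 for _ in range(9)]
    for r, c, d in solution:
        grid[r][c] = str(d)
    return "".join("".join(row) for row in grid)
```

Calling `solve` with an 81-character string returns the completed grid in the same format, or `None` if the puzzle has no solution.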
Finally, we can display the results back on top of the camera feed to give us our Augmented Reality display.
Displaying the results: 15:36
I hope you've enjoyed this video - please hit the subscribe button and leave any thoughts you might have in the comments.
---
Want to help support the channel? I'm accepting coffee on ko-fi.com/atomic14
Science & Technology

Comments • 43

@masteronepiece6559 3 years ago ⁺³
Best Description section for a video on YouTube. Great work and funny to do.
@TheRealKitWalker 3 years ago ⁺²
That demo at the beginning was soo sooo impressive 😍👏👏🍫🍫🍫
@Ogottm 3 years ago ⁺¹
Really cool and impressive! Thank you for the video!
@imadlakehal2103 3 years ago ⁺¹¹
So Sick! Thanks for this awesome presentation .. well documented in a clear way, I'll give 10/10 💪
@atomic14 3 years ago ⁺²
Much appreciated!
@imadlakehal2103 3 years ago
@atomic14 compared to what is shown in the video at *7:52*, there are many elements missing in the repo, like image_to_font_map.txt
@atomic14 3 years ago ⁺²
@imadlakehal2103 I'll take a look.
@atomic14 3 years ago ⁺²
The files should be in the "tensorflow" subdirectory. image_to_font_map.txt is there, though actually it shouldn't have been committed, as it is generated by the code in "generate_training_data.ipynb". Before running the code in that notebook you will need to expand the zip files in the "fonts" folder.
I've updated the README to add some more instructions.
Let me know if you have any problems.
@imadlakehal2103 3 years ago
@atomic14 What about the folder *lib/*, is it also generated after a script execution? I suppose that *log/* will be generated after running tests. In addition, the folder *all_data/* is supposed to be filled directly with pictures, whatever the structure inside?
@evgeniiaveselova100 1 year ago
Thank you very much - this project can literally be an AR beginner tutorial
@123aniruddhsiddh 3 years ago ⁺¹
Looks amazing
@enudemejonathan4057 3 years ago ⁺¹
Chai(a wow expression in pidgin English) ! This is so smart. I'm def doing this.
@meassinal 2 years ago
Already hit subscribe :) Really awesome for the tactics and I can't wait to give it a try.
@FF7824 3 years ago ⁺¹
Fantastic explanation and design!
@atomic14 3 years ago
Thank you!
@justins4996 2 years ago
Fantastic video. Keep making these
@NAVAP_IAS 3 years ago ⁺²
Really awesome. Thank you🤝🤝🤝
@atomic14 3 years ago
Thanks!
@valdezm_com 3 years ago ⁺¹
You are awesome and this is inspiring! Thanks! Time to start learning. -computer scientist with no AI background
@atomic14 3 years ago
There are some great examples out there - and a lot of good training resources. It's great fun!
@somnathpaul2020 3 years ago ⁺²
I found 3 dislikes!! What did they expect not sure... good to know!!... Anyway, I just subscribed and would expect to learn a lot from you in the future...Thank you so much, sir!!
@atomic14 3 years ago
Thanks for the sub!
@raguaviva 3 years ago ⁺¹
thanks!
@atomic14 3 years ago
No problem!
@paullouppe9947 3 years ago
But . This is just awesome !! I want to do one myself for suguru!
@atomic14 3 years ago ⁺¹
Definitely possible - you'll need to change the code to handle the number of boxes in the Suguru puzzle and update the solver.
@mohd.parvez2719 3 years ago
Awesome presentation. I am also working on sudoku solver android app but my model accuracy isn't so good. Will you please share your dataset which you used for training your model and give me some suggestion how to perform image processing task. Thank you.🙂🙂
@atomic14 3 years ago ⁺¹
Hi there, you can find the fully trained model in the GitHub repo and the training data can be generated using the python notebooks. Feel free to re-use the code as you need. If you get stuck then please raise an issue on the repo and I'll see if I can help.
@brycem.3450 7 months ago
Could this be applied to Brilliant Labs Monocle?
@JasonMayes 3 years ago ⁺⁴
Hey there! This is Jason from the TensorFlow.js team here at Google - would love to talk to you about this for an upcoming show and tell session if you would be interested in showing your work to the TFJS community?
@atomic14 3 years ago ⁺²
Sure, that sounds great - you can contact me via the channel page or chris _at_ cmgresearch _dot_ com
@JasonMayes 3 years ago
@atomic14 Perfect, will drop you an email sometime!
@ysimonx 3 years ago ⁺¹
thank you for sharing this awesome video !! (my comment will be flagged as "positive" by TF)
@user-jr4if2rm4d 2 years ago
Request: code that can be run in a Jupyter notebook
@Shakespeare1612 3 years ago
Great, great stuff! But, this is just a silly game. What if we applied this to solving Maths worksheets? = Homework DONE! What if we could get it to balance chemical formulas? = material science BOOSTED! Or... if you want to stick with games, my mother always loved word search puzzles. This could, with modification, be used to super-impose the capsules onto a word search puzzle, also solve cross word puzzles. Could it be taught to recognize positions on a chess board? Yes, I think it could.
@atomic14 3 years ago
There's definitely a chess cheat program in there somewhere :) I do sometimes wonder what the future offers us. Augmented glasses/contact lenses will change us in a very fundamental way. From there it's a very small step to implants and, before you know it, we'll all be borgs...
@djtomoy 11 months ago
Make one that can do the crossword
@Khujandiho 3 years ago ⁺¹
:); ridiculous.
@ChadFaragher 3 years ago
Forgive me for teasing you, but it's funny that you can do all this technical wizardry yet you keep mispronouncing sudoku as "suduko".
@atomic14 3 years ago ⁺¹
Haha, no problem, I can never remember how it's supposed to be pronounced :)
@fernandobravo650 1 year ago
Big deal
