Deep Learning Daily
United States
Joined Dec 23, 2023
Embark on a daily discovery of the frontiers of artificial intelligence with Deep Learning Daily, your dedicated source for insights into machine learning, neural networks, and the power of AI. Our channel is an extension of the #DeepLearningDaily newsletter, delivering the same high-quality, educational content with an engaging twist. Dive into tutorials, expert interviews, and thought-provoking discussions designed to bring deep learning to life. Whether you're an industry professional, a student of the sciences, or simply AI-curious, Deep Learning Daily is your daily dose of AI enlightenment. Subscribe to join our community of forward-thinkers, innovators, and lifelong learners shaping the future of technology.
Uncovering the Power Consumption of Large Language Models
Why do large language models consume so much power? What impact does this have on our environment?
#AI #powerconsumption #AIenergy #AIenvironment #LLMS #largelanguagemodels
Views: 82
Videos
Watch As Tesla FSD Drives Me Off The Road. (During Rush Hour.) (In the Fast Lane.)
165 views · 21 hours ago
Watch as Tesla’s Full Self-Driving system unexpectedly veers off the road. I show the footage from a couple of different angles so we can figure out what is to blame. (I have a theory.) Learn more about why the Feds have opened an investigation into FSD. #TeslaFSD #FullSelfDriving #TeslaAutopilot #SunGlare #SelfDrivingCars #TeslaSafety #AutonomousDriving #CollisionAvoidance #TeslaTechnology #AI...
Advanced Voice From ChatGPT- Switching Voices- The Pirate Voice Test
185 views · several months ago
In this podcast, ChatGPT Advanced Voice is my co-host. I ask Advanced Voice a series of interview questions while asking it to switch back and forth between different voices and languages. (Talking like a pirate gets old pretty quickly, but talking like Yoda was a lot of fun.) Stay tuned at the end to hear Advanced Voice speak Icelandic. #ChatGPT #AdvancedVoice #OpenAI #Talklikeapirate #chatgpt...
Testing OpenAI's Advanced Voice Model: Real-World Performance in a Noisy Home Environment
471 views · several months ago
In this video, I put OpenAI’s new Advanced Voice model through a real-world test in a home environment filled with everyday background noise. From packing for an event to a playful red cattle dog making sounds in the background, the AI model easily handles it. 🎙️ Just a note: Advanced Voice Mode does have a "Mute" button (for the human), and I briefly used this at one point to mute the noise on...
Using ChatGPT Voice (GPT4o) for a "job" Interview- testing the new model. "Sky" voice vs. "Ember."
4.5K views · 5 months ago
I test the new GPT-4o omnimodel, using it as a personal coach to prep for a mock job interview. I test the model using the same tests the OpenAI developers tried in their May 13th launch-day video. My conclusion? Not all features demonstrated by OpenAI on launch day are working yet. However, even with this reduced functionality, GPT-4o is an EXCELLENT coaching tool. So, when can we expect th...
"Talking Faces" Technology by Microsoft- I Put It To The Test Versus a Human
1.4K views · 7 months ago
Today we explore VASA (Visual Affective Skills Animator), a groundbreaking AI technology from Microsoft. This technology has not been launched yet. VASA can create incredibly lifelike talking face videos from just a single photo and an audio clip, pushing the boundaries of what's possible with AI-generated avatars. Note: I did this in my makeshift office (my son's old bedroom) and used a green ...
How I learn anything using ChatGPT Voice
244 views · 7 months ago
Dive into the world of hands-free learning with the voice version of ChatGPT. See how ChatGPT’s voice feature becomes my personal AI tutor while I do tasks I enjoy. Doesn't matter if your hands are full of paint, or you're stirring a bowl of brownies. You can still have a full-on conversation with the voice version of ChatGPT. I demonstrate how in this video. #AIExplained #artificialintelligenc...
AI and the Future of Medicine (Will AI Take Away Healthcare Jobs?)
31 views · 8 months ago
In this enlightening episode, we delve into the crucial role AI plays in healthcare, from diagnostics to drug discovery, and how healthcare professionals can embrace these advancements to enhance patient care and secure their roles in the future. AI technology is not about replacing healthcare workers but augmenting their skills and supporting their invaluable work. #ArtificialIntelligence #Hea...
How to Work with AI (and Keep Your Job As a Writer)
14 views · 8 months ago
My inside secrets on how I use AI. I share my best tips in this video. Check out the #deeplearningdaily newsletter for a written version of the information covered in this video. #EmbraceTheFuture #AIAssistedWriting #WritersOfLinkedIn #DeepLearningDaily #WritingSurvivalGuide #AIPodcast #ThrivingWithAI #EthicalAI #HumanTouchWriting #CuriousWriters #KeepLearning #KeepWriting
How I Created A Custom "GPT" by ChatGPT To Do My Daily Work. It's AWESOME. You Can Do It, Too.
202 views · 8 months ago
I have future-proofed my work. AI has become an indispensable tool in my creative process. In this video, I share exactly how I did it. #DeepLearningDaily #AIAutomation #FutureOfWork #AIJobLoss #RoboticWorkforce #AIInnovation #ArtificialIntelligence #JobDisruption #TechImpact #AIEmployment #AutomationAnxiety #WorkforceTransformation #DigitalDisruption #AIEthics #CareerAdaptation #AIWorkplace #S...
Will AI Take Your Writing Job?
39 views · 8 months ago
Writing was one of the first fields to be impacted by large language models. Will AI Take Your Writing Job? In this video, freelance writer, Diana Wolf Torres, talks about why writers are uniquely positioned to succeed in an era of language models. #WillAITakeYourJob #artificialintelligence #AIAndCreativity #FutureOfWriting #AIInnovation #creativecollaboration #AIAutomation #FutureOfWork #AIJob...
The Impact of AI on Work
20 views · 8 months ago
Welcome to the inaugural episode of our eye-opening series, "Will AI Take Your Job?" 🌟 Dive deep with us into the heart of one of the most pressing questions of our era: the impact of artificial intelligence on the future of work. For my returning subscribers, I am trying out a new text-to-video software called "Pictory." Because the software is in a trial period, it puts the "Pictory.AI" wat...
Artificial Intelligence in about 90 Seconds
48 views · 8 months ago
🚀 Unlock the secrets of AI with this essential vocabulary guide! 🔓 In just 90 seconds, you'll learn the fundamental terms that every AI enthusiast should know. For my returning viewers, this video is a new experiment. I'm trying out a text-to-video software called pictory.ai. You'll notice the watermark in the upper left-hand corner on some of the lighter shots. I also did a special intro and...
"Decision" Trees- How AI Makes Complicated Decisions
18 views · 8 months ago
In yesterday's video, we learned about decision trees. Today, we delve deeper into the thicket to see how this technology works. #MachineLearning #artificialintelligence #decisiontrees #deeplearningai #deeplearning #deeplearningdaily #techenthusiasts #techexplained #aiexplained #aieducation
"Decisions, Decisions." The AI Sings.
15 views · 8 months ago
What if you asked the AI to create a song about how it makes decisions? Welcome to "The magic of the decision tree." It's the tree-like structure that represents decisions and their possible outcomes. It's more fun to learn about AI when the AI sings about it. Imagine it like the best parts of school, like when you get to learn by singing... (Or, you realize you've had so much fun learning some...
AI History Lesson: The Evolution Behind the Black Box
38 views · 8 months ago
I am Grok! (Grok and Grokking.) That magical moment when you figure out "how to grok."
49 views · 8 months ago
Paws and Processors: Understanding AI Through Man's Best Friend
44 views · 8 months ago
My Whacky AI-Generated Holiday Card (Why is there a puppy in my hot cocoa?!!!)
22 views · 10 months ago
It was dodging a shadow. FSD IS A BETA... you had a minor problem. What danger were you in? Did it warn you to take over and you ignored it? You should have taken over sooner, as it is called "supervised." 😅 You're acting like your life was in danger.
Elon has greatly exaggerated FSD's capabilities. The fact is, it fails in many very basic situations that humans can easily handle, just as we see in this video. I've had the full paid-for version for as long as the car has existed, and I don't even turn it on anymore.
Hold the phone. I think they should close down the Tappan Zee Bridge eastbound during sunset, because the sun glare obstructs vision for every human driver. Until everyone has lidar, nobody should drive on that bridge eastbound during sunset.
So, the AI doesn't know how many voices it has access to. AI is a fail.
I don't think it is a fail. I think it is more that there is no limit to how many voices it can do, other than copyright limitations on imitating celebrities. Think of it this way: how many people can you imitate? You won't know until you try. So, once you get access to Advanced Voice, just try it out and see how many you can do. If there are some voices you would like me to try, please list them. I have done more videos; I just need to edit them. I plan to keep doing them, as this is an important tool. Thank you for the feedback.
I experimented the other night by asking it to create a list of 100 different examples of diverse voice styles/accents, and it nailed every one of them. It had things like "sullen goth teen," "sleazy salesman," "cheerful carnival barker," "noir detective," "overenthusiastic gameshow host," and "Transatlantic 1940s American stage actor," et cetera. It nailed them all. I think the reason it can't give an exact number is that some of them seem to be variations or combinations of base accents. For example, there's an English voice (including variations like Cockney or proper English), but there's also an "English butler," which is just a slightly slower, modified proper British voice. It's almost like each accent has a base accent, but it also has modifiers like speed, pitch, grammar, etc. So in other words, it seems to be capable of modifying voices into variations which play off of the base voice. As examples, you can ask it to speak with a very mild Jersey accent, or a very thick Jersey accent. You can ask it to sound out of breath, or "very out of breath, as if running up a steep hill." You can ask it to sound like a "Jersey girl with a very slight hint of Russian accent, who is completely out of breath after running up a very steep hill," or other nuances. I won't go so far as to say there are endless variations, but I don't think it's as simple as just having a list of a couple of hundred accents. (A rough sketch of this "base voice plus modifiers" idea appears after this thread.)
@@ThisEpicLife This is an excellent explanation! Thank you! Sleazy salesman. I look forward to giving that one a try.
@@ThisEpicLife The best use I've heard of so far had nothing to do with voices. My son and I were stuck in heavy traffic coming back from an appointment, and he had a grad class in an hour. He asked Advanced Voice: "What is the difference between ROS 1 and ROS 2?" He then grilled Advanced Voice for the next hour with follow-up questions. By the time I dropped him off at the university, he was completely prepped for his lab. As I was listening in on this conversation, not only did I learn something about the Robot Operating System, but I thought: "This is the way the technology should be used."
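A rough sketch, in Python, of the "base accent plus modifiers" idea described in the thread above. The VoiceStyle class and render_prompt are hypothetical names invented for illustration; this only models the commenter's observation, it is not any OpenAI API.

```python
# Hypothetical model of "voice = base accent + modifiers" (illustration only).
from dataclasses import dataclass, field


@dataclass
class VoiceStyle:
    base_accent: str                                        # e.g. "Jersey", "proper British"
    modifiers: dict[str, str] = field(default_factory=dict)  # speed, pitch, blend, breathing...

    def render_prompt(self) -> str:
        """Compose the natural-language request you would speak to the model."""
        request = f"Speak with a {self.base_accent} accent"
        extras = [f"{name}: {value}" for name, value in self.modifiers.items()]
        return request + (" (" + "; ".join(extras) + ")" if extras else "")


# "Jersey girl with a slight hint of Russian, out of breath after a steep hill"
style = VoiceStyle("Jersey", {"blend": "slight hint of Russian",
                              "breathing": "out of breath after a steep hill"})
print(style.render_prompt())
```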
Holy nonsense, Batman... do you think you are doing something amazing or important?
If you want to brainstorm and ramble without it interrupting, ask it to remember to only say "mmhmm" if you pause mid-sentence. That has been one of the best little tricks I've found. (See the sketch after this exchange.)
Oh, what a great tip! Thank you!
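The "mmhmm" tip above is essentially a custom instruction. Voice Mode itself is not scriptable, but here is a minimal sketch of the same instruction expressed through OpenAI's text API; the openai Python SDK calls are real, while availability of the "gpt-4o" model on a given account is an assumption.

```python
# The system message carries the "only say 'mmhmm' while I'm rambling" rule.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed available on the account
    messages=[
        {"role": "system",
         "content": "The user is brainstorming aloud. If a message trails off "
                    "or looks like an unfinished thought, reply only with "
                    "'mmhmm'. Give a full answer only to a direct question."},
        {"role": "user", "content": "So I was thinking, what if the intro..."},
    ],
)
print(response.choices[0].message.content)  # ideally just: mmhmm
```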
OpenAI is also monitoring you for the government. The authors sold us out.
Thank you for sharing! This is good for our kids.
Kids love it for bedtime storytelling, and it is very good for developing their verbal skills. Alex and I used it today to learn some Norwegian.
The text model you are using is GPT-4o, but Voice Mode is using GPT-4. OpenAI will start the alpha phase (not beta) of the new Voice Mode with a small group of Plus users at the end of July. In the fall, all Plus users (including me) should have access to the new Voice Mode, based on the GPT-4o model.
Thank you, Manuel. I appreciate the clarity, and I am very much looking forward to trying the new version. Admittedly, even in the current version, I find Voice Mode very useful. Getting rid of the latency will be nice, but I don't need any of the other features. I rarely ask my Alexa unit to whisper to me. And I can't imagine a use case where I need ChatGPT to sing to me.
"I can't interrupt the model." The model: "Tap to interrupt" 😂 (Yes, I know that you wanted to interrupt it with your voice.)
You're not wrong. :) I'll admit I was being a total Karen to the poor model while testing it.
It happened to me as well. I paid for a singing, whispering sort of version of ChatGPT-4o, but it does nothing of the sort. I keep wondering when it'll change to the demo video's version, and cannot find the answer anywhere... Your video is 4 weeks old (currently), and still there is no change. It does a fine job, but not as shown on YouTube.
I still don't have the singing, whispering version. But I still use ChatGPT Voice every day, sometimes multiple times a day. I refer to it as my "Oracle." I'm sure it will be even more useful with the updates, but even in the current form, I find ChatGPT Voice speeds up my workflow. For example, I brainstorm article ideas with ChatGPT Voice while walking the dog. To the other dog walkers, I probably look like I'm on the phone. But, I am actually "on the phone" with ChatGPT- getting some work done.
@@DeepLearningDaily Beware: when I use ChatGPT-4o, I have to tell it it's wrong numerous times. It agrees with me and fixes its answers, but it is I who must know that something is wrong, whether in historical facts, cultural ones, etc. So I'm not saying it's all bad, but I'm not saying it has no faults, either. Do your fact-checking, for your own good. That is, even before the advanced features...
@@XRos28 I agree. Thank you for pointing this out. You have to know your subject material. I appreciate the heads-up, though. When I ask GPT-4o to do research, I will often say "Do your research" in my prompt. Before I publish anything, I run it through Perplexity and say: "Please fact-check this." (Why do I say 'please'? I don't know. I'm a crazy person.) Perplexity finds an actual source I can verify for everything GPT-4o says. I can also ask GPT-4o to provide sources, but often the links are fictitious. It saves time to ask Perplexity to do it.
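A minimal sketch of the draft-then-fact-check workflow described above. Perplexity exposes an OpenAI-compatible API at api.perplexity.ai, but the model name "sonar" is an assumption here; check their docs for the current list. Requires OPENAI_API_KEY and PPLX_API_KEY in the environment.

```python
# Draft with GPT-4o, then hand the draft to a search-grounded model for sources.
import os
from openai import OpenAI

openai_client = OpenAI()
pplx_client = OpenAI(api_key=os.environ["PPLX_API_KEY"],
                     base_url="https://api.perplexity.ai")

# Step 1: draft, nudging the model to ground its claims.
draft = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": "Do your research. Write a short paragraph on the "
                          "power consumption of large language models."}],
).choices[0].message.content

# Step 2: fact-check the draft and ask for citations.
check = pplx_client.chat.completions.create(
    model="sonar",  # ASSUMED model name; verify against Perplexity's docs
    messages=[{"role": "user",
               "content": "Please fact-check this and cite sources:\n" + draft}],
).choices[0].message.content

print(check)
```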
It will roll out in the coming weeks. It is still the same model.
👀
That version is not the one in the demo (GPT-4o); that's only GPT-4.
Hi Lights and Colors. It's GPT-4o, since I'm a Plus subscriber; they just haven't added the new functionality yet. They said the new functionality is coming for Plus subscribers in the next couple of weeks. It's been two weeks, so I've been checking every day. At one point in the video, I hold up my phone so you can see it says GPT-4o at the top of the screen. If I do another video, I'll make this point a lot clearer. Thank you so much for the feedback!
@@DeepLearningDaily I see, so it's just the delay. I bet you're as excited as everybody else. I am. I find it so out-of-this-world, fiction-come-true stuff. 🥰❤
@@theeyes-fx6ld Yes! You captured it exactly!
This update is going to be one of the most advanced, futuristic things I have witnessed, judging by the GPT-4o ChatGPT and Microsoft Copilot videos. It almost doesn't even seem real. I'll believe it when I can finally test it. I'm excited as well.
People can be such suckers
This is an interesting general observation of humanity. Do you want me to pose that question to the Ember voice in my next video?
@@TOMTOM-zj5xj I love it! Karen AI, that is brilliant. And yes, you are right: I put the voice model through its paces in this video.
Great video! Another interesting thing: on OpenAI's webpage featuring the video examples of what this new ChatGPT-4o is supposedly able to do, if you scroll below the videos, there is a section called "Explorations of Capabilities" with a drop-down menu showcasing 16 examples of what the ChatGPT-4o "omni" model is supposed to be able to do. The examples include the prompt that was used and the dazzling output.

I tried to replicate one of the examples, "Poetic typography with iterative editing 2," where the prompt asks for handwritten text of a three-verse poem (the poem is provided to ChatGPT-4o in three separate verses with a space between each verse). When I copy and paste the exact same prompt into my "Pro" subscription ChatGPT-4o model, it outputs a random number of illegible, gibberish lines of unrecognizable "text," where after many re-rolls I'm lucky if even three words in the wall of text are correctly spelled English words. The rest of the letters are malformed and often have the shape of typeface fonts, which means the model is not only incapable of creating "handwritten pen-and-ink handwriting," it is also incapable of replicating the three-verse poem in an ordinary typeface font. Also, the model did not attempt to fulfill the prompt by replicating the 12-line poem (four lines per verse, three verses). Instead, with each re-roll it creates a solid wall of text: sometimes 23 lines, sometimes 19, sometimes 17, etc.

I then tried to test ChatGPT-4o's image recognition and analysis by asking it how many lines there are in the most recent output image it created. It is unable to count the number of lines. I ask it to write out the lines that appear in the output image, and instead it writes the lines of the poem that I provided in the first prompt. Then, to test its ability to count, I ask it to replicate the output image but place numbers at the end of each line of (illegible, gibberish) text. I tell it that it can do this either using DALL-E or using code, whichever method is easier and more efficient. It is unable to achieve this task. I then simplify the task by asking it to simply write out, as text, what it sees in the output image, placing consecutive numbers at the end of each line, beginning with "1." It is unable to count beyond "9." First, it does not correctly replicate the lines of gibberish text from the image; it hallucinates alternate lines of text. Second, it numbers the first lines 1 through 9, but for the 10th line it puts "0" and begins repeating 1 through 9 again, so it is unable to output consecutive numbering that goes beyond "9."

Other YouTubers seem to be blindly accepting what OpenAI proclaims this new ChatGPT-4o model can do. No one has yet tried replicating the examples from OpenAI's own website, where it provides the prompts it used and boasts the glitzy, dazzling results. OpenAI has only stated that the updated speech function will be rolled out "later on" (at some vague, unspecified date). But it has not said that other functionality is still missing from this brand-new model; in fact, the website states that ChatGPT-4o's text and image functionality has been rolled out to "Pro" subscribers (of which I am one). Yet this functionality is woefully, pathetically not yet working.
So my question to the tech community that has been covering AI without testing or trying to replicate the claims: OpenAI's own website states that this new ChatGPT-4o "omni" model is a brand-new single model, in which all functionality (text, image recognition, image analysis, "sight" via camera or desktop viewing, voice, and speech) is fully integrated, entwined into one model built from the ground up. That makes it entirely different from the prior ChatGPT-4, which used multiple models that had to communicate with one another, causing latency and loss of information as data traveled along a series of pipelines. If this brand-new model is fully integrated into a single model, how is it possible that it is missing a bunch of pieces of basic functionality that have not been rolled out, and that OpenAI has not provided any specific (transparent) timeline for when these separate functions and features will be brought on board?

But most importantly, how is it possible that the model OpenAI has so far rolled out appears to be some old, deprecated, non-functioning model masquerading as "ChatGPT-4o"? On the same example web page, OpenAI states: "we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network." So if these capabilities are not yet functioning or rolled out, why in the world has OpenAI added this "pretend" ChatGPT-4o to "Pro" subscribers' accounts, making it seem as though they've received a brand-new model, when a user who takes the time to try replicating OpenAI's examples finds that none of them work at this point in time?
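For anyone who wants to repeat the line-numbering experiment described above, a rough harness along the same lines: feed the model a dozen lines, ask for consecutive numbering, and count what comes back. Purely illustrative; results will vary by model and run, and the model name is assumed available on the account.

```python
# Replication harness for the "can it number lines past 9?" test.
from openai import OpenAI

client = OpenAI()
lines = [f"line {chr(ord('a') + i)}" for i in range(12)]  # 12 dummy lines

reply = client.chat.completions.create(
    model="gpt-4o",  # assumed available on the account
    messages=[{"role": "user",
               "content": "Repeat these lines exactly, appending a consecutive "
                          "number starting at 1 to the end of each line:\n"
                          + "\n".join(lines)}],
).choices[0].message.content

numbered = [line for line in reply.splitlines() if line.strip()]
print(f"asked for {len(lines)} numbered lines, got {len(numbered)}")
print(reply)
```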
They have given a button to interrupt; you fail at the interview lol
Oh no! You mean I have to stay home and play with awesome Star Wars stuff? I'm heartbroken! LOL. You're not wrong, though. What I was trying to do was imitate all of the tests done by OpenAI during the launch day. If you haven't seen the video, two of the developers ran the Sky voice through several tests. One of the tests they frequently did was interrupt the model. Thank you for the feedback. th-cam.com/video/DQacCB9tDaw/w-d-xo.html&ab_channel=OpenAI
@@DeepLearningDaily Aside from that, you did a good benchmark of what it can do with GPT-4o before the new update, since many people didn't know how extensive this feature has already become. I think the comparison is good, to know exactly what changes in capabilities when the update does come.
@@stateportSound_wav Thank you! That is very kind. I had fun doing the video. It's a great feature, and I use it all the time. I'm looking forward to the update. They are a week late, as they said it would be two weeks (for the Plus subscribers). They are likely delayed due to the "Sky" incident.
@@DeepLearningDaily Yeah, another comment said they gave an update somewhere, I think a blog post, delaying it from the original "weeks" to "months," but I hope that's not the case.
@@stateportSound_wav Where is this blog post that says months? I never heard or read that. I think if they changed it to months, there would be more of us Plus subscribers angry and talking about it all over the internet.
Thank you for the video. One question: is it the free version or the paid version?
ahhhh, how did you miss "Tap To Interrupt"?
The new model shouldn't need tap to interrupt. You can overlap your voice to interrupt it like in a real conversation. She didn't miss the text on screen.
Did you not pay attention to the release video at all? They literally said the features would be rolled out incrementally over the next few months. The only thing that's available to free users is ChatGPT-4. I got access to the desktop model on Monday.
Actually, they said it would be a few weeks for their Plus subscribers. I've been checking every day. I am very excited for the new features.
@@surfercouple they changed it to months now
@@surfercouple you have to stay updated with interviews that leadership at OpenAI do. That’s the best way to stay updated.
OpenAI clearly stated on May 13, 2024 (the day of the livestream keynote ChatGPT-4o launch) that ChatGPT-4o was available to "Pro," "Team," and "Enterprise" subscribers as of May 13th, but would not yet have the new speech functions. However, it stated that the text and image features were already working. This is not true. If anyone takes the time to go to OpenAI's webpage showcasing examples of what this new "omni" model can do, scroll down below the video examples to a section called "Explorations of Capabilities," where there is a drop-down menu of 16 examples showing the exact input prompt and output results. I tried replicating these amazing, dazzling results by copying and pasting the exact prompt, and instead of getting a mind-blowing result, my "Pro" ChatGPT-4o output gibberish: illegible long-form handwritten text with malformed letters, lucky if even two or three words of a 12-line poem were recognizable. ChatGPT-4o failed miserably at replicating the example!! Go ahead, try it yourself, and let me know if you are able to replicate any of the 16 examples accurately.
@@satoriscope Wise advice. I will update my description to make it clear I have a Plus subscription, and these features are coming to Plus subscribers in the upcoming weeks. Thank you for the clarification. You make a valid point. I realize now that the readers of my newsletter all tend to be Plus subscribers, but that wouldn't be the case here on YouTube, since it is a much broader audience.
You are so dumb. But at the same time, you are kinda old, so you've got an excuse at least. This is the old GPT.
This does not appear to be the new voice model. It will be rolling out in the next few months. I’d like to see another analysis once you gain access.
Me, too! I look forward to making that video!
I asked ChatGPT today, and it told me it should be coming in alpha mode to Plus users in the next 2 weeks. Omg, how exciting! What will you try first? I want to test it as a tutor :)
Yes! ChatGPT is already an excellent tutor, but it will be so much better with these new features. I'd like to increase my understanding of quantum computing. But currently, I can't interrupt the model hands-free when it starts going off on a tangent about something I already know. Hands-free is important to me because I'm usually painting or sanding when I'm in my shed. (So, my hands are not "free.") This new functionality will make the model a lot more useful. I can imagine myself asking ChatGPT just about everything from now on. One note: just remember to ask ChatGPT, "What was your source for that info?" So, if ChatGPT told you alpha mode is coming in two weeks, then push back at the model and say: "How did you know that? Where did that information come from?" If the model is making it up, it will (usually) apologize. If it provides you with a source, ask for a link that you can verify yourself. I always push back at the model to make sure it's not making stuff up. It is getting more and more obvious to me these days when it is, but that is probably because I use AI so much.
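That "ask for a link you can verify yourself" step can be partially automated. A minimal sketch, assuming the model's answer text is already in hand: pull URLs out of the reply and check that each one actually resolves. A reachable page only proves the link exists, not that it supports the claim; the sample URLs below are stand-ins.

```python
# Pull URLs from a model's answer and check each one resolves.
import re
import requests

answer = ("See https://openai.com/index/hello-gpt-4o/ "
          "and https://example.com/made-up-citation")  # stand-in model output
urls = re.findall(r"https?://\S+", answer)

for url in urls:
    try:
        ok = requests.head(url, allow_redirects=True, timeout=10).status_code == 200
    except requests.RequestException:
        ok = False
    print(url, "-> reachable" if ok else "-> check manually")
```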
The one thing that will never change is that this is made by Microsoft, and thus is a massive security threat; their horrendous programming will allow systems to be compromised.
Hi Thomas. If it's any consolation, the "Talking Faces" technology is not out yet. Microsoft announced this technology, meaning they have created it; however, they haven't launched it. If past experience holds, these tech companies usually wait about six months from the time they publish a white paper to the time they launch the product.
The head moves around like it's from a hand puppet and it feels very overperformed, sprinkled with empty phrases.
Yes, I found the pausing in the speech to be somewhat strange, especially when I tried to match up to it. I couldn't do it, even though I did seven takes of my video. But, to Microsoft's credit, this is generation one of their technology. It's only going to get better from here.
Holy schnikes. This really truly marks the end of the internet. It will take a while for the general public to figure it out, but this basically means that all the content I consume online currently will all be fake ten years from now. We, as humanity, will drift further and further away from human contact, becoming more and more isolated. This is absolutely terrifying. The internet used to be an oasis of reality, oddly enough. Cable TV turned into 24/7 lies and propaganda, and now you won't be able to find anything that's real on the internet. Time to build that bunker.
I saw this and thought: "I'm seeing the future right here. All newscasters in the future, and likely all podcasters, too, will be talking faces like this one. We will eventually regard this as normal."
Interesting tidbit. However, if I may recommend something? You seem comfortable and interested in exhibiting yourself in the media space to communicate with an audience. Your webcam, however, seems straight out of 1999 and makes the whole experience rather unpleasant (it's so bad I really thought _you_ were the talking AI face for the first 10 seconds haha). Even for casual conversations once a year with my mom, or some interview, I make sure to have a decent, modern webcam that can do 1080p and gives a more photographic rendering. Just a tip.
Thanks for the tidbit! I'll work on an upgrade! It's probably the lighting. I was trying to match her background so I just threw up a tablecloth in the background as a mock green screen. I normally don't bother with one. The background you see behind me was just a background I downloaded from Zoom. In the real world, there is a dresser behind me in my office.
It's come light-years from the first AI-animated photos, but it still falls well within the uncanny valley for me. The most immediately glaring falsehoods are her hair and skin. Her hair moves like a solid helmet parented to the head, while none of the strands or tendrils bend, wobble, fall, or flex; it's especially telling with the bit that peeks out onto the neck. Also telling is the skin of her face and neck. While the animation is very realistic, the skin remains mostly rigid. Her brows move, but her forehead does not. The same is true for the way her skin should stretch and contract around her cheeks when talking. Less noticeable, and therefore not one of my main two, is the fact that while her head makes subtle turns, her teeth do not change perspective. The two front teeth always face front and center, like one of those paintings where the eyes follow you around the room.
If you go to the Microsoft page for VASA-1, they demonstrate how this technology can be used with a pencil sketch or to animate the Mona Lisa. It's... unsettling. It has a very Harry Potter talking-photos effect.
Kinda sucks they are withholding it from the general public, but I of course understand why. It's just unfair to those who want to try it out and have fun with it. I like creating characters, and it'd be cool to actually speak to them and see them talking.
There are other technologies like this out there. Similar lip sync and head movement technology is available from AI company Runway, Nvidia’s Audio2Face AI application, Google’s Vlogger AI launched in March, and Emo AI by China’s Alibaba.
Your ego got a subscriber.
My ego thanks you!
Microsoft will launch it eventually, just because other companies will. The same can be said about other so-called AI tools...
I agree! Companies will launch a white paper, and the product will usually come out about six months later. If VASA-1 follows this pattern, we can expect this technology to be available by the end of the year. It occurred to me today that you could use this technology to bring a dead relative back to life. I'm not sure if that's a good thing or not.
@@DeepLearningDaily As I have seen in the Stable Diffusion area while using these tools, projects like this have a weird success rate of actually going from white paper to usable tool. Of course this will become a tool of sorts, but one can't be certain it will be this specific one. Knowing corpospace, it will be kept in a vault, with the low-quality, censored weights sold via subscription services à la ChatGPT and other platforms like that.
I'm looking for people that don't want to learn AI and would rather sell their companies. If you have any ideas on how to find them, would you make a video on that?
I'm on the opposite end of the spectrum, since I'm an educator encouraging people to learn AI. It's a skill everyone can learn.
This song is great! ❤
Thank you! My husband is not a fan. :) I generate the custom music for my videos using Suno.AI. There is a free version of it and you can generate song clips up to 40 seconds. Here is the "full" version of "Will AI Take My Job?" (Well, all 40 seconds of it.) app.suno.ai/song/1432b723-6965-4237-b837-dae39a11ffd2
Using AI to describe AI is weird but cool. I'd prefer you not speed things up...
Thanks for the feedback! I actually slowed down the speed on some of the animation, but I'm likely guilty of moving too quickly on the still shots while trying to time the music and the images. I'll keep trying to improve with each day's video. I appreciate you providing the feedback; it's very helpful. And you're right, there are things about this new world of AI we find ourselves in that are still very weird. Some of the art the AI generates is... odd. Sometimes I edit it, and sometimes I leave it in its original form, because that is what AI art looks like sometimes. It is an imperfect technology that produces some weird stuff at times.