AMI Lab POSTECH
MMM: Generative Masked Motion Model [Oh Hyun-Bin]
2024.09.19. P-AMI Weekly Seminar
[Reviewed Paper]
MMM: Generative Masked Motion Model (CVPR 2024, Highlight)
[Speaker]
Oh Hyun-Bin
[Tags]
Text-to-motion generation, Human motion generation, Human motion editing, Masked modeling

Videos

CHOIS: Controllable Human-Object Interaction Synthesis [Han EunGi]
2024.09.19. P-AMI Weekly Seminar [Reviewed Paper] 3D Human-Object Interaction Synthesis - CHOIS: Controllable Human-Object Interaction Synthesis (ECCV 2024) - CG-HOI: Contact-Guided 3D Human-Object Interaction Generation (CVPR 2024) [Speaker] Han EunGi [Tags] 3D Human-Object Interaction Synthesis (HOI), 3D Human Motion Generation
Learning-based Axial Video Motion Magnification (ECCV'2024)
The European Conference on Computer Vision (ECCV) 2024 [Paper] Learning-based Axial Video Motion Magnification axial-momag.github.io/axial-momag/ [Speaker] Kwon Byung-Ki [Tags] Video Motion Magnification
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models (ECCV'2024)
The European Conference on Computer Vision (ECCV) 2024 [Paper] BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models beafbench.github.io/ [Speaker] Moon Ye-Bin [Tags] Vision-language Models (VLMs), Hallucinations, Evaluation Benchmark
NeuFace: A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optim. (TMLR'2024)
Transactions on Machine Learning Research (TMLR) 2024 [Paper] A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization kim-youwang.github.io/neuface [Tags] 3D Face video dataset, Optimization, Neural re-parameterization
RoHM: Robust Human Motion Reconstruction via Diffusion [Han EunGi]
2024.07.12. P-AMI Weekly Seminar [Reviewed Paper] RoHM: Robust Human Motion Reconstruction via Diffusion [Speaker] Han EunGi [Tags] 3d human motion reconstruction, diffusion
CAT3D: Create Anything in 3D with Multi-View Diffusion Models [Kim Yu-Ji]
2024.06.21 P-AMI Weekly Seminar [Reviewed Paper] CAT3D: Create Anything in 3D with Multi-View Diffusion Models [Speaker] Kim Yu-Ji [Tags] diffusion model, 3d creation
Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization & .. (CVPR'2024)
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024 [Paper] 🎨 Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering kim-youwang.github.io/paint-it [Speaker] Kim Youwang [Tags] Text-to-texture synthesis, Cross-modal generation, Physically-based rendering
Is ImageNet Worth 1 Video? Learning Strong Image Encoders from 1 Long Unlabelled Video [Jo Won-Jun]
2024.05.31 P-AMI Weekly Seminar [Reviewed Paper] Is ImageNet Worth 1 Video? Learning Strong Image Encoders From 1 Long Unlabelled Video [Speaker] Jo Won-Jun [Tags] self-supervised learning, video understanding
OneLLM: One Framework to Align All Modalities with Language [Kim Sung-Bin]
2024.05.03 P-AMI Weekly Seminar [Reviewed Paper] OneLLM: One Framework to Align All Modalities with Language [Speaker] Kim Sung-Bin [Tags] large language models, multi-modal learning, multi-modal large language models
DiffusionLight: Light Probes for Free by Painting a Chrome Ball [GeonU Kim]
[Reviewed Paper] DiffusionLight: Light Probes for Free by Painting a Chrome Ball [Speaker] GeonU Kim [Tags] diffusion model, stable diffusion, environment map
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation [Kim Ji-Yeon]
2024.01.12 P-AMI Weekly Seminar [Reviewed Paper] Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation [Speaker] Kim Ji-Yeon [Tags] monocular depth estimation, diffusion models
Accurate 3D Face Reconstruction with Facial Component Tokens [Han EunGi]
2023.12.29 P-AMI Weekly Seminar [Reviewed Paper] Accurate 3D Face Reconstruction with Facial Component Tokens [Speaker] Han EunGi [Tags] 3D face reconstruction, Vision Transformer
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [Kim Yu-Ji]
2023.12.08 P-AMI Weekly Seminar [Reviewed Paper] Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [Speaker] Kim Yu-Ji [Tags] text-to-image, text-to-3D, textured 3d mesh generation
Bayes' Rays: Uncertainty Quantification for Neural Radiance Fields [Lee Hyoseok]
2023.12.29 P-AMI Weekly Seminar [Reviewed Paper] Bayes' Rays: Uncertainty Quantification for Neural Radiance Fields [Speaker] Lee Hyoseok [Tags] uncertainty, neural radiance field
Visual Instruction Tuning [Moon Ye-Bin]
LaughTalk: Expressive 3D Talking Head Generation with Laughter (WACV'2024)
Improving Fairness in Facial Albedo Estimation [Oh Hyun-Bin]
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning [Hyunwoo Ha]
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators [Kim Sung-Bin]
Your Diffusion Model is Secretly a Zero-Shot Classifier [GeonU Kim]
Image-guided Depth Completion: Non-linear Filters, Convolutions, and Transformers [Kim Kyeongseon]
Mixture of Experts [Choi Wonseok]
Scratching Visual Transformer's Back with Uniform Attention (ICCV'2023)
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation (ICCV'2023)
Diagnosing and Rectifying Vision Models Using Language [Moon Ye-Bin]
Neuralangelo: High-Fidelity Neural Surface Reconstruction [Kim Yu-Ji]
Neural Fields for Representing Human Pose & Motion [Kim Youwang]
LLMs for Computer Vision [Lee Hyun]
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment (CVPR'2023)

Comments

  • @rezarawassizadeh4601 (3 months ago)

    I had a difficult time understanding SAM, so thank you for your video. But I think step 8 is not a 3-layer MLP; it is three MLPs for the three different masks.
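
A minimal sketch of the distinction raised in the comment above, assuming a simplified promptable-segmentation decoder in which each of the three candidate masks gets its own small MLP head rather than one shared 3-layer MLP. Module names, sizes, and tensor shapes are illustrative placeholders, not SAM's actual implementation.

import torch
import torch.nn as nn

class ThreeHeadMaskDecoder(nn.Module):
    """Illustrative only: one independent MLP head per candidate mask."""

    def __init__(self, embed_dim: int = 256, num_masks: int = 3):
        super().__init__()
        # One small MLP per candidate mask, instead of a single shared MLP.
        self.mask_heads = nn.ModuleList(
            nn.Sequential(
                nn.Linear(embed_dim, embed_dim),
                nn.ReLU(),
                nn.Linear(embed_dim, embed_dim),
            )
            for _ in range(num_masks)
        )

    def forward(self, mask_tokens: torch.Tensor, image_embed: torch.Tensor) -> torch.Tensor:
        # mask_tokens: (B, num_masks, C); image_embed: (B, C, H, W)
        masks = []
        for i, head in enumerate(self.mask_heads):
            weights = head(mask_tokens[:, i])  # (B, C), one weight vector per mask
            masks.append(torch.einsum("bc,bchw->bhw", weights, image_embed))
        return torch.stack(masks, dim=1)  # (B, num_masks, H, W)

# Tiny usage check with random tensors.
decoder = ThreeHeadMaskDecoder()
print(decoder(torch.randn(2, 3, 256), torch.randn(2, 256, 64, 64)).shape)  # (2, 3, 64, 64)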

  • @최성준-s6h (5 months ago)

    Thank you for sharing the seminar!

  • @ylab3891 (6 months ago)

    Are you not uploading videos anymore? *sob*

  • @EigenA (6 months ago)

    Great video!

  • @bakersaga8186 (7 months ago)

    Hi, I am also trying to understand this paper! When the deformation field is defined as a grid, do you know how the deformation values are updated? For example, in the rendering equation, when a deformation value is added to the position, how is that value defined? Is the same deformation value added to [x, y, z] for every point defined in the grid, or is it updated? If updated, do you know how?

  • @ylab3891 (8 months ago)

    Thank you for the presentation :D

  • @ylab3891 (8 months ago)

    Thank you for the paper presentation haha 😀

  • @er-wl9sy (10 months ago)

    Great. Please enable English subtitles.

  • @ylab3891 (10 months ago)

    I learned about the bilateral filter here, thank you.

  • @ylab3891 (10 months ago)

    Thank you for sharing the seminar. :D

  • @김진혁-k8e (1 year ago)

    Thank you for sharing the excellent presentation.

  • @remath3166 (1 year ago)

    I'm not a specialist in this field, but about the text-to-mask task mentioned at the end: if it gets properly researched and developed, would OCR also be possible as a mask-to-text task?

  • @SpatialAIKR (1 year ago)

    Thank you for explaining it so well. A lot of the parts that were unclear to me feel much clearer now. I saw that the code was released recently, so now I should analyze it together with the code. Thank you 🙂

  • @poganzie (1 year ago)

  • @ylab3891 (1 year ago)

    Thank you for the high-quality information!

  • @whypushhh (1 year ago)

    At the beginning you said this is different from background cutout; in what way is it different?

    • @AMILabPOSTECH (1 year ago)

      What we usually think of as background cutout is more closely related to the image matting task. Matting Anything, a matting model that uses SAM, has also been released, so the links below may be helpful. github.com/SHI-Labs/Matting-Anything arxiv.org/pdf/2306.05399.pdf

  • @유영재-c9c (1 year ago)

    Is there any information on how the prompts were actually given when Meta trained the released model?

    • @AMILabPOSTECH (1 year ago)

      Meta did not release separate training code for SAM, so it is hard to say whether the training process written in the paper was followed exactly. For details on the promptable segmentation task, please refer to the Training algorithm paragraph in Appendix A of the Segment Anything paper, or to the Segment Anything Task explanation in the uploaded video.
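
As a rough picture of the kind of procedure that appendix describes, here is a small sketch of interactive prompt sampling during training: the first click is drawn from the ground-truth mask, and later clicks are drawn from the error region between the current prediction and the ground truth (foreground clicks for missed regions, background clicks for over-segmented ones). This is a paraphrase under those assumptions; model.predict is a hypothetical stand-in, not the real SAM training code or API.

import numpy as np

def sample_point_from_error(pred_mask: np.ndarray, gt_mask: np.ndarray):
    """Pick the next click from the error region between prediction and ground truth.
    A false-negative pixel yields a positive (foreground) click,
    a false-positive pixel yields a negative (background) click."""
    error = pred_mask != gt_mask
    ys, xs = np.nonzero(error)
    if len(ys) == 0:
        return None  # prediction already matches the ground-truth mask
    i = np.random.randint(len(ys))
    y, x = ys[i], xs[i]
    return (x, y), int(gt_mask[y, x])  # label 1 = foreground click, 0 = background

def interactive_prompt_sampling(model, image, gt_mask, num_rounds: int = 8):
    """Sketch of iterative prompt sampling for promptable segmentation training.
    model.predict(image, points, labels, prev_logits) is a hypothetical stand-in
    returning a binary mask and raw logits; it is not the real SAM API."""
    gt_mask = gt_mask.astype(bool)
    first = sample_point_from_error(np.zeros_like(gt_mask), gt_mask)  # initial foreground click from the GT mask
    points, labels = [first[0]], [first[1]]
    prev_logits = None
    for _ in range(num_rounds):
        pred_mask, prev_logits = model.predict(image, points, labels, prev_logits)
        nxt = sample_point_from_error(pred_mask.astype(bool), gt_mask)
        if nxt is None:
            break  # no error region left to sample from
        points.append(nxt[0])
        labels.append(nxt[1])
        # In training, a segmentation loss between pred_mask and gt_mask
        # would be computed at each round.
    return points, labels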

  • @jin-hwakim2543 (1 year ago)

    Really enjoyed it~ 👏

  • @fluffy_shark_studio (1 year ago)

    This is a field I want to study, so this was very informative.

  • @ylab3891 (2 years ago)

    Thank you for sharing a good seminar that is helpful for gaining insight.

  • @washedtoohot (2 years ago)

    I wish English captions were available 🥲

  • @박존-b2j (2 years ago)

    Asking questions looks like a lot of fun, and it's nice to watch..

  • @sutharsanmahendren1071 (2 years ago)

    I can't understand the language but the slides are very informative. Can you share the slides? Thank you.

  • @carlosnavarro-cn (2 years ago)

    Hello, is there a way to really speed up the Google Colab version? It seems that the default 2000 iterations are not enough to finish generating an object in most cases, and going beyond that triggers more than 24-hour wait times. Thank you.

  • @ylab3891 (2 years ago)

    The presenter's content is great, and the professor's comments in between are great too. Thank you for sharing a good seminar *bows* :D

  • @ylab3891 (2 years ago)

    I'm watching this channel with enjoyment and finding it informative. :D Thank you for introducing good papers.

  • @ylab3891 (2 years ago)

    Thank you :D I learned a lot.

  • @ylab3891 (2 years ago)

    Thank you for uploading :D

  • @ONDANOTA (2 years ago)

    you should put some tags like "3d" "ai" "machine learning" "mesh" etc

  • @ONDANOTA (2 years ago)

    Cool content, but an English version would also be good.

  • @yikezhan4987 (2 years ago)

    Good presentation, and it would be better if it's in English.

  • @kadadtemha (2 years ago)

    Hello, can you share the presentation?

  • @동그리-n1w (3 years ago)

    It's all very difficult for me, so you're amazing..! I don't think I really understand deep learning itself.. Thank you for sharing.

  • @princekwesioseiaboagye (3 years ago)

    Can you include English subtitles?