Senior Applied Scientist @ Amazon Artificial General Intelligence

About Me

Hi, my name is Gukyeong Kwon. I am a Senior Applied Scientist at Amazon Artificial General Intelligence (AGI). I completed my M.S. and Ph.D. in the School of Electrical and Computer Engineering at Georgia Tech in 2018 and 2021, respectively, under the supervision of Dr. Ghassan AlRegib. My research interests are broadly in machine learning, computer vision, and image/video processing. Recently, I have focused primarily on multi-modal representation learning for vision and language.

Please check my CV and Google Scholar profile for more information.

News:

Nov. 29, 2023: The Amazon Titan Multimodal Embeddings foundation model, which I contributed to, has launched.
Oct. 9, 2023: I was promoted to Senior Applied Scientist at Amazon.
Jan. 20, 2023: One paper is accepted for publication at ICLR 2023.
May 30, 2022: I received a CSIP Outstanding Research Award in recognition of my Ph.D. research at Georgia Tech.
Jan. 30, 2022: One paper is accepted for publication at IEEE Transactions on Image Processing.
Jan. 11, 2021: I am joining Amazon Web Services (AWS) AI Labs as a full-time Applied Scientist.
Oct. 28, 2020: Our paper received the Top Viewed Special Session Paper Award at ICIP 2020.
Jul. 3, 2020: Our paper is accepted for publication at ECCV 2020.
May 15, 2020: Two papers are accepted for publication at ICIP 2020.
Sept. 24, 2019: Our paper won the Best Paper Award (top 0.1%) at the IEEE International Conference on Image Processing (ICIP) 2019.
Apr. 30, 2019: Our paper is accepted for publication at ICIP 2019.
May 4, 2018: Our paper is accepted for publication at ICIP 2018.
Nov. 14, 2017: Our paper is accepted for publication at the 31st Conference on Neural Information Processing Systems (NIPS), Machine Learning for Intelligent Transportation Systems Workshop, 2017.
Mar. 29, 2017: Our paper is selected as a Finalist of the World’s FIRST 10K Best Paper Award (top 3%) in the IEEE International Conference on Multimedia and Expo (ICME) 2017.

Experience

Amazon AGI

Senior Applied Scientist

October 2023 - Present

Develop artificial general intelligence foundation models and services.

AWS AI Labs

Applied Scientist

January 2021 - October 2023

Developed the Amazon Titan Multimodal Embeddings foundation model, which performs accurate and contextually relevant vision-language search.
Contributed to the overall pipeline of large-scale model training including data preparation and processing, model training, and evaluation.

AWS AI Labs

Applied Scientist Intern

May 2020 - August 2020

Conducted research on multimodal representation learning for vision and language.
Developed regularization techniques for two-stream BERT models and achieved improved performance in visual question answering, caption-based image retrieval, and referring expressions.

Panasonic Automotive

Deep Learning Research Intern

May 2018 - July 2018

Developed deep learning-based algorithms for detecting driver misbehavior in autonomous vehicles.
Focused on driver pose estimation and hand detection algorithms using TensorFlow and C++.

Georgia Tech

Graduate Research/Teaching Assistant

January 2016 - December 2020

Developed algorithms to detect accident events occurring in driving scenes to ensure safe autonomous driving.
Proposed abnormal object detection algorithms for autonomous vehicles.
Introduced a large-scale traffic sign recognition dataset for robust visual understanding under challenging conditions.
Developed a perceptual video quality assessment (VQA) metric that achieved state-of-the-art performance in estimating the impact of visual distortions on human perception.

Publications

G. Kwon, Z. Cai, A. Ravichandran, E. Bas, R. Bhotika, and S. Soatto, “Masked Vision and Language Modeling for Multi-modal Representation Learning,” International Conference on Learning Representations (ICLR), 2023.

[arXiv]

Z. Cai, G. Kwon, A. Ravichandran, E. Bas, Z. Tu, R. Bhotika, and S. Soatto, “X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks,” In Proceedings of the European Conference on Computer Vision (ECCV), 2022.

[arXiv] [GitHub]

G. Kwon and G. AlRegib, “A Gating Model for Bias Calibration in Generalized Zero-shot Learning,” In IEEE Transactions on Image Processing, 2022.

[arXiv] [GitHub]

G. Kwon, M. Prabhushankar, D. Temel, and G. AlRegib, “Backpropagated Gradient Representations for Anomaly Detection,” In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

[arXiv] [GitHub] [Short Video] [Slides]

G. Kwon, M. Prabhushankar, D. Temel, and G. AlRegib, “Novelty Detection Through Model-based Characterization of Neural Networks,” 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates (UAE), 2020.

[arXiv] [GitHub] [Slides] [Video]

M. Prabhushankar, G. Kwon, D. Temel, and G. AlRegib, “Contrastive Explanations in Neural Networks,” 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates (UAE), 2020. (Top Viewed Special Session Paper Award)

[arXiv] [GitHub] [Slides] [Award]

G. Kwon*, M. Prabhushankar*, D. Temel, and G. AlRegib, “Distorted Representation Space Characterization Through Backpropagated Gradients,” 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 2651-2655. (* : equal contribution, Best Paper Award (top 0.1%))

[arXiv] [GitHub] [Poster]

M. Prabhushankar*, G. Kwon*, D. Temel, and G. AlRegib, “Semantically Interpretable and Controllable Filter Sets,” 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, 2018, pp. 1053-1057. (* : equal contribution)

[arXiv] [GitHub] [Poster]

D. Temel, G. Kwon*, M. Prabhushankar*, and G. AlRegib, “CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition,” MLITS workshop in Neural Information Processing Systems (NIPS), Long Beach, U.S.A, December 2017. (* : equal contribution)

[arXiv] [GitHub] [Poster]

M. A. Aabed, G. Kwon, and G. AlRegib, “Power of tempospatially unified spectral density for perceptual video quality assessment,” 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, 2017, pp. 1476-1481. (Finalist of the World’s FIRST 10K Best Paper Award (top 3%))

[arXiv] [GitHub] [Slides] [Award]