| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Boosting image captioning with attributes | T Yao, Y Pan, Y Li, Z Qiu, T Mei | Proceedings of the IEEE International Conference on Computer Vision, 4894-4902 | 585 | 2017 |
| Exploring visual relationship for image captioning | T Yao, Y Pan, Y Li, T Mei | Proceedings of the European Conference on Computer Vision (ECCV), 684-699 | 537 | 2018 |
| X-linear attention networks for image captioning | Y Pan, T Yao, Y Li, T Mei | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … | 191 | 2020 |
| Transferrable prototypical networks for unsupervised domain adaptation | Y Pan, T Yao, Y Li, Y Wang, CW Ngo, T Mei | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … | 187 | 2019 |
| Incorporating copying mechanism in image captioning for learning novel objects | T Yao, Y Pan, Y Li, T Mei | Proceedings of the IEEE Conference on Computer Vision and Pattern … | 133 | 2017 |
| Jointly localizing and describing events for dense video captioning | Y Li, T Yao, Y Pan, H Chao, T Mei | Proceedings of the IEEE Conference on Computer Vision and Pattern … | 126 | 2018 |
| Hierarchy parsing for image captioning | T Yao, Y Pan, Y Li, T Mei | Proceedings of the IEEE/CVF International Conference on Computer Vision … | 105 | 2019 |
| Temporal deformable convolutional encoder-decoder networks for video captioning | J Chen, Y Pan, Y Li, T Yao, H Chao, T Mei | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 8167-8174 | 73 | 2019 |
| Pointing novel objects in image captioning | Y Li, T Yao, Y Pan, H Chao, T Mei | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … | 51 | 2019 |
| Learning deep intrinsic video representation by exploring temporal coherence and graph structure | Y Pan, Y Li, T Yao, T Mei, H Li, Y Rui | IJCAI, 3832-3838 | 47 | 2016 |
| Exploring category-agnostic clusters for open-set domain adaptation | Y Pan, T Yao, Y Li, CW Ngo, T Mei | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … | 32 | 2020 |
| Auto-captions on GIF: a large-scale video-sentence dataset for vision-language pre-training | Y Pan, Y Li, J Luo, J Xu, T Yao, T Mei | arXiv preprint arXiv:2007.02375 | 29 | 2020 |
| Contextual transformer networks for visual recognition | Y Li, T Yao, Y Pan, T Mei | IEEE Transactions on Pattern Analysis and Machine Intelligence | 24 | 2022 |
| Scheduled sampling in vision-language pretraining with decoupled encoder-decoder network | Y Li, Y Pan, T Yao, J Chen, T Mei | arXiv preprint arXiv:2101.11562 | 19 | 2021 |
| MSR Asia MSM at ActivityNet Challenge 2017: trimmed action recognition, temporal action proposals and dense-captioning events in videos | T Yao, Y Li, Z Qiu, F Long, Y Pan, D Li, T Mei | CVPR ActivityNet Challenge Workshop | 18 | 2017 |
| Share-and-chat: achieving human-level video commenting by search and multi-view embedding | Y Li, T Yao, T Mei, H Chao, Y Rui | Proceedings of the 24th ACM International Conference on Multimedia, 928-937 | 15 | 2016 |
| Unpaired image captioning with semantic-constrained self-learning | H Ben, Y Pan, Y Li, T Yao, R Hong, M Wang, T Mei | IEEE Transactions on Multimedia | 8 | 2021 |
| Deep metric learning with density adaptivity | Y Li, T Yao, Y Pan, H Chao, T Mei | IEEE Transactions on Multimedia 22 (5), 1285-1297 | 8 | 2019 |
| CoCo-BERT: improving video-language pre-training with contrastive cross-modal matching and denoising | J Luo, Y Li, Y Pan, T Yao, H Chao, T Mei | Proceedings of the 29th ACM International Conference on Multimedia, 5600-5608 | 6 | 2021 |
| Trimmed action recognition, dense-captioning events in videos, and spatio-temporal action localization with focus on ActivityNet Challenge 2019 | Z Qiu, D Li, Y Li, Q Cai, Y Pan, T Yao | arXiv preprint arXiv:1906.07016 | 6 | 2019 |