Aligning visual regions and textual concepts for semantic-grounded image representations F Liu, Y Liu, X Ren, X He, X Sun NeurIPS, 2019 | 121 | 2019 |
Rethinking skip connection with layer normalization in Transformers and ResNets F Liu, X Ren, Z Zhang, X Sun, Y Zou COLING, 2021 | 103 | 2021 |
Non-autoregressive coarse-to-fine video captioning B Yang, Y Zou, F Liu, C Zhang AAAI, 2021 | 79* | 2021 |
End-to-end spoken conversational question answering: Task, dataset and model C You, N Chen, F Liu, S Ge, X Wu, Y Zou NAACL (Findings), 2022 | 56* | 2022 |
Aligning source visual and target language domains for unpaired video captioning F Liu, X Wu, C You, S Ge, Y Zou, X Sun IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=23.6), 2022 | 30 | 2022 |
ZeroNLG: Aligning and autoencoding domains for zero-shot multimodal and multilingual natural language generation B Yang*, F Liu*, Y Zou, X Wu, Y Wang, DA Clifton IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=23.6), 2024 | 5 | 2024 |
Multimodal prompt learning for product title generation with extremely limited labels B Yang, F Liu, Z Li, Q Yin, C You, B Yin, Y Zou arXiv preprint arXiv:2307.01969, 2023 | 1 | 2023 |