Abstract: Image captioning is a multimodal task that combines computer vision (CV) and natural language processing (NLP). Contrastive language-image pre-training has made significant progress by providing ...