dc.contributor.author: Dey, S
dc.contributor.author: Dutta, A
dc.contributor.author: Ghosh, SK
dc.contributor.author: Valveny, E
dc.contributor.author: Lladós, J
dc.contributor.author: Pal, U
dc.date.accessioned: 2019-10-15T09:07:45Z
dc.date.issued: 2019-06-02
dc.description.abstract: In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian Algorithm using different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the-art performance in every dataset. [en_GB]
dc.description.sponsorship: European Union Horizon 2020 [en_GB]
dc.description.sponsorship: CERCA Program of Generalitat de Catalunya [en_GB]
dc.identifier.citation: Vol. 11362, pp. 241-255 [en_GB]
dc.identifier.doi: 10.1007/978-3-030-20890-5_16
dc.identifier.grantnumber: 665919 [en_GB]
dc.identifier.grantnumber: TIN2015-70924-C2-2-R [en_GB]
dc.identifier.grantnumber: TIN2014-52072-P [en_GB]
dc.identifier.uri: http://hdl.handle.net/10871/39196
dc.language.iso: en [en_GB]
dc.publisher: Springer Verlag [en_GB]
dc.rights: © Springer Nature Switzerland AG 2019 [en_GB]
dc.title: Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework [en_GB]
dc.type: Conference paper [en_GB]
dc.date.available: 2019-10-15T09:07:45Z
dc.identifier.isbn: 9783030208899
dc.identifier.issn: 0302-9743
dc.description: This is the author accepted manuscript. The final version is available from Springer Verlag via the DOI in this record. [en_GB]
dc.description: ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, 2-6 December 2018
dc.identifier.journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) [en_GB]
dc.rights.uri: http://www.rioxx.net/licenses/all-rights-reserved [en_GB]
rioxxterms.version: AM [en_GB]
rioxxterms.licenseref.startdate: 2019-06-02
rioxxterms.type: Conference Paper/Proceeding/Abstract [en_GB]
refterms.dateFCD: 2019-10-15T09:03:18Z
refterms.versionFCD: AM
refterms.dateFOA: 2019-10-15T09:07:51Z
refterms.panel: B [en_GB]

