{"@context":"http://iiif.io/api/presentation/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/manifest.json","@type":"sc:Manifest","label":"Composing Image Descriptions in Natural Language","metadata":[{"label":"dc.description.sponsorship","value":"This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree."},{"label":"dc.format","value":"Monograph"},{"label":"dc.format.medium","value":"Electronic Resource"},{"label":"dc.identifier.uri","value":"http://hdl.handle.net/11401/77293"},{"label":"dc.language.iso","value":"en_US"},{"label":"dc.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.abstract","value":"We study the task of image description generation, which can find applications in image search, web accessibility research, story illustration, etc. Rather than concentrating on precise but robotic descriptions, we aim to generate captions, which are human-like, but which are still relevant to the image content. Human generated text is nontrivial in structure and vocabulary. A purely bottom-up approach, relying only on vision detection vocabulary, would struggle to generate such a description as \"A cute squirrel having a feast under a tree\". To generate descriptions, which are close to human-like in their complexity and richness, we exploit a vast amount of human-written text available on the Internet and use a dataset of images associated with their captions written by users of the web-site Flickr. Based on various aspects of the target image, we collect a set of matching images. From the human-written captions of the obtained images we elicit candidate phrases associated with the matching aspects. We selectively glue together extracted phrases into plausible descriptions, using linguistic patterns and parse tree structure. 
We tackle this non-trivial task by modeling it as an Integer Linear Programming problem and introducing a novel tree-driven phrase composition framework. As an optional preprocessing step to the generation process, we introduce the task of image caption generalization, the aim of which is to remove extraneous information from image captions written by Flickr users. Evaluation results show that, when using generalized captions as a new source of candidate phrases, we are able to generate descriptions of a better quality in terms of relevance, whilst achieving expressiveness and linguistic sophistication of the resulting output."},{"label":"dcterms.available","value":"2017-09-20T16:52:22Z"},{"label":"dcterms.contributor","value":"Fodor, Paul"},{"label":"dcterms.creator","value":"Kuznetsova, Polina"},{"label":"dcterms.dateAccepted","value":"2017-09-20T16:52:22Z"},{"label":"dcterms.dateSubmitted","value":"2017-09-20T16:52:22Z"},{"label":"dcterms.description","value":"Department of Computer Science."},{"label":"dcterms.extent","value":"146 pg."},{"label":"dcterms.format","value":"Monograph"},{"label":"dcterms.identifier","value":"http://hdl.handle.net/11401/77293"},{"label":"dcterms.issued","value":"2015-08-01"},{"label":"dcterms.language","value":"en_US"},{"label":"dcterms.provenance","value":"Made available in DSpace on 2017-09-20T16:52:22Z (GMT). No. 
of bitstreams: 1\nKuznetsova_grad.sunysb_0771E_12042.pdf: 79835138 bytes, checksum: ce218c6e155d7dd59b4af11b69e42573 (MD5)\n  Previous issue date: 2014"},{"label":"dcterms.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.subject","value":"image descriptions, natural language generation, natural language processing"},{"label":"dcterms.title","value":"Composing Image Descriptions in Natural Language"},{"label":"dcterms.type","value":"Dissertation"},{"label":"dc.type","value":"Dissertation"}],"description":"This manifest was generated dynamically","viewingDirection":"left-to-right","sequences":[{"@type":"sc:Sequence","canvases":[{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json","@type":"sc:Canvas","label":"Page 1","height":1650,"width":1275,"images":[{"@type":"oa:Annotation","motivation":"sc:painting","resource":{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/46%2F99%2F82%2F4699829237542725595591641211984410279/full/full/0/default.jpg","@type":"dctypes:Image","format":"image/jpeg","height":1650,"width":1275,"service":{"@context":"http://iiif.io/api/image/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/46%2F99%2F82%2F4699829237542725595591641211984410279","profile":"http://iiif.io/api/image/2/level2.json"}},"on":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json"}]}]}]}