Modeling Salient Object-Object Interactions to Generate Textual Descriptions for Natural Images

Adeli, Hossein

dc.contributor.advisor	Tabrizi, M. H. N.	en_US
dc.contributor.author	Adeli, Hossein	en_US
dc.contributor.department	Computer Science	en_US
dc.date.accessioned	2012-09-04T18:08:17Z
dc.date.available	2014-10-01T14:45:54Z
dc.date.issued	2012	en_US
dc.description.abstract	In this thesis we consider the problem of automatically generating textual descriptions of images, which is useful in many applications. For example, searching and retrieving visual data among the overwhelming number of images and videos available on the Internet requires a better understanding of the multimedia content, which user-annotated tags and metadata do not provide. While this task remains very challenging for machines, humans can easily generate concise descriptions of images: they leave out what seems unnecessary or unrelated to the main point of the image and talk about the objects, their actions and attributes, their interactions with each other, and the context in which it all happens. Our method generates the image description in two main steps. First, using saliency maps and object detectors, it determines the objects that are of interest to the observer and hence should appear in the description of the image. Then the pose (body part configuration) of those objects/entities is used to recognize their individual actions and the interactions between them. To generate the sentences, we use a syntactic model that first orders the nouns (objects) and then builds sub-trees around the detected objects using the predicted actions. The model then combines those sub-trees using the recognized interactions; finally, the context of the interactions, which is detected with a separate algorithm, is added to create a full sentence for the image. The results show that our method improves the accuracy of the generated descriptions.	en_US
dc.description.degree	M.S.	en_US
dc.format.extent	51 p.	en_US
dc.format.medium	dissertations, academic	en_US
dc.identifier.uri	http://hdl.handle.net/10342/3935
dc.language.iso		en_US
dc.publisher	East Carolina University	en_US
dc.subject	Computer science	en_US
dc.subject	Artificial intelligence	en_US
dc.subject	Image processing	en_US
dc.subject	Image understanding	en_US
dc.subject	Sentence generation	en_US
dc.subject.lcsh	Imaging systems
dc.subject.lcsh	Image analysis
dc.subject.lcsh	Metadata
dc.title	Modeling Salient Object-Object Interactions to Generate Textual Descriptions for Natural Images	en_US
dc.type	Master's Thesis	en_US

Files

Original bundle

Name: AdeliJelodar_ecu_0600M_10757.pdf
Size: 1017.92 KB
Format: Adobe Portable Document Format

Collections

Computer Science
Master's Theses