{"@context":"http://iiif.io/api/presentation/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/manifest.json","@type":"sc:Manifest","label":"Multilingual Named Entity Recognition","metadata":[{"label":"dc.description.sponsorship","value":"This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree."},{"label":"dc.format","value":"Monograph"},{"label":"dc.format.medium","value":"Electronic Resource"},{"label":"dc.identifier.uri","value":"http://hdl.handle.net/11401/77292"},{"label":"dc.language.iso","value":"en_US"},{"label":"dc.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.abstract","value":"With the massive amounts of unannotated text available from myriad sources, learning representations useful for natural language processing (NLP) tasks is an increasingly popular research area. Using deep learning techniques, we learn distributed representations for words (word embeddings) using Wikipedia as the source of text for 40 languages. These distributed representations represent each word as a point in feature space and capture useful semantic and syntactic properties of words and have been shown to be useful in NLP tasks such as part-of-speech (POS) tagging. We have built two classes of word embeddings, namely Polyglot and Skipgram, for these languages. We build a named entity recognition (NER) system that supports 40 languages using the word embeddings we have generated as features and seek to use freely available Wikipedia text as training data. This involves training language models to obtain the word embeddings and understanding the properties of the learnt word embeddings, and culminates in learning models for named entity classification. We also present a novel technique for evaluating our performance on the myriad languages for which no gold data set for testing exists. 
Our results demonstrate that word embeddings exhibit nice community structure and can be used effectively for NER with no explicit hand-crafted feature engineering, and that they perform competitively with existing baselines when coupled with simple language-agnostic techniques."},{"label":"dcterms.available","value":"2017-09-20T16:52:21Z"},{"label":"dcterms.contributor","value":"Ramakrishnan, I.V."},{"label":"dcterms.creator","value":"Kulkarni, Vivek V."},{"label":"dcterms.dateAccepted","value":"2017-09-20T16:52:21Z"},{"label":"dcterms.dateSubmitted","value":"2017-09-20T16:52:21Z"},{"label":"dcterms.description","value":"Department of Computer Science."},{"label":"dcterms.extent","value":"54 pg."},{"label":"dcterms.format","value":"Application/PDF"},{"label":"dcterms.identifier","value":"http://hdl.handle.net/11401/77292"},{"label":"dcterms.issued","value":"2014-12-01"},{"label":"dcterms.language","value":"en_US"},{"label":"dcterms.provenance","value":"Made available in DSpace on 2017-09-20T16:52:21Z (GMT). No. 
of bitstreams: 1\nKulkarni_grad.sunysb_0771M_11763.pdf: 3131809 bytes, checksum: 5416190cdda412864b5ebb804ccc6964 (MD5)\n Previous issue date: 1"},{"label":"dcterms.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.subject","value":"Computer science"},{"label":"dcterms.title","value":"Multilingual Named Entity Recognition"},{"label":"dcterms.type","value":"Thesis"},{"label":"dc.type","value":"Thesis"}],"description":"This manifest was generated dynamically","viewingDirection":"left-to-right","sequences":[{"@type":"sc:Sequence","canvases":[{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json","@type":"sc:Canvas","label":"Page 1","height":1650,"width":1275,"images":[{"@type":"oa:Annotation","motivation":"sc:painting","resource":{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/15%2F03%2F02%2F150302315563949529239162818962691084656/full/full/0/default.jpg","@type":"dctypes:Image","format":"image/jpeg","height":1650,"width":1275,"service":{"@context":"http://iiif.io/api/image/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/15%2F03%2F02%2F150302315563949529239162818962691084656","profile":"http://iiif.io/api/image/2/level2.json"}},"on":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json"}]}]}]}