{"@context":"http://iiif.io/api/presentation/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/manifest.json","@type":"sc:Manifest","label":"Ensemble and Multimodal Learning for Anomaly Mining: Algorithms and Applications","metadata":[{"label":"dc.description.sponsorship","value":"This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree."},{"label":"dc.format","value":"Monograph"},{"label":"dc.format.medium","value":"Electronic Resource"},{"label":"dc.identifier.uri","value":"http://hdl.handle.net/11401/78239"},{"label":"dc.language.iso","value":"en_US"},{"label":"dcterms.abstract","value":"Anomaly detection is an important problem that has been studied in a broad spectrum of research areas due to its diverse applications in different domains. Many anomaly detection algorithms exist; some are domain-specific and others are more generic. Despite considerable advances in this research area, no single anomaly detector is known to work well across different datasets. In fact, designing a single method that is effective on a wide range of domains is a challenging task. Moreover, real-world data consists of multiple and diverse input modalities. Each modality is characterized by very different properties, which makes it difficult to ignore their differences. This calls for a multimodal learning approach that fuses various modalities into a single combined representation. Ensemble techniques for classification and clustering have long proven effective, yet anomaly ensembles have been barely studied. In this dissertation, we address this gap and design new ensemble approaches for anomaly mining. 
Specifically, we design (i) an ensemble approach, SELECT, which employs novel techniques to systematically select the results from multiple anomaly detectors as well as consensus approaches to assemble them, and (ii) a sequential ensemble approach, CARE, which employs a two-phase aggregation of the intermediate results of base detectors in each iteration to reach the final outcome by reducing both bias and variance. Both approaches are fully unsupervised, as ground truth is scarce in real-world data. We utilize SELECT for event detection in temporal graphs and both ensemble approaches for outlier detection in multidimensional point data (no-graph). We further improve CARE and develop iCARE, a faster isolation-based ensemble approach for massive datasets. Although diverse learning approaches for anomaly mining have been studied for decades, multimodal learning approaches for anomaly mining have been researched only more recently. In this line of recent work, a useful application of multimodal learning is opinion spam detection for online review data. We design a new holistic approach called SpEagle that utilizes clues from all metadata (text, timestamp, rating) as well as relational data (review-network), and harnesses them collectively under a unified framework to spot suspicious users and reviews. Moreover, this method can seamlessly integrate semi-supervision by incorporating labels, achieving improved performance. Furthermore, we improve the SpEagle framework with active inference. We design a method called Expected UnCertainty Reach (EUCR), which at each step picks a node that has high uncertainty, lies in a dense region, and is close to other uncertain nodes. 
We evaluate our ensembles and multimodal learning approaches on large-scale real-world datasets and they provide improved performance over the existing baselines and state-of-the-art anomaly mining approaches."},{"label":"dcterms.available","value":"2018-06-21T13:38:40Z"},{"label":"dcterms.contributor","value":"Fodor, Paul"},{"label":"dcterms.creator","value":"Rayana, Shebuti"},{"label":"dcterms.dateAccepted","value":"2018-06-21T13:38:40Z"},{"label":"dcterms.dateSubmitted","value":"2018-06-21T13:38:40Z"},{"label":"dcterms.description","value":"Department of Computer Science"},{"label":"dcterms.extent","value":"167 pg."},{"label":"dcterms.format","value":"Monograph"},{"label":"dcterms.identifier","value":"http://hdl.handle.net/11401/78239"},{"label":"dcterms.issued","value":"2017-12-01"},{"label":"dcterms.language","value":"en_US"},{"label":"dcterms.provenance","value":"Made available in DSpace on 2018-06-21T13:38:40Z (GMT). No. of bitstreams: 1\nRayana_grad.sunysb_0771E_13526.pdf: 8788748 bytes, checksum: 350fbe07a9aa5764f66bcd40c4b3f605 (MD5)\n Previous issue date: 12"},{"label":"dcterms.subject","value":"anomaly mining"},{"label":"dcterms.title","value":"Ensemble and Multimodal Learning for Anomaly Mining: Algorithms and Applications"},{"label":"dcterms.type","value":"Dissertation"},{"label":"dc.type","value":"Dissertation"}],"description":"This manifest was generated dynamically","viewingDirection":"left-to-right","sequences":[{"@type":"sc:Sequence","canvases":[{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json","@type":"sc:Canvas","label":"Page 
1","height":1650,"width":1275,"images":[{"@type":"oa:Annotation","motivation":"sc:painting","resource":{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/71%2F61%2F78%2F71617857025890799649464377639686314700/full/full/0/default.jpg","@type":"dctypes:Image","format":"image/jpeg","height":1650,"width":1275,"service":{"@context":"http://iiif.io/api/image/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/71%2F61%2F78%2F71617857025890799649464377639686314700","profile":"http://iiif.io/api/image/2/level2.json"}},"on":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json"}]}]}]}