{"@context":"http://iiif.io/api/presentation/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/manifest.json","@type":"sc:Manifest","label":"Statistical Comparison of Measurement Platforms","metadata":[{"label":"dc.description.sponsorship","value":"This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree."},{"label":"dc.format","value":"Monograph"},{"label":"dc.format.medium","value":"Electronic Resource"},{"label":"dc.identifier.uri","value":"http://hdl.handle.net/11401/76571"},{"label":"dc.language.iso","value":"en_US"},{"label":"dc.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.abstract","value":"This thesis proposes a novel statistical method, based on the generalized linear errors-in-variables model, for comparing two measurement platforms, one with discrete and one with continuous outcomes. The method overcomes a limitation of the classical platform comparison method, which relies on linear models and can accommodate only two continuous outcome measures. The method was applied to model two gene expression measurement platforms: Microarray (continuous) and RNA-Seq (discrete). The comparison result is further validated by differentially expressed gene analysis and biological pathway analysis. The proposed approach could play a significant role in 1) systematically assessing emerging platforms against existing platforms and 2) serving as a foundation for integrating data sets generated from different platforms. To perform the platform comparison, a model is built between Microarray and RNA-Seq gene expression profiles, based on established distribution assumptions, for the purpose of estimating fixed and proportional biases. 
From both a biological and a technical point of view, the variation and dispersion in the measured expression profiles are considered gene-specific, which means that realistic models of whole-genome expression profile data sets contain a large number of nuisance parameters, and each platform features only a limited number of replicates because of the high cost of measuring samples on both platforms. Consequently, common estimates of those parameters obtained from the limited replicates have large variances, so substituting them directly into the model's likelihood function does not lead to appropriate estimates. Additionally, because the number of parameters in the model is often in the tens of thousands, estimating nuisance parameters through their maximum likelihood estimators (MLE) is computationally infeasible. To overcome these limitations, we further developed a customized estimation method for the proposed generalized linear errors-in-variables model based on unbiased estimating equations (UEE), which yield estimators in analytical form, in lieu of maximum likelihood estimation. Under suitable distribution assumptions for the platforms, the new estimator is proven theoretically to converge to the underlying truth with a small bias, which is due to the inherently low counts in the discrete platform. 
The performance of the proposed method was first evaluated on simulated data sets with a modest number (three, five, and ten) of replicates, and the method was subsequently applied to compare published Microarray and RNA-Seq data sets."},{"label":"dcterms.available","value":"2017-09-20T16:50:40Z"},{"label":"dcterms.contributor","value":"Zhu, Wei"},{"label":"dcterms.creator","value":"Zhang, Yuanhao"},{"label":"dcterms.dateAccepted","value":"2017-09-20T16:50:40Z"},{"label":"dcterms.dateSubmitted","value":"2017-09-20T16:50:40Z"},{"label":"dcterms.description","value":"Department of Applied Mathematics and Statistics."},{"label":"dcterms.extent","value":"100 pg."},{"label":"dcterms.format","value":"Application/PDF"},{"label":"dcterms.identifier","value":"http://hdl.handle.net/11401/76571"},{"label":"dcterms.issued","value":"2015-08-01"},{"label":"dcterms.language","value":"en_US"},{"label":"dcterms.provenance","value":"Made available in DSpace on 2017-09-20T16:50:40Z (GMT). No. of bitstreams: 1\nZhang_grad.sunysb_0771E_12186.pdf: 2596069 bytes, checksum: e2c359b6d07bcb54a1267400f47038d9 (MD5)\n Previous issue date: 2014"},{"label":"dcterms.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.subject","value":"Statistics"},{"label":"dcterms.title","value":"Statistical Comparison of Measurement Platforms"},{"label":"dcterms.type","value":"Dissertation"},{"label":"dc.type","value":"Dissertation"}],"description":"This manifest was generated dynamically","viewingDirection":"left-to-right","sequences":[{"@type":"sc:Sequence","canvases":[{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json","@type":"sc:Canvas","label":"Page 
1","height":1650,"width":1275,"images":[{"@type":"oa:Annotation","motivation":"sc:painting","resource":{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/12%2F50%2F74%2F125074133868721138915219074006194009358/full/full/0/default.jpg","@type":"dctypes:Image","format":"image/jpeg","height":1650,"width":1275,"service":{"@context":"http://iiif.io/api/image/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/12%2F50%2F74%2F125074133868721138915219074006194009358","profile":"http://iiif.io/api/image/2/level2.json"}},"on":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json"}]}]}]}