{"@context":"http://iiif.io/api/presentation/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/manifest.json","@type":"sc:Manifest","label":"Automatic Synthesis of Instruction Set Semantics and its Applications","metadata":[{"label":"dc.description.sponsorship","value":"This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree."},{"label":"dc.format","value":"Monograph"},{"label":"dc.format.medium","value":"Electronic Resource"},{"label":"dc.identifier.uri","value":"http://hdl.handle.net/11401/77263"},{"label":"dc.language.iso","value":"en_US"},{"label":"dc.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.abstract","value":"Binary analysis, translation and instrumentation tools play an important role in software security. To support binaries for different processors, it is necessary to incorporate the semantics of every processor's instruction set into the tool. Unfortunately, the complexity of modern instruction sets makes the common approach of manual semantics modeling cumbersome and error-prone. Furthermore, it limits the number of processors as well as the fraction of the instruction set that is supported. In this dissertation, we propose novel architecture-neutral techniques for automatically synthesizing the semantics of instruction sets. Our approach relies on the observation that modern compilers such as GCC and LLVM already contain detailed knowledge about the semantics of many instruction sets. We therefore develop two techniques for extracting this knowledge. Our first technique relies on a learning process: observing examples of translation between a compiler's architecture-neutral internal representation and machine instructions, and inferring the mapping from these examples. We then develop a second (and complementary) method that develops symbolic execution techniques to extract this mapping from the code generator source. Unlike previous symbolic execution systems that specialize in generating a single solution to a set of constraints, our problem requires a compact representation of all possible solutions. We describe the development of such a system, based on source-to-source transformation of C-code and a runtime system that is implemented in C and Prolog with a finite-domain constraint solver (CLP-FD). To demonstrate the applicability of synthesized instruction-set semantics, we develop two applications. In the first application, we use synthesized semantics to test correctness of code generators. Specifically, we develop a new testing approach that generates and executes test cases based on the derived semantic model for each instruction. We uncovered nontrivial bugs in the GCC code generator using this technique. As a second application, we have used these models to lift binaries for x86, ARM and AVR (used in Arduino and other microcontroller) architectures to intermediate code, which can then be analyzed or instrumented in an architecture-independent manner."},{"label":"dcterms.available","value":"2017-09-20T16:52:19Z"},{"label":"dcterms.contributor","value":"Srinivas, Suresh."},{"label":"dcterms.creator","value":"Hasabnis, Niranjan"},{"label":"dcterms.dateAccepted","value":"2017-09-20T16:52:19Z"},{"label":"dcterms.dateSubmitted","value":"2017-09-20T16:52:19Z"},{"label":"dcterms.description","value":"Department of Computer Science."},{"label":"dcterms.extent","value":"198 pg."},{"label":"dcterms.format","value":"Monograph"},{"label":"dcterms.identifier","value":"http://hdl.handle.net/11401/77263"},{"label":"dcterms.issued","value":"2015-05-01"},{"label":"dcterms.language","value":"en_US"},{"label":"dcterms.provenance","value":"Made available in DSpace on 2017-09-20T16:52:19Z (GMT). No. of bitstreams: 1\nHasabnis_grad.sunysb_0771E_12478.pdf: 1046310 bytes, checksum: e0c319564bfe668953a84ca42caa716b (MD5)\n Previous issue date: 2015"},{"label":"dcterms.publisher","value":"The Graduate School, Stony Brook University: Stony Brook, NY."},{"label":"dcterms.subject","value":"Computer science"},{"label":"dcterms.title","value":"Automatic Synthesis of Instruction Set Semantics and its Applications"},{"label":"dcterms.type","value":"Dissertation"},{"label":"dc.type","value":"Dissertation"}],"description":"This manifest was generated dynamically","viewingDirection":"left-to-right","sequences":[{"@type":"sc:Sequence","canvases":[{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json","@type":"sc:Canvas","label":"Page 1","height":1650,"width":1275,"images":[{"@type":"oa:Annotation","motivation":"sc:painting","resource":{"@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/91%2F42%2F13%2F91421358224251668679157863737316954317/full/full/0/default.jpg","@type":"dctypes:Image","format":"image/jpeg","height":1650,"width":1275,"service":{"@context":"http://iiif.io/api/image/2/context.json","@id":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/91%2F42%2F13%2F91421358224251668679157863737316954317","profile":"http://iiif.io/api/image/2/level2.json"}},"on":"https://repo.library.stonybrook.edu/cantaloupe/iiif/2/canvas/page-1.json"}]}]}]}