|
||||
|
Characterizing Flash Memory: Anomalies, Observations and Applications, ,
The 42nd Annual IEEE/ACM International Symposium on Microarchitecture, 2009. McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures, , MICRO '09: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, 2009. Evaluating the impact of job scheduling and power management on processor lifetime for chip multiprocessors, , SIGMETRICS '09: Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems, New York, NY, USA, 2009, pages 169-180. Fast switching of threads between cores, , SIGOPS Operating Systems Review 43(2):35-45, 2009. Reducing Peak Power with a Table-Driven Adaptive Processor Core, , MICRO 42: Proceedings of the 42nd annual IEEE/ACM International Symposium on Microarchitecture, New York City, NY, USA, 2009. Creating artificial global history to improve branch prediction accuracy, , ICS '09: Proceedings of the 23rd international conference on Supercomputing, New York, NY, USA, 2009, pages 266-275. Mapping Out a Path from Hardware Transactional Memory to Speculative Multithreading, , PACT '09: Proceedings of the 18th international conference on parallel architectures and compilation techniques, 2009. Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications, , ASPLOS 2009: Proceedings of the 14th international conference on Architectural support for programming languages and operating systems, 2009. (Selected for IEEE Micro "Top Picks" 2009). |
||||
|
Tiled Multicore Processors, ,
Multicore Processors and Systems, 2008. The shared-thread multiprocessor, , ICS '08: Proceedings of the 22nd annual international conference on Supercomputing, New York, NY, USA, 2008, pages 73-82. Accurate branch prediction for short threads, , ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2008, pages 125-134. Compiler Techniques for Reducing Data Cache Miss Rate on a Multithreaded Architecture, , International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2008), January 2008. |
||||
|
Tiled microprocessors, ,
Ph.D. thesis, 2007. FPGA Global Routing Architecture Optimization Using a Multicommodity Flow Approach, , IEEE Int. Conf. on Computer Design:144-151, 2007. Stream Multicore Processors, , Processor Design: System-on-chip Computing for ASICs and FPGAs, 2007. Runtime checking for program verification, , 7th International Workshop Runtime Verification, Revised Selected Papers 4839/2007:202-213, 2007. The Architecture of Efficient Multi-Core Processors: A Holistic Approach, , In Advances in Computers. M. V. Zelkowitz, editor. Academic Press, 2007. Proximity-aware directory-based coherence for multi-core processor architectures, , SPAA '07: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, New York, NY, USA, 2007, pages 126-134. Accelerating and Adapting Precomputation Threads for Efficient Prefetching, , High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on:85-95, Feb. 2007. Patching Processor Design Errors with Programmable Hardware, , IEEE Micro 27(1):12-25, 2007. A Loop Correlation Technique to Improve Performance Auditing, , PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques, Washington, DC, USA, 2007, pages 259-269. Representative Multiprogram Workloads for Multithreaded Processor Simulation, , Workload Characterization, 2007. IISWC 2007. IEEE 10th International Symposium on:193-203, Sept. 2007. Automatically classifying benign and harmful data races using replay analysis, , SIGPLAN Not. 42(6):22-31, 2007. Cross Binary Simulation Points, , Performance Analysis of Systems & Software, 2007. ISPASS 2007. IEEE International Symposium on:179-189, April 2007. Transient Fault Prediction Based on Anomalies in Processor Events, , Design, Automation & Test in Europe Conference & Exhibition, 2007. DATE '07:1-6, April 2007.
Accelerating and Adapting Precomputation Threads for Efficient Prefetching, , HPCA '07: Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, Washington, DC, USA, 2007, pages 85-95. The WaveScalar architecture, , ACM Trans. Comput. Syst. 25(2):4, 2007. |
||||
|
Efficient Sampling Startup for SimPoint, ,
IEEE Micro 26(4):32-42, 2006. A self-repairing prefetcher in an event-driven dynamic optimization framework, , Code Generation and Optimization, 2006. CGO 2006. International Symposium on: 12 pp.-, March 2006. Dynamic Code Value Specialization Using the Trace Cache Fill Unit, , Computer Design, 2006. ICCD 2006. International Conference on:10-16, Oct. 2006. Application-specific customization of parameterized FPGA soft-core processors, , ICCAD '06: Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design, New York, NY, USA, 2006, pages 261-268. Conjoining soft-core FPGA processors, , ICCAD '06: Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design, New York, NY, USA, 2006, pages 694-701. Core architecture optimization for heterogeneous chip multiprocessors, , PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques, New York, NY, USA, 2006, pages 23-32. Exploiting unbalanced thread scheduling for energy and performance on a CMP of SMT processors, , Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International: 10 pp.-, April 2006. Processor Power Reduction Via Single-ISA Heterogeneous Multi-Core Architectures, , IEEE Comput. Archit. Lett. 1(1):5-8, 2006. Efficient Sampling Startup for SimPoint, , Micro, IEEE 26(4): 32-42, July-Aug. 2006. Online performance auditing: using hot optimizations without getting burned, , SIGPLAN Not. 41(6):239-251, 2006. Automatic logging of operating system effects to guide application-level architecture simulation, , SIGMETRICS '06/Performance '06: Proceedings of the joint international conference on Measurement and modeling of computer systems, New York, NY, USA, 2006, pages 216-227. Using Machine Learning to Guide Architecture Simulation, , J. Mach. Learn.
Res. 7:343-378, 2006. Detecting phases in parallel applications on shared memory architectures, , Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International: 10 pp.-, April 2006. Comparing multinomial and k-means clustering for SimPoint, , Performance Analysis of Systems and Software, 2006 IEEE International Symposium on: 131-142, March 2006. Considering all starting points for simultaneous multithreading simulation, , Performance Analysis of Systems and Software, 2006 IEEE International Symposium on: 143-153, March 2006. Selecting software phase markers with code structure analysis, , Code Generation and Optimization, 2006. CGO 2006. International Symposium on: 12 pp.-, March 2006. A Self-Repairing Prefetcher in an Event-Driven Dynamic Optimization Framework, , CGO '06: Proceedings of the International Symposium on Code Generation and Optimization, Washington, DC, USA, 2006, pages 50-64. Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers, , MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2006, pages 235-246. Unbounded page-based transactional memory, , SIGPLAN Not. 41(11):347-358, 2006. Recording shared memory dependencies using strata, , SIGARCH Comput. Archit. News 34(5):229-240, 2006. Area-Performance Trade-offs in Tiled Dataflow Architectures, , ISCA '06: Proceedings of the 33rd annual international symposium on Computer Architecture, Washington, DC, USA, 2006, pages 314-326. Reducing control overhead in dataflow architectures, , PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques, 2006, pages 182-191. Modeling instruction placement on a spatial architecture, , SPAA '06: Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, 2006, pages 158-169. 
Instruction scheduling for a tiled dataflow architecture, , ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, 2006, pages 141-150. |
||||
|
Dynamic phase analysis for cycle-close trace generation, ,
CODES+ISSS '05: Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, New York, NY, USA, 2005, pages 321-326. An Event-Driven Multithreaded Dynamic Optimization Framework, , PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2005, pages 87-98. Variational Path Profiling, , PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2005, pages 7-16. The entropia virtual machine for desktop grids, , VEE '05: Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments, New York, NY, USA, 2005, pages 186-196. BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging, , SIGARCH Comput. Archit. News 33(2):284-295, 2005. A Dependency Chain Clustered Microarchitecture, , IPDPS '05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers, Washington, DC, USA, 2005, pages 21.2. Scalar Operand Networks, , IEEE Trans. Parallel Distrib. Syst. 16(2):145-162, 2005. Heterogeneous chip multiprocessors, , Computer 38(11): 32-38, Nov. 2005. An event-driven multithreaded dynamic optimization framework, , Parallel Architectures and Compilation Techniques, 2005. PACT 2005. 14th International Conference on: 87-98, Sept. 2005. Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices, , PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, New York, NY, USA, 2005, pages 269-279. Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling, , SIGARCH Comput. Archit. News 33(2):408-419, 2005. A Tree Based Router Search Engine Architecture with Single Port Memories, , SIGARCH Comput. Archit. News 33(2):123-133, 2005. 
Multithreaded Value Prediction, , HPCA '05: Proceedings of the 11th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, 2005, pages 5-15. Architecture-Level Power Optimizations -- What Are the Limits?, , Journal of Instruction Level Parallelism, 7 (2005), 2005, pages 1-20. The Danger of Interval-Based Power Efficiency Metrics: When Worst Is Best, , IEEE Comput. Archit. Lett. 4(1):1, 2005. The Strong Correlation Between Code Signatures and Performance, , ISPASS '05: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005, Washington, DC, USA, 2005, pages 236-247. Motivation for Variable Length Intervals and Hierarchical Phase Behavior, , ispass:135-146, 2005. Transition Phase Classification and Prediction, , hpca 00:278-289, 2005. The Microarchitecture of a Pipelined WaveScalar Processor: An RTL-Based Study, , University of Washington Computer Science & Engineering technical report TR-2005-11-02, 2005. Balancing Parallelism and Sequentiality in Dataflow Models: Wave-ordered Memory, , University of Washington Computer Science & Engineering technical report TR-2005-10-03, 2005. |
||||
|
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams, ,
ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, Washington, DC, USA, 2004, page 2. Control Flow Optimization Via Dynamic Reconvergence Prediction, , MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2004, pages 129-140. Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy, , MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2004, pages 183-194. Conjoined-Core Chip Multiprocessing, , MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2004, pages 195-206. Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance, , SIGARCH Comput. Archit. News 32(2):64, 2004. Hardware and Binary Modification Support for Code Pointer Protection From Buffer Overflow, , MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2004, pages 209-220. Balancing design options with Sherpa, , CASES '04: Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, New York, NY, USA, 2004, pages 57-68. BitRaker Anvil: Binary Instrumentation for Rapid Creation of Simulation and Workload Analysis Tools, , Proc. of the 2004 Global Signal Processing Expo (GSPx), Santa Clara, CA, USA, September 2004. How to use SimPoint to pick simulation points, , SIGMETRICS Perform. Eval. Rev. 31(4):25-30, 2004. Using a serial cache for energy efficient instruction fetching, , J. Syst. Archit. 50(11):675-685, 2004.
Structures for phase classification, , Performance Analysis of Systems and Software, 2004 IEEE International Symposium on - ISPASS: 57-67, 2004. A co-phase matrix to guide simultaneous multithreading simulation, , Performance Analysis of Systems and Software, 2004 IEEE International Symposium on - ISPASS: 45-56, 2004. Deterministic memory-efficient string matching algorithms for intrusion detection, , INFOCOM 2004. Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies 4: 2628-2639 vol.4, March 2004. Creating Converged Trace Schedules Using String Matching, , HPCA '04: Proceedings of the 10th International Symposium on High Performance Computer Architecture, Washington, DC, USA, 2004, page 210. Clustered Multithreaded Architectures -- Pursuing both IPC and Cycle Time, , ipdps 01:76b, 2004. System support for pervasive applications, , ACM Trans. Comput. Syst. 22(4):421-486, 2004. The Death of ILP, , ASPLOS XI Wild and Crazy Idea Session, 2004. |
||||
|
Discovering and exploiting program phases, ,
Micro, IEEE 23(6): 84-93, Nov.-Dec. 2003. Reducing code size with echo instructions, , CASES '03: Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems, New York, NY, USA, 2003, pages 84-94. Picking Statistically Valid and Early Simulation Points, , PACT '03: Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2003, page 244. Using SimPoint for accurate and efficient simulation, , SIGMETRICS Perform. Eval. Rev. 31(1):318-319, 2003. A pipelined memory architecture for high throughput network processors, , SIGARCH Comput. Archit. News 31(2):288-299, 2003. Phase tracking and prediction, , SIGARCH Comput. Archit. News 31(2):336-349, 2003. Predicate prediction for efficient out-of-order execution, , ICS '03: Proceedings of the 17th annual international conference on Supercomputing, New York, NY, USA, 2003, pages 183-192. Phi-Predication for light-weight if-conversion, , CGO '03: Proceedings of the international symposium on Code generation and optimization, Washington, DC, USA, 2003, pages 179-190. A Decoupled Predictor-Directed Stream Prefetching Architecture, , IEEE Trans. Comput. 52(3):260-276, 2003. Entropia: architecture and performance of an enterprise desktop grid system, , J. Parallel Distrib. Comput. 63(5):597-610, 2003. Catching accurate profiles in hardware, , High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. The Ninth International Symposium on: 269-280, Feb. 2003. Incorporating predicate information into branch predictors, , High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. The Ninth International Symposium on: 53-64, Feb. 2003. Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures, , HPCA '03: Proceedings of the 9th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, 2003, page 341. 
A 16-issue multiple-program-counter microprocessor with point-to-point scalar operand network, , Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC. 2003 IEEE International: 170-171 vol.1, 2003. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction, , MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2003, page 81. Initial Observations of the Simultaneous Multithreading Pentium 4 Processor, , PACT '03: Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2003, page 26. The Effect of Compiler Optimizations on Pentium 4 Power Consumption, , INTERACT '03: Proceedings of the Seventh Workshop on Interaction between Compilers and Computer Architectures, Washington, DC, USA, 2003, page 51. Exploring the Potential of Architecture-Level Power Optimizations, , PACS, 2003, pages 132-147. A multi-core approach to addressing the energy-complexity problem in microprocessors, , , 2003. An evaluation of speculative instruction execution on simultaneous multithreaded processors, , ACM Trans. Comput. Syst. 21(3):314-340, 2003. Measuring the Complexity-effectiveness of Future-generation Silicon Architectures using FPGAs: A Status Report, , Workshop on Complexity-effective Design, June 2003. Dataflow: The Road Less Complex, , Workshop on Complexity-effective Design, 2003. |
||||
|
Pointer cache assisted prefetching, ,
Microarchitecture, 2002. (MICRO-35). Proceedings. 35th Annual
IEEE/ACM International Symposium on: 62-73, 2002. Symbiotic jobscheduling with priorities for a simultaneous multithreading processor, , SIGMETRICS '02: Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, New York, NY, USA, 2002, pages 66-76. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs, , IEEE Micro 22(2):25-35, 2002. Automatically characterizing large scale program behavior, , SIGARCH Comput. Archit. News 30(5):45-57, 2002. Quantifying Instruction Criticality, , PACT '02: Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2002, page 104. Using predicate path information in hardware to determine true dependences, , ICS '02: Proceedings of the 16th international conference on Supercomputing, New York, NY, USA, 2002, pages 230-240. An EPIC Processor with Pending Functional Units, , ISHPC '02: Proceedings of the 4th International Symposium on High Performance Computing, London, UK, 2002, pages 310-320. Quantifying Load Stream Behavior, , HPCA '02: Proceedings of the 8th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, 2002, page 197. Pointer cache assisted prefetching, , MICRO 35: Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture, Los Alamitos, CA, USA, 2002, pages 62-73. Compiling for instruction cache performance on a multithreaded architecture, , MICRO 35: Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture, Los Alamitos, CA, USA, 2002, pages 419-429. Quantifying Instruction Criticality, , pact 00:104, 2002.
Configuration by Combustion: Online Simulated Annealing for Dynamic Hardware Configuration, , ASPLOS X Wild and Crazy Idea Session, 2002. Towards a Universal Building Block of Molecular and Silicon Computation, , Workshop on Non-Silicon Computing, 2002. |
||||
|
Handling long-latency loads in a simultaneous multithreading
processor, ,
MICRO 34: Proceedings of the 34th annual ACM/IEEE
international symposium on Microarchitecture, Washington, DC, USA, 2001, pages 318-327. Speculative precomputation: long-range prefetching of delinquent loads, , ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, New York, NY, USA, 2001, pages 14-25. Dynamic prediction of critical path instructions, , High-Performance Computer Architecture, 2001. HPCA. The Seventh International Symposium on:185-195, 2001. Automated design of finite state machine predictors for customized processors, , ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, New York, NY, USA, 2001, pages 86-97. Patchable Instruction ROM Architecture, , technical report , 2001. Basic Block Distribution Analysis to Find Periodic Behavior and Simulation Points in Applications, , PACT '01: Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 2001, pages 3-14. Reducing delay with dynamic selection of compression formats, , High Performance Distributed Computing, 2001. Proceedings. 10th IEEE International Symposium on:266-277, 2001. Optimizations Enabled by a Decoupled Front-End Architecture, , IEEE Trans. Comput. 50(4):338-355, 2001. Reducing the Overhead of Dynamic Compilation, , Software: Practice and Experience 31:717-738, March 2001. Dynamic Prediction of the Critical Dependence Path, , 7th International Symposium On High Performance Computer Architecture, January 2001. Dynamic speculative precomputation, , MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 2001, pages 306-317.
Reducing power with dynamic critical path information, , MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 2001, pages 114-123. System-level Programming Abstractions for Ubiquitous Computing, , Workshop on Application Models and Programming Tools for Ubiquitous Computing, 2001. Programming for Pervasive Computing Environments, , University of Washington Computer Science & Engineering technical report UW-CSE-01-06-01, 2001. Systems Directions for Pervasive Computing, , Proceedings of the 8th Workshop on Hot Topics in Operating Systems, 2001. |
||||
|
Power-sensitive multithreaded architecture, ,
Computer Design, 2000. Proceedings. 2000 International Conference
on:199-206, 2000. Limits of task-based parallelism in irregular applications, , SIGARCH Comput. Archit. News 28(1):20-20, 2000. Predictor-directed stream buffers, , MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, New York, NY, USA, 2000, pages 42-53. Path Analysis and Renaming for Predicated Instruction Scheduling, , Int. J. Parallel Program. 28(6):563-588, 2000. Loop Termination Prediction, , ISHPC '00: Proceedings of the Third International Symposium on High Performance Computing, London, UK, 2000, pages 73-87. ToolBlocks: An Infrastructure for the Construction of Memory Hierarchy Analysis Tools (Research Note), , Euro-Par '00: Proceedings from the 6th International Euro-Par Conference on Parallel Processing, London, UK, 2000, pages 70-74. A Comparative Survey of Load Speculation Architectures, , , 2000. Using Annotations to Reduce Dynamic Optimization Time, , technical report , 2000. Scheduling Classes on a College Campus, , Computational Optimization and Applications 16(3), 2000. |
||||
|
Hardware identification of cache conflict misses, ,
MICRO 32: Proceedings of the 32nd annual ACM/IEEE
international symposium on Microarchitecture, Washington, DC, USA, 1999, pages 126-135. ILP versus TLP on SMT, , Supercomputing '99: Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), New York, NY, USA, 1999, page 37. Classifying load and store instructions for memory renaming, , ICS '99: Proceedings of the 13th international conference on Supercomputing, New York, NY, USA, 1999, pages 399-407. Storageless value prediction using prior register values, , SIGARCH Comput. Archit. News 27(2):270-279, 1999. Selective value prediction, , ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, Washington, DC, USA, 1999, pages 64-74. Software-Directed Register Deallocation for Simultaneous Multithreaded Processors, , IEEE Trans. Parallel Distrib. Syst. 10(9):922-933, 1999. Instruction Recycling on a Multiple-Path Processor, , HPCA '99: Proceedings of the 5th International Symposium on High Performance Computer Architecture, Washington, DC, USA, 1999, page 44. Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor, , HPCA '99: Proceedings of the 5th International Symposium on High Performance Computer Architecture, Washington, DC, USA, 1999, page 54. A Scalable Front-End Architecture for Fast Instruction Delivery, , isca 00:0234, 1999. The precomputed-branch architecture: efficient branches with compiler support, , J. Syst. Archit. 45(9):651-679, 1999. A comparison of software code reordering and victim buffers, , SIGARCH Comput. Archit. News 27(1):51-54, 1999. Fetch directed instruction prefetching, , MICRO 32: Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 1999, pages 16-27. Reducing transfer delay using Java class file splitting and prefetching, , SIGPLAN Not. 34(10):276-291, 1999. 
Predicated Static Single Assignment, , PACT '99: Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, 1999, page 245. Time Varying Behavior of Programs, , technical report , 1999. Reducing cache misses using hardware and software page placement, , ICS '99: Proceedings of the 13th international conference on Supercomputing, New York, NY, USA, 1999, pages 155-164. General Techniques for Multithreading Algorithms, , Proceedings of 1999 International Conference on Parallel and Distributed Techniques and Algorithms, 1999. |
||||
| 9500 Gilman Drive, La Jolla, CA 92093-0114 |
| Copyright © 2003 Regents of the University of California. All rights reserved. |