Under advisor Scott Baden.
Abstract: Numerical simulations of technologically important phenomena can generate large datasets; extracting knowledge from these voluminous datasets is a technical challenge. Consider a common case where the simulation data is stored at the points of a regularly spaced mesh in space and time. Most scientists use ad hoc methods for their analysis. Some application domains are able to use relational databases for their large datasets, but the relational model isn't appropriate for many application classes. Analysis algorithms in scientific computing typically compute aggregation and stencil operations. While relational databases work well with aggregates, they are poorly suited to range queries, especially in more than one dimension. Scientific data, therefore, is usually stored in flat files with easily calculated offsets for each point. This essentially forces scientists to handle their own data serialization instead of relying on specialized (and optimized) software. We want to provide to scientific computing what the relational database has provided to businesses for so many years. The computational database manages on-disk storage for user-defined data types and executes user-defined functions and queries over those types. Furthermore, we are working on automating the optimization of the analysis algorithms for multiple compute targets, including both the host CPU and expansion cards, such as GPUs.
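To make the flat-file layout mentioned in the abstract concrete, here is a minimal sketch in Python. It assumes one little-endian 64-bit float per mesh point stored in row-major order; the names point_offset, read_point, and laplacian_2d are hypothetical and are not part of the computational database itself. It shows how a point's byte offset follows from its mesh index and how a 5-point stencil turns into several separate offset calculations.

```python
import struct

ITEM_SIZE = 8  # assumption: one little-endian float64 per mesh point


def point_offset(shape, index, item_size=ITEM_SIZE):
    """Byte offset of a mesh point in a row-major flat file."""
    offset = 0
    for extent, i in zip(shape, index):
        if not 0 <= i < extent:
            raise IndexError(f"index {index} outside mesh of shape {shape}")
        offset = offset * extent + i
    return offset * item_size


def read_point(buf, shape, index):
    """Read the value stored at one mesh point."""
    (value,) = struct.unpack_from("<d", buf, point_offset(shape, index))
    return value


def laplacian_2d(buf, shape, y, x):
    """5-point stencil at an interior point of a 2-D mesh (unit spacing)."""
    neighbors = (
        read_point(buf, shape, (y - 1, x))
        + read_point(buf, shape, (y + 1, x))
        + read_point(buf, shape, (y, x - 1))
        + read_point(buf, shape, (y, x + 1))
    )
    return neighbors - 4.0 * read_point(buf, shape, (y, x))


# Illustrative data: f(y, x) = y*y + x*x sampled on a 4 x 5 mesh.
shape = (4, 5)
values = [float(y * y + x * x) for y in range(shape[0]) for x in range(shape[1])]
buf = struct.pack(f"<{len(values)}d", *values)

total = sum(values)                        # an aggregate over the whole mesh
stencil = laplacian_2d(buf, shape, 1, 2)   # a stencil at one interior point
```

Note that the neighbors in y sit a full row apart in the file, so even this small stencil touches non-contiguous byte ranges; this is the bookkeeping that scientists currently write by hand.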
I am working with the Computational Fluid Dynamics group here at UCSD.