Project Period
June 1, 2007-December 31, 2010
Level of Access
Open-Access Report
Grant Number
0702748
Submission Date
4-5-2011
Abstract
As the size of the data sets manipulated by data-intensive scientific applications approaches the petabyte level and beyond, scalable I/O techniques become both more important and more difficult to provide. Much of the research on this issue has been performed within the context of MPI-IO, the de facto standard parallel I/O interface for data-intensive applications. Its popularity stems from the fact that MPI-IO provides applications with a rich and flexible parallel I/O API coupled with highly efficient implementations of this API. The problem of scalable I/O is being further addressed by the development of powerful parallel I/O subsystems and state-of-the-art file systems that can efficiently access this infrastructure. However, even with such advances, I/O continues to be a significant bottleneck in application performance.
The goal of this research is to provide high-performance I/O for data-intensive applications. A key insight is that a major obstacle in the way of this goal is the legacy view of a file as a linear sequence of bytes. This is because scientific applications rarely access data in a way that matches this file model, using instead what is more accurately described as an object model. In fact, it is the runtime translation between these two data models that is a major contributor to poor I/O performance. To address this issue, this research will develop a more powerful object-based file model for MPI applications, and an object-based caching system to serve as an interface between MPI applications and object-based files. Objects will be carefully defined to encapsulate information about an application's I/O access patterns, and such information will be used to increase the parallelism of file accesses and decrease the cost of maintaining global cache coherence.
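The mismatch between the two data models can be made concrete with a small sketch. It is an illustration only, not the project's implementation: it assumes a row-major 2D array of fixed-size elements stored in a linear file, and the function name and parameters are hypothetical. It shows how one logical object (a rectangular sub-block) translates into several noncontiguous byte ranges, which is the kind of runtime translation the abstract identifies as costly.

```python
def block_extents(n_cols, elem_size, row0, col0, blk_rows, blk_cols):
    """Return the (byte_offset, byte_length) ranges that a rectangular
    sub-block occupies in a linear file holding a row-major 2D array.

    n_cols     -- number of columns in the full array (assumed layout)
    elem_size  -- size of one element in bytes
    row0, col0 -- top-left corner of the sub-block
    blk_rows, blk_cols -- shape of the sub-block
    """
    extents = []
    for r in range(row0, row0 + blk_rows):
        # Each row of the block is contiguous in the file, but
        # consecutive rows are separated by the rest of the array row.
        offset = (r * n_cols + col0) * elem_size
        extents.append((offset, blk_cols * elem_size))
    return extents


if __name__ == "__main__":
    # A 2x4 block of 8-byte elements inside an array with 8 columns.
    for offset, length in block_extents(8, 8, 2, 4, 2, 4):
        print(offset, length)
```

Under these assumptions, a block with k rows needs k separate file extents, so the number of noncontiguous accesses grows with the block height even though the application treats the block as a single object.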
Rights and Access Note
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. In addition, no permission is required from the rights-holder(s) for educational uses. For other uses, you need to obtain permission from the rights-holder(s).
Recommended Citation
Dickens, Phillip M., "Object-Based Caching for MPI-IO" (2011). University of Maine Office of Research Administration: Grant Reports. 301.
https://digitalcommons.library.umaine.edu/orsp_reports/301
Additional Participants
Graduate Students
Jeremy Logan
Joshua Murphy
Undergraduate Students
Julius Henderson
Tristan Deane
William Lamond