Storage Resource Manager

The Storage Resource Management (SRM) technology was initiated by the Scientific Data Management Group at LBNL and developed in response to the growing need to manage large datasets on a variety of storage systems. Dynamic storage management is essential to ensure
(i) prevention of data loss,
(ii) decrease of error rates of data replication, and
(iii) decrease of the analysis time by ensuring that analysis tasks have the storage space to run to completion.

There are already numerous examples where data from simulations running on leadership-class machines were lost because they were not moved in time to a mass storage system. Storage Resource Managers (SRMs) address such issues by coordinating storage allocation, streaming the data between sites, and enforcing secure interfaces to the storage systems (i.e., dealing with the special security requirements of each storage system at its home institution). For example, in a production environment, using SRMs has reduced error rates of large-scale replication from 1% to 0.02% in the STAR project. Furthermore, SRMs can prevent job failures. When jobs run on clusters, some of the local disks can fill up before a job finishes, resulting in lost productivity and therefore a delay in analysis. This occurs because space was not dynamically allocated and earlier files that were no longer needed were not removed. While there are tools for dynamically allocating compute and network resources, SRMs are the only tool available for providing dynamic space reservation, guaranteeing secure file availability with lifetime support, and performing automatic garbage collection that prevents clogging of storage systems.
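The idea of dynamic space reservation described above can be illustrated with a minimal sketch. This is not the SRM interface or any SRM implementation; the class ToySpaceManager, its methods, and the job names are hypothetical and exist only to show why reserving space before a job starts avoids a disk filling up mid-run.

```python
class ToySpaceManager:
    """Hypothetical model of one disk with a fixed capacity (arbitrary units)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.reserved = {}  # job id -> amount of space reserved

    def free(self) -> int:
        """Space not yet promised to any job."""
        return self.capacity - sum(self.reserved.values())

    def reserve(self, job_id: str, size: int) -> bool:
        """Grant a reservation only if the whole request fits right now."""
        if size < 0 or job_id in self.reserved or size > self.free():
            return False
        self.reserved[job_id] = size
        return True

    def release(self, job_id: str) -> None:
        """Give the space back once the job has finished."""
        self.reserved.pop(job_id, None)


disk = ToySpaceManager(capacity=100)
first = disk.reserve("job-a", 70)   # fits in the empty disk
second = disk.reserve("job-b", 50)  # refused up front: only 30 units are free
disk.release("job-a")
third = disk.reserve("job-b", 50)   # fits once job-a has released its space
```

In this sketch a job that cannot get its space is refused before it starts, rather than failing partway through when the disk fills.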

The SRM specification has evolved into an international de facto standard, and many projects have committed to using this technology, especially in the HEP and HENP communities. One example is the Worldwide Large Hadron Collider (LHC) Computing Grid (WLCG), which supports ATLAS and CMS. The SRM approach is to develop a uniform standard interface that allows multiple implementations by various institutions to interoperate. This removes the dependence on a single implementation and permits multiple groups to develop SRM systems for their specific storage resources. It has become crucial to the interoperation of storage systems for large-scale projects that have to manage and distribute massive amounts of data efficiently and securely. Without such a unifying technology, these projects cannot scale and are bound to fail. This problem will only grow over time as computing facilities move into the petascale regime.

Another important problem that SRMs address is storage clogging. Storage clogging is a critical problem for large-scale shared storage systems, since the removal of files after they are used is not automated. This increases the cost of storage and slows the analysis and discovery process. SRMs help unclog temporary storage systems by providing lifetime management of accessed files. This capability is crucial to the efficient use of storage under cost constraints.
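Lifetime management of accessed files can be sketched as follows. This is a simplified illustration under stated assumptions, not how any SRM implementation works: the class ToyFileCache, its pin and collect_garbage methods, the file names, and the use of plain numbers as timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFileCache:
    """Hypothetical in-memory model of a temporary store with file lifetimes."""

    # Maps file name -> expiry time in seconds on the caller's clock.
    expiry: dict = field(default_factory=dict)

    def pin(self, name: str, now: float, lifetime: float) -> None:
        """Record an access and keep the file until at least now + lifetime."""
        self.expiry[name] = max(self.expiry.get(name, 0.0), now + lifetime)

    def collect_garbage(self, now: float) -> list:
        """Remove every file whose lifetime has expired and return their names."""
        expired = sorted(n for n, t in self.expiry.items() if t <= now)
        for name in expired:
            del self.expiry[name]
        return expired


cache = ToyFileCache()
cache.pin("run1.dat", now=0.0, lifetime=60.0)
cache.pin("run2.dat", now=0.0, lifetime=600.0)
removed = cache.collect_garbage(now=120.0)  # run1.dat expired at t=60
remaining = sorted(cache.expiry)            # run2.dat is kept until t=600
```

Because each file carries an expiry time, cleanup no longer depends on users remembering to delete files after use.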

SRMs also serve as gateways to secure data access. By limiting external access to all storage systems through a standard SRM interface, one can ensure not only authenticated access, but also the enforcement of authorized access to files. The SRM technology was highly successful in SciDAC-1, and is currently used in production in several large collaborations. SRM implementations that interoperate have been developed at LBNL, FNAL and TJNAF, as well as at several sites in Europe. Furthermore, this technology increases the scientist’s productivity by eliminating the tedious and time-consuming tasks of managing storage, performing robust data movement, and dealing with security requirements at various storage sites.

In addition to leading the SRM standard development by coordinating with multiple institutions, the LBNL team has developed SRM systems for disk storage and mass storage systems, including HPSS. These SRMs have been used in several application domains, including multiple projects at the SDM center, Earth System Grid, the STAR experiment, and the Open Science Grid (OSG). As data sets continue to grow and become ever more complex, these projects depend on the continued development and support of the SRM implementations from LBNL. It is essential to capitalize on the SciDAC-1 successes by sustaining current projects that depend on the SRM technology, further improving and deploying SRMs in additional projects and application domains, and continuing to evolve the SRM standard. Specifically, based on past experience, we have identified important features that require further development and coordination. These include sophisticated aspects of resource monitoring that can be used for performance estimation, authorization enforcement, and accounting tracking and reporting for the purpose of enforcing quota usage in SRMs. Another aspect that needs further development is SRMs for multi-component storage systems. Such systems, made of a combination of multiple disk arrays, parallel file systems, and archival storage, are becoming more prevalent as the volume of data that needs to be managed grows exponentially with petascale computing.
