Data Storage & File Naming
NCSU's Office of Information Technology (OIT) Shared Services group has developed text that you can adapt for your specific project to address long-term storage issues related to data management planning.
For more information about data storage options at NCSU contact:
The following text can be adapted for your project to address NCSU-specific resources for long-term storage:
"Long term data storage is available from NC State's Office of Information Technology Shared Services group. Data is stored on a highly scalable, resilient (no single point of failure) storage system. Data is backed up at a data center ~15 miles from the data center where the storage system is located.
Access to data is provided using web servers, FTP servers, or the iRODS (Integrated Rule-Oriented Data System) data grid, as appropriate for the data types being accessed. Data access servers are virtual servers provisioned in NC State's Virtual Computing Lab (VCL) environment. The current storage system is built on IBM's SONAS (Scale Out Network Attached Storage). SONAS can scale data input/output throughput and storage capacity independently, and it can be upgraded or expanded without being shut down. All SONAS elements involved in data access are redundant with automatic failover. NC State's existing system has 360 TB of capacity and could be expanded to more than 14 PB using currently available disks. The SONAS storage system is located on NC State's campus in a secure data center with a battery-based uninterruptible power supply and a standby diesel generator.
Data stored on the SONAS system is backed up to a tape library located in a data center at MCNC in Research Triangle Park (approximately fifteen miles from NC State's campus). MCNC operates the North Carolina Research and Education Network (NCREN) and has an extensive fiber network across North Carolina, including a multi-pair fiber ring connecting NC State, MCNC, UNC-Chapel Hill, and Duke University in the Research Triangle region. Backup traffic uses a dedicated connection on a dense wavelength division multiplexed lambda between NC State and MCNC. Using NC State's VCL, various methods of access to the data can be provided, as appropriate. The current options for off-campus data access are web servers (either a centrally managed shared web server or a dedicated web server managed by the research group) providing access over HTTP, HTTPS, or FTP, and an iRODS server providing data grid access."
You need to make some fundamental decisions when you start your research, and how to organize your data should be one of them. The choices you make will vary with the type of research you do, but everyone must address the same issues.
File Name Example
File Renaming Applications
If you have many files already named and need to revise your naming system, you might consider using a file renaming application such as:
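Where a dedicated application is not at hand, a short script can do the same kind of bulk renaming. The following is a minimal Python sketch, not a recommended tool. The naming pattern (prefix, date stamp, three-digit sequence number, lowercased extension), the function names `plan_renames` and `apply_renames`, and the sample file names are all assumptions made for illustration and do not come from this guide.

```python
from pathlib import Path


def plan_renames(filenames, prefix, date_stamp):
    """Map each old name to a new name: <prefix>_<date_stamp>_<NNN><ext>.

    The pattern is a hypothetical convention chosen for illustration only.
    """
    plan = {}
    # Sort first so the sequence numbers are assigned in a predictable order.
    for index, old_name in enumerate(sorted(filenames), start=1):
        extension = Path(old_name).suffix.lower()
        plan[old_name] = f"{prefix}_{date_stamp}_{index:03d}{extension}"
    return plan


def apply_renames(directory, plan, dry_run=True):
    """Rename files in `directory` according to `plan`.

    With dry_run=True (the default) nothing is changed; the planned
    renames are only printed so they can be reviewed first.
    """
    directory = Path(directory)
    for old_name, new_name in plan.items():
        source = directory / old_name
        target = directory / new_name
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        if dry_run:
            print(f"{old_name} -> {new_name}")
        else:
            source.rename(target)


if __name__ == "__main__":
    # Hypothetical file names and project prefix, for demonstration only.
    example = plan_renames(["scan 2.TIF", "scan 1.TIF"], "soilsurvey", "20240115")
    apply_renames(".", example, dry_run=True)
```

Running the plan as a dry run first, and keeping a copy of the old-to-new mapping, makes it possible to check the result and to trace renamed files back to their original names.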
A favorite saying holds that the best part about standards is that there are plenty to choose from. This is true of file formats, so it is important to think carefully about which file format will be best for long-term preservation of, and continued access to, your data.
Formats most likely to be accessible in the future are:
Here are some examples of preferred formats:
If your research involves more than one person, tracking changes to your files is critical. As you think through how to manage this step, keep the following issues in mind.