Scientific Data Use Cases

Today's research increasingly relies on generating, collecting, organizing, and analyzing large amounts of data. Researchers often need to work with more data than personal or laboratory computers are equipped to handle, in some cases at the scales used by large corporations and government agencies. The Scientific Data use cases describe how researchers work with data, including analysis, data management, and data sharing.


Data Analysis:

These use cases describe how scientists analyze large amounts of scientific data. The data is typically large in volume (larger than one would use on a personal or business computer), but is organized and stored in different ways for different kinds of research. Data might be generated by a single source (a large simulation, for example) or it might come from many sources (observational results from many different instruments or different research teams). The methods of analysis also vary from field to field and problem to problem.

(7 use cases)
Use Case ID  Title
DA-01        Discover data analysis resources and documentation
DA-02        Prepare data for analysis
DA-03        Analyze data from research instruments
DA-04        Analyze data generated by a simulation
DA-05        Steer a large computation while it runs
DA-06        Time-critical data analysis
DA-07        Run an interactive data science application using a community resource for back-end computation


Data Management:

These use cases describe how researchers manage collections of data for shared use or for their own re-use over time. The use cases range from a single research project managing and organizing its own data, to several related projects using each other's data, to data being prepared for future use in applications that haven't been imagined yet.

(15 use cases)
Use Case ID  Title
DM-01        Create and share a data collection
DM-02        Coordinated computing with a shared data collection
DM-03        Automate data ingestion from a set of sensors or instruments
DM-04        Migrate data to a new resource
DM-05        Manually create metadata for a data object
DM-06        Run a researcher-supplied tool to generate metadata for data objects
DM-07        Automatically extract metadata from data objects
DM-08        Store metadata for later use
DM-09        Search metadata for specific objects of interest
DM-10        Add metadata search features to an application
DM-11        Post-allocation data access
DM-12        Large-scale data transfer
DM-13        Small-scale data transfer
DM-14        Scrape public websites to gather data
DM-15        Transfer data between a researcher's cloud storage and a community storage system



Visualization:

These use cases describe the most common scientific data visualization methods. This is an evolving field, since the ability of desktop computers to visualize data is improving rapidly. Still, researchers commonly need to visualize data at scales that exceed their local resources, which requires high-performance and high-throughput computing resources. Large-scale visualization resources must often be used remotely over a research network, with the results shown on a local display.

Advanced visualization resources are designed, constructed, and operated by a service provider (SP) organization, such as the Texas Advanced Computing Center (TACC) or the University of Utah. A visualization resource may be used in one or more public research computing communities, such as XSEDE or Open Science Grid (OSG).

(6 use cases)
Use Case ID  Title
VIS-01       Visualize research data using streaming video
VIS-02       Visualize research data using streaming geometry data
VIS-03       Generate visualization data for later viewing
VIS-04       Visualize and steer a simulation running on a remote resource
VIS-05       Visualize a simulation as it runs on a remote resource
VIS-06       Visualize research data using a web application