Today's research increasingly relies on generating, collecting, organizing, and analyzing large amounts of data. Researchers often need to work with more data than personal or laboratory computers are equipped to handle, in some cases at the same scales handled by large corporations and government agencies. Scientific Data use cases explain how researchers work with data, including analysis, data management, and data sharing.
Data Analysis:
These use cases describe how scientists analyze large amounts of scientific data. The data is typically large in volume (larger than one would use on a personal or business computer), but is organized and stored in different ways for different kinds of research. Data might be generated by a single source (a large simulation, for example) or it might come from many sources (observational results from many different instruments or different research teams). The methods of analysis also vary from field to field and problem to problem.
Use Case ID | Title | Use Case Description |
---|---|---|
DA-01 | Discover data analysis resources and documentation | |
DA-02 | Prepare data for analysis | |
DA-03 | Analyze data from research instruments | |
DA-04 | Analyze data generated by a simulation | |
DA-05 | Steer a large computation while it runs | |
DA-06 | Time-critical data analysis | |
DA-07 | Run an interactive data science application using a community resource for back-end computation | |
Data Management:
These use cases describe how researchers manage collections of data for shared use or for their own re-use over time. The use cases range from a single research project managing and organizing its own data, to several related projects using each other's data, to data being prepared for future use in applications that have not yet been imagined.
Use Case ID | Title | Use Case Description |
---|---|---|
DM-01 | Create and share a data collection | |
DM-02 | Coordinated computing with a shared data collection | |
DM-03 | Automate data ingestion from a set of sensors or instruments | |
DM-04 | Migrate data to a new resource | |
DM-05 | Manually create metadata for a data object | |
DM-06 | Run a researcher-supplied tool to generate metadata for data objects | |
DM-07 | Automatically extract metadata from data objects | |
DM-08 | Store metadata for later use | |
DM-09 | Search metadata for specific objects of interest | |
DM-10 | Add metadata search features to an application | |
DM-11 | Post-allocation data access | |
DM-12 | Large-scale data transfer | |
DM-13 | Small-scale data transfer | |
DM-14 | Scrape public websites to gather data | |
DM-15 | Transfer data between a researcher's cloud storage and a community storage system | |
Visualization:
These use cases describe the most common scientific data visualization methods. This is an evolving field, since the ability of desktop computers to visualize data is improving rapidly. Even so, researchers still commonly need to visualize data at scales that exceed their local resources, which requires high-performance and high-throughput computing resources. Large-scale visualization resources must often be used remotely over a research network, with the results shown on a local display.
Advanced visualization resources are designed, constructed, and operated by a service provider (SP) organization, such as the Texas Advanced Computing Center (TACC) or the University of Utah. A visualization resource may be used in one or more public research computing communities, such as XSEDE or Open Science Grid (OSG).
Use Case ID | Title | Use Case Description |
---|---|---|
VIS-01 | Visualize research data using streaming video | |
VIS-02 | Visualize research data using streaming geometry data | |
VIS-03 | Generate visualization data for later viewing | |
VIS-04 | Visualize and steer a simulation running on a remote resource | |
VIS-05 | Visualize a simulation as it runs on a remote resource | |
VIS-06 | Visualize research data using a web application | |