Environmental Data Cube Support System (EDCSS) Roles and Processes. 20 August 2013


Table of Contents

Overview
EDCSS Support Staff
    Systems Engineer
    Project Workflow Manager Admin
    Provider SME
EDCSS User Roles
    Customer
    User Group SME
Hardware Requirements
    Web Application Layer
    Infrastructure Layer
    Resource Layer
Executing Customer Support Projects
Appendix Terminology

Overview

The Environmental Data Cube Support System (EDCSS) provides all required capabilities to produce and distribute an integrated suite of environmental support products for the M&S community. The EDCSS architecture, depicted in Figure 1, is made up of four primary components as described below.

The EDC Project Workflow Manager is a project-oriented web application that manages all end-user requirements for projects containing specific support products. It is the interface for end-users to specify product and project requirements and submit projects for execution. This application manages a project's workflow and requirements but depends on one or more EDCSS Provider Sites for actual product generation.

An EDCSS Provider Site is a web application and backing framework responsible for resource management and product generation. An EDCSS Provider Site receives and responds to resource and product requests from the EDCSS Project Workflow Manager. All resources and product generation capabilities reside at EDCSS Provider Sites, which may be remote or co-located with the EDC Project Workflow Manager. The result of a completed Project submitted to the EDC Project Workflow Manager is a single Event Support Package containing all required products in ready-to-play formats, and a hosted project review site with a web interface to browse all completed products.

The EDC Distributor provides access to completed Event Support Packages generated by the EDC Project Workflow Manager. It is a separate component, which allows it to be deployed on one or more network domains during the execution of an event. All products are hosted on a searchable Event Support web site and are available via web services for machine-to-machine distribution.

The EDC Runtime is a software library that application developers can optionally use to streamline the integration of EDCSS-generated data or effects products. For example, the EDC Runtime is currently employed by JBUS to facilitate EDC integration with the Navy Continuous Training Environment HLA federation and is utilized by OneSAF for environmental data support. Other integration efforts are ongoing.
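The relationship between the Project Workflow Manager and its Provider Sites can be illustrated with a minimal sketch. It is not the EDCSS implementation: the class names, method names, and product names below are hypothetical, and the only behavior taken from the text is that the Workflow Manager holds the project's requirements while the Provider Sites perform the actual product generation.

```python
from typing import Dict, List, Set


class ProviderSite:
    """Hypothetical stand-in for an EDCSS Provider Site."""

    def __init__(self, name: str, offered: Set[str]):
        self.name = name
        self.offered = offered  # products this site is able to generate

    def generate(self, product: str) -> str:
        if product not in self.offered:
            raise ValueError(f"{self.name} does not offer {product!r}")
        return f"{product} (built by {self.name})"


class ProjectWorkflowManager:
    """Hypothetical stand-in: tracks requirements, delegates generation."""

    def __init__(self, sites: List[ProviderSite]):
        self.sites = sites

    def build_package(self, required_products: List[str]) -> Dict[str, str]:
        """Collect every required product into one package (a plain dict here)."""
        package: Dict[str, str] = {}
        for product in required_products:
            # Pick the first site, local or remote, that offers the product.
            site = next((s for s in self.sites if product in s.offered), None)
            if site is None:
                raise LookupError(f"no Provider Site offers {product!r}")
            package[product] = site.generate(product)
        return package


# Illustrative site and product names only.
local_site = ProviderSite("local-site", {"wind imagery"})
remote_site = ProviderSite("remote-site", {"sensor effects"})
manager = ProjectWorkflowManager([local_site, remote_site])
package = manager.build_package(["wind imagery", "sensor effects"])
```

The point of the sketch is the division of labor: the manager never generates a product itself, so adding a remote site changes only the list of sites it consults.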

Figure 1: Environmental Data Cube Support System Architecture

EDCSS Support Staff

The following positions are identified based on the current system construct and projected uses of EDCSS. The operation and maintenance of the EDCSS falls into three categories of staff:

Systems Engineer

This person is responsible for the hardware, operating system, and network environment of the system, as well as installation and assistance with troubleshooting of the EDCSS software. Individuals in this role should be fluent in Unix-based operating systems, Intel/AMD-based hardware architecture, web application deployment, and network hardware, protocols, and security issues.

Project Workflow Manager Admin

The Project Workflow Manager Administrator is the lead technical person responsible for the overall operation of the EDCSS Project Workflow Manager. This involves assisting User Group SMEs in the definition of Products, Product Groups, and Resource Networks; ensuring connectivity to Provider Sites and assisting Provider SMEs; and troubleshooting any issues that arise during Project execution. It is assumed that the Project Workflow Manager Admin is fully trained on all EDCSS technology components and is well versed in all of its product generation capabilities. This role is not to be filled by a traditional system administrator focused merely on hardware and application uptime; rather, it is filled by an individual with thorough knowledge and understanding of the EDCSS application and its underlying processes. In addition, this person should be comfortable with database environments and web application deployments.

Provider SME

The Provider SME operates the EDCSS Provider Site and provides domain expertise for all products offered from that site. This includes the underlying data and modeling capabilities required to build the final products. Provider SMEs should have science backgrounds and be technically capable of fully configuring new EDC Products within their local Provider Site. Provider SMEs will be required to interact with User Group SMEs to discuss product requirements and/or inquiries about the capabilities and limitations of their local resources. Individuals in this role should be comfortable working in a Unix environment and with custom Java, C, and Fortran software (as opposed to packaged commercial applications). While extensive software development experience is not a requirement, they should be comfortable with those concepts. Ideally this role is filled by one or more individuals with a mix of scientific computing and technical expertise.

The estimated labor to maintain and operate the initial EDCSS system is:

• Systems Engineer: 0.25 FTE
• Project Workflow Manager Admin: 0.5 FTE
• Provider SME: 1.0 FTE
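As a quick check on the staffing estimate, the three figures can be totaled and converted to annual hours. The FTE values come from the list above; the 2,080 hours per FTE-year (40 hours a week for 52 weeks) is an assumption made here for illustration and is not stated in the document.

```python
# Estimated staffing from the document, in full-time equivalents (FTE).
staffing_fte = {
    "Systems Engineer": 0.25,
    "Project Workflow Manager Admin": 0.5,
    "Provider SME": 1.0,
}

# Assumption (not from the document): 40 hours/week x 52 weeks per FTE-year.
HOURS_PER_FTE_YEAR = 2080

total_fte = sum(staffing_fte.values())
annual_hours = {role: fte * HOURS_PER_FTE_YEAR for role, fte in staffing_fte.items()}

print(f"Total staffing: {total_fte} FTE")
for role, hours in annual_hours.items():
    print(f"{role}: {hours:.0f} hours/year")
```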

EDCSS User Roles

Customer

The customer is the end-user of the resulting EDCSS Support Package and the originator of the requirements. Although it is possible that customers will never directly use the EDCSS Project Workflow Manager interface or submit a new Project request on their own, the system is designed such that they may fully define and submit a project with minimal outside assistance. Customers should therefore have a good understanding of the EDCSS Project Workflow Manager and the process of defining projects, executing searches, reviewing search results, and selecting suitable historical dates to support their project. It is quite likely they will have no understanding of EDCSS Provider Sites.

User Group SME

The EDCSS Project Workflow Manager has a concept of User Groups to organize end-user Products, Components, and Resource Networks. The User Group SME is responsible for maintaining all of these definitions on behalf of their User Group. They are not domain experts in environmental modeling and do not need to understand the full mechanics of product generation; rather, they are expected to fully understand the simulation and product requirements of the User Group they represent. This individual will be the primary customer liaison for the EDCSS and must be able to interact at a meaningful level with the technical representatives from the one or more simulation programs associated with an event that EDCSS is supporting, oftentimes helping to craft environmental requirements on their behalf. No specific technical skills are required, but the individual should be fully familiar with EDCSS concepts and functionality, comfortable working with office productivity software and web navigation, and comfortable performing basic data analysis given a stable set of tools intended for that purpose.

Hardware Requirements

The following hardware requirements are based on the current system construct and projected uses of EDCSS. The EDCSS is designed as a distributed system of supporting capabilities to produce an integrated suite of environmental support products for the M&S community. Built into its architecture is the assumption that environmental data resources may be hosted by remote sites and that some or all of the EDCSS product offerings may rely on product generation services at remote EDC Provider Sites. It is also quite possible that all elements of an EDCSS system are local. Specifying a complete system is therefore complicated by this flexibility and by uncertainty about how much capability is hosted locally.

The EDCSS is a three-tier system architecture defined by a Web Application Layer, an Infrastructure Layer, and a Resource Layer. The Web Application Layer hosts the customer applications in a web-accessible environment. The Infrastructure Layer contains most of the EDCSS technology itself and is responsible for all data processing and product generation services. The Resource Layer provides efficient access to data resources on disk, as well as interfaces to modeling capabilities to generate new reference data sets. Only the Web Application Layer is accessible by outside end-users (customers) of the system.

Web Application Layer

The Web Application Layer for EDCSS hosts the Project Workflow Manager and/or Provider Site top-level web applications. These applications are fairly small but must be hosted on servers with sufficient memory and bandwidth to manage multiple user sessions. If the Project Workflow Manager and Provider Sites are co-located, they may be hosted on the same machine. Any remote Provider Sites require separate hardware. Web servers such as the following are adequate to host the Web Application Layer:

• Intel or AMD 2.0+ GHz processor
• 2-4 GB RAM
• 500 GB hard drive

It is recommended that two web server systems be available to allow for deployment of multiple versions and/or failover support during events.

Infrastructure Layer

The Infrastructure Layer hosts the Provider components, including the COSINE framework. COSINE includes a Catalog Server for resource registration, the Order Server for primary logic control of all data requests being processed, and multiple execution and data servers to provide sufficient scaling for multi-user support. This layer also supports the EDCSS Product Generation Services, which range from simple graphics engines to physics-based Tactical Decision Aids such as TAWS that are executed thousands of times in support of Hypercube generation. Mid-range servers with specifications such as the following are adequate to host the Infrastructure Layer:

• Intel or AMD 3.0+ GHz processor
• 8 GB RAM
• 1 TB hard drive

It is recommended that two Provider Sites be deployed so that load and resource responsibility are shared. An additional two (2) mid-range Linux servers are recommended for failover support.

Resource Layer

The data resources for the operations of EDCSS are currently on the order of 35 TB. This archive system includes seven database servers plus a 25 TB scalable storage array. Each database server is a mid-range workstation with specifications similar to the following:

• Intel or AMD 3.0+ GHz processor
• 4 to 8 GB RAM
• 1-3 TB hard drive
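The Resource Layer figures can be cross-checked with simple arithmetic. The sketch below assumes that whatever part of the roughly 35 TB archive is not held on the 25 TB storage array sits on the local drives of the seven database servers, and it reads the per-server drive specification as 1 to 3 TB. Both points are assumptions; the document does not say how the archive is distributed.

```python
# Figures quoted in the Resource Layer description.
TOTAL_ARCHIVE_TB = 35     # "on the order of 35 TB"
STORAGE_ARRAY_TB = 25     # scalable storage array
DATABASE_SERVERS = 7
DRIVE_RANGE_TB = (1, 3)   # per-server hard drive, read here as 1 to 3 TB

# Assumption: data not on the array is spread evenly over the database servers.
local_tb = TOTAL_ARCHIVE_TB - STORAGE_ARRAY_TB
per_server_tb = local_tb / DATABASE_SERVERS
fits = DRIVE_RANGE_TB[0] <= per_server_tb <= DRIVE_RANGE_TB[1]

print(f"{local_tb} TB outside the array, about {per_server_tb:.2f} TB per server")
print(f"Within the stated drive range: {fits}")
```

Under these assumptions the per-server load falls inside the stated drive range, so the quoted totals are at least mutually consistent.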

Executing Customer Support Projects

The first task is to define the project at the EDC Project Workflow Manager. The first step is to specify the project's area of interest (AOI), required products, and execution dates. Next, a suitable resource network must be selected. This resource network will include a searchable archive and a resource for one or more environmental domains. These latter resources may be on demand, meaning they require execution of a model such as WRF or COAMPS. In summary, the following tasks must be completed:

********** These tasks can be completed by end-users or the User Group SME ***********

• Define Project at the EDC Project Workflow Manager:
    o Define the AOI
    o Select required Products
    o Define Project dates
    o Select Resource Network

Next, the scenario requirements are defined. These include one or more operational/system impacts and/or environmental parameter search criteria, the area and time range over which to search, and the degree of impact (favorable, marginal, unfavorable) or parameter limits (e.g., Wind Speed < 20 kts) in daily or 3-hourly increments. A scenario search is then executed. In summary, the following tasks must be completed:

• Define Scenario requirements:
    o Define impacted AOI
    o Select one or more operational/systems impacts
    o Indicate search criteria in daily/3-hourly increments
• Execute search

EDCSS will return search results and illustrate the requested and actual impacts for multiple sets of historical dates. From these, the best candidate must be selected. Then, optionally, Preview Products may be ordered to provide a preview of the selected scenario. Upon delivery and review of Preview Products, the historical dates may be accepted or rejected. If they are rejected, another candidate must be evaluated, and this process repeats until a suitable historical scenario is identified. In summary, the following tasks must be completed:

• Review Search Results
• Select best match from returned candidates
• Generate Preview Products (optional)
    o Review Preview Products
• Confirm candidate selection or select new

If the project is to be built from an existing resource, it may now be submitted to build. More often, archives of historical scenarios are not available at the necessary spatial and/or temporal resolution, so a model run must be executed to produce a new resource. In summary, the following tasks must be completed:

• If a production resource already exists, proceed to Build project to begin product generation
• If the production resource involves on-demand modeling (e.g., WRF execution), proceed to Build project to initiate model run

***************** Below tasks are to be completed by Provider SME *****************

Model run submissions are automated for WRF but require oversight by the Provider SME, who must monitor the model execution and address any issues that arise. For example, in some instances the model may become unstable and fail, so the model configuration may need to be adjusted and the job resubmitted. Model execution may take one to five days or more, depending on the size of the request. In summary, the following tasks must be completed:

• Monitor model execution
• Await model completion (time dependent on project definition, typically 1-5 days)
• Verify successful model execution

Upon successful completion of the model run, the output data is automatically configured and made available to support the requesting project. This automated process includes the generation of metadata that reflects the content and coverage of the model output, plus standard virtual metadata that defines content to be derived by the COSINE framework (e.g., derive cloud properties from profiles of temperature and relative humidity). In addition, the resource is configured into the COSINE servers and enabled at the EDC Provider Site. Once the model has completed and the resulting resource has been auto-configured, the project resumes at the EDC Project Workflow Manager and the project's status becomes "Building Products".

It is difficult to estimate the time to complete production because the size and complexity of a project and its products can vary greatly. Small projects with a few products may complete in less than an hour, while projects spanning large temporal and geographic extents may require 4-5 days to complete.
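The candidate selection step described above, comparing requested impacts with the actual impacts for each set of historical dates, can be sketched as a simple ranking. This is an illustration only: the document does not say how EDCSS scores or orders its search results, so the match fraction below is an assumed metric, and the class name, function names, and dates are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Degrees of impact named in the scenario requirements.
IMPACT_LEVELS = ("favorable", "marginal", "unfavorable")


@dataclass
class Candidate:
    """One set of historical dates with the impact found in each time increment."""
    start_date: str
    actual_impacts: List[str]


def match_score(requested: List[str], actual: List[str]) -> float:
    """Fraction of increments where the actual impact equals the requested one."""
    if len(requested) != len(actual):
        raise ValueError("requested and actual impacts must cover the same increments")
    if not requested:
        return 0.0
    hits = sum(1 for r, a in zip(requested, actual) if r == a)
    return hits / len(requested)


def rank_candidates(requested: List[str], candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from best to worst match (assumed metric, see lead-in)."""
    return sorted(
        candidates,
        key=lambda c: match_score(requested, c.actual_impacts),
        reverse=True,
    )


# Toy data: four daily increments and three made-up historical start dates.
requested = ["favorable", "favorable", "marginal", "unfavorable"]
candidates = [
    Candidate("date-A", ["favorable", "marginal", "marginal", "favorable"]),
    Candidate("date-B", ["favorable", "favorable", "marginal", "marginal"]),
    Candidate("date-C", ["unfavorable", "marginal", "favorable", "favorable"]),
]
ranked = rank_candidates(requested, candidates)
```

In this toy case `date-B` matches three of the four requested increments and ranks first. If its Preview Products were rejected, the next candidate in the ranking would be evaluated, mirroring the repeat-until-suitable loop in the text.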

When a project completes, its status in the Project Workflow Manager becomes "Complete", and the project page includes a button linking to the Review site, which hosts the completed products. An email is sent to the requester and the Project Workflow Manager Admin to notify them of completion.

Next, validation must be performed. It must be confirmed that all products built successfully and that the products are consistent with one another. If end-user tools are available, products are checked for compatibility. In summary, the following tasks must be completed:

• Validate all products built
• Validate consistency between all products and scenario
• Validate products in end-user tools if available

Finally, the project is prepared for delivery. This begins with reference to the hosted Review site, from which the customer may review the project. Assuming acceptance by the customer, the project may be delivered as an EDC Distributor pack, ready to be hosted on an EDC Distributor, or exported to a static HTML site. In the latter case, any custom content (e.g., maps or briefs) may optionally be edited into the HTML. The delivery is then burned to DVD or made accessible for download. In summary, the following tasks must be completed:

• Provide link to hosted Project
• Export Project Distributor pack or static HTML
• Burn delivery to DVD or provide as download

After the project event occurs, the project is reviewed and documented. This includes a brief report summarizing the actions taken, the satisfaction level of the customer, and lessons learned. Finally, the project is archived for potential reuse as appropriate. In summary, the following tasks must be completed:

• Document actions taken and satisfaction level of customer
• Document lessons learned
• Archive project for reuse as appropriate
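The end-to-end process described in this section can be summarized as a small state machine. Only the status names "Building Products" and "Complete" appear in the document; every other state name below is a hypothetical label for a step the text describes, and the transition table is a reading of the narrative, not the Project Workflow Manager's actual logic.

```python
from enum import Enum


class Status(Enum):
    # Only "Building Products" and "Complete" are status names from the document;
    # all other names are hypothetical labels for described steps.
    DEFINED = "Defined"
    SEARCHED = "Scenario Searched"
    CANDIDATE_SELECTED = "Candidate Selected"
    MODEL_RUNNING = "Model Running"
    BUILDING_PRODUCTS = "Building Products"
    COMPLETE = "Complete"
    VALIDATED = "Validated"
    DELIVERED = "Delivered"
    ARCHIVED = "Archived"


TRANSITIONS = {
    Status.DEFINED: {Status.SEARCHED},
    Status.SEARCHED: {Status.CANDIDATE_SELECTED},
    Status.CANDIDATE_SELECTED: {
        Status.SEARCHED,           # candidate rejected, evaluate another
        Status.MODEL_RUNNING,      # on-demand resource, model run required
        Status.BUILDING_PRODUCTS,  # existing resource, build directly
    },
    Status.MODEL_RUNNING: {
        Status.MODEL_RUNNING,      # failed run reconfigured and resubmitted
        Status.BUILDING_PRODUCTS,
    },
    Status.BUILDING_PRODUCTS: {Status.COMPLETE},
    Status.COMPLETE: {Status.VALIDATED},
    Status.VALIDATED: {Status.DELIVERED},
    Status.DELIVERED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the target status if the move is allowed, otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target


def run_path(path):
    """Walk a list of statuses in order and return the final one."""
    status = path[0]
    for target in path[1:]:
        status = advance(status, target)
    return status


# A project that needs an on-demand model run before products are built.
on_demand_path = [
    Status.DEFINED, Status.SEARCHED, Status.CANDIDATE_SELECTED,
    Status.MODEL_RUNNING, Status.BUILDING_PRODUCTS, Status.COMPLETE,
    Status.VALIDATED, Status.DELIVERED, Status.ARCHIVED,
]
final_status = run_path(on_demand_path)
```

A project built from an existing resource follows the same path without the `MODEL_RUNNING` step. Skipping validation, for example moving straight from `COMPLETE` to `DELIVERED`, raises an error in this sketch.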

Figure 2: Workflow of the customer support process.

Appendix Terminology

COSINE (Common Open Services for Integrated Natural Environment): the framework implemented by EDC Providers to maintain metadata catalogs, coordinate and fulfill resource orders, and derive additional data content.

EDC Project Workflow Manager: the main EDC web application, which end-users access to define and submit Projects and define requirements for Products and Product Groups.

EDC Provider Site: an EDC web application and backing framework responsible for resource management and product generation. An EDC Provider Site receives and responds to resource and product requests from the EDC Project Workflow Manager.

EDC Distributor: an EDC web application for hosting completed projects and providing products via web services and a browsable web site.

Environmental Domain: one of the domains of the natural environment: atmosphere, ocean, terrain, or space.

HLA (High Level Architecture): a general-purpose architecture for distributed computer simulation systems, allowing them to communicate data and synchronize actions.

Hypercube: a multi-dimensional look-up table, often composed of pre-computed effects data as a function of tactical and weather parameters. For example, for a given weather scenario, an IR sensor system might be characterized by its probability of detection as a function of viewing angle, sensor altitude, target type, location, and time of day.

JBUS (Joint BUS): an application which provides a plug-in framework to translate data formats and publish to one or more standard network communication protocols such as HLA, DIS, and TENA.

Product: environmental information packaged as data, text, imagery, or pre-computed environmental effects.

Product Group: a collection of Products that are targeted to an exercise participant.

Project: a representation of a desired exercise, composed of a set of Product Groups, which in turn contain Products, for a given area of interest and temporal extent.

Operation/System: real-world objects that participate in some exercise, such as a sortie. A System is also a collection of real-world objects but is used primarily in a support capacity, for example, a collection of sensor arrays.

Resource: a data source composed of archive or on-demand model data for a particular area, time, and environmental domain, e.g., WRF atmospheric data for southeast US for September.

Resource Network: a logical grouping of resources that represent the natural environment across multiple environmental domains for a particular area and time, e.g., WRF atmospheric data, Wavewatch 3 ocean surface data, and NCOM ocean volume data for southeast US for September. Every Project is supported by one Resource Network.

SME: a subject matter expert.

User Group: a collection of users that may be associated with common Projects, Product Groups, and Resources.
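The containment relationships defined in this appendix (a Project holds Product Groups, which hold Products, and every Project is supported by exactly one Resource Network made up of Resources) can be sketched as a small data model. The class and field names are hypothetical and chosen to mirror the terminology; they are not taken from the EDCSS software. The project, product group, and product names in the example are made up.

```python
from dataclasses import dataclass, field
from typing import List, Set

# The four environmental domains listed in the terminology.
ENVIRONMENTAL_DOMAINS = {"atmosphere", "ocean", "terrain", "space"}


@dataclass
class Resource:
    name: str
    domain: str
    on_demand: bool = False  # True when a model run is needed to produce the data

    def __post_init__(self):
        if self.domain not in ENVIRONMENTAL_DOMAINS:
            raise ValueError(f"unknown environmental domain: {self.domain!r}")


@dataclass
class ResourceNetwork:
    name: str
    resources: List[Resource] = field(default_factory=list)

    def domains(self) -> Set[str]:
        """Environmental domains covered by this network."""
        return {r.domain for r in self.resources}


@dataclass
class Product:
    name: str


@dataclass
class ProductGroup:
    name: str
    products: List[Product] = field(default_factory=list)


@dataclass
class Project:
    name: str
    area_of_interest: str
    resource_network: ResourceNetwork  # exactly one per Project
    product_groups: List[ProductGroup] = field(default_factory=list)

    def all_products(self) -> List[str]:
        """Names of every Product across all Product Groups, in order."""
        return [p.name for g in self.product_groups for p in g.products]


# The resource names follow the appendix example; everything else is illustrative.
network = ResourceNetwork("southeast US", [
    Resource("WRF atmospheric data", "atmosphere", on_demand=True),
    Resource("Wavewatch 3 ocean surface data", "ocean"),
    Resource("NCOM ocean volume data", "ocean"),
])
project = Project(
    name="example exercise",
    area_of_interest="southeast US",
    resource_network=network,
    product_groups=[
        ProductGroup("air participants", [Product("wind imagery")]),
        ProductGroup("maritime participants", [Product("sea state data")]),
    ],
)
```

Making `resource_network` a required single field, instead of a list, is one way to encode the rule that every Project is supported by one Resource Network.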