Phase I Submission Name of Program: International Space Station Program


Phase I Submission
Name of Program: International Space Station Program
Name of Program Leader: John Shannon, Boeing ISS Program Manager
Phone Number: john.p.shannon@boeing.com
Postage Address: 3700 Bay Area Blvd, MC HB4-10, Houston, TX,
Name of Customer Representative: Ralph A. Grau
Phone Number: raphael.a.grau@nasa.gov
Category: System Sustainment
2015 AVIATION WEEK PROGRAM EXCELLENCE INITIATIVE

Bio for program leader:

John Shannon, as International Space Station (ISS) program manager, leads the Boeing team in its key integration role for NASA's ISS Program. His responsibilities include overall integration and operations of the ISS orbiting laboratory. Shannon is also responsible for providing vehicle sustaining engineering, expanding scientific and exploration utilization, and implementing vehicle enhancements.

Prior to joining Boeing, Shannon served as NASA's Deputy Associate Administrator for Exploration Planning in the Human Exploration and Operations Mission Directorate. In this role, Shannon was responsible for working with NASA and the international partners to define future human exploration activities for NASA headquarters. His efforts resulted in the development of a comprehensive plan for a cislunar outpost to be used for the development of technology, science, operational methods, and crew protection for deep space missions.

Previously, Shannon served as Program Manager of NASA's Space Shuttle Program, providing executive direction and policy for all aspects of space shuttle processing and development, including prelaunch and flight operations. Shannon managed the final fourteen missions of the program, including the last Hubble Space Telescope servicing mission and completion of the U.S. on-orbit segment of the ISS. Prior to this role, Shannon served as the deputy manager and the manager of Flight Operations and Integration for the Space Shuttle Program. Shannon began his career on the Space Shuttle Program as a flight operations engineer. In 1991, he was named the head of the space shuttle guidance, navigation, and flight control section. In 1993, Shannon was selected as flight director for the Space Shuttle Program, the youngest person ever to hold that position. In 2003, he was appointed as the deputy manager of the Columbia Task Force, interfacing daily with the investigative team of the Columbia Accident Investigation Board.

Throughout his career, Shannon has received numerous accolades for his technical and managerial leadership, including the NASA Exceptional Service Medal, the NASA Outstanding Leadership Medal, the NASA Distinguished Service Medal, the NASA Exceptional Achievement Medal, the Presidential Rank Award for Distinguished Executive, and the Astronautics Engineer of the Year award from the National Space Club. Shannon's academic achievements include a Bachelor of Science degree in aerospace engineering from Texas A&M University and completion of Harvard Business School's Program for Management Development.

Phase I Program Narrative

BACKGROUND

The International Space Station (ISS), a one-of-a-kind research laboratory continuously orbiting the Earth in the harsh environment of space, is the largest and most complex international scientific and engineering space project in history. The ISS was completed in 2011 and is larger than an American football field, including the end zones. The ISS serves as a test bed for building and maintaining large structures in space, for conducting science and technology research leading to discoveries that will benefit life on Earth, and as a proving ground for developments in future human space exploration.

Today, five space agencies operate the ISS, which shatters the mold as an engineering, scientific, management, and diplomatic achievement. It has been staffed continuously since November 2, 2000 in support of assembly and science research activities that include several hundred experiments, on topics ranging from human physiology to physical science, with a core emphasis on science performed in the unique zero-gravity environment to improve life on Earth. Because on-orbit laboratory resources and crew time on the ISS are limited, any crew time or program budget allocated to maintenance leaves less time or money for scientific endeavors. Leadership in executing solutions to multi-faceted logistical challenges is critical to ISS success.

At the heart of the ISS is the Command and Data Handling (C&DH) Subsystem, the network of computers and interconnecting buses that enables control and communication across the ISS. It consists of more than 50 U.S. and International Partner flight computers and more than 300 firmware controllers, interfacing across more than 100 dual-redundant MIL-STD-1553 buses. Flowing across these networks are more than 350,000 unique signals, packaged in hundreds of thousands of data groups, which are acted on by the various processors and ultimately passed to the ground.
The software resident on these processors was developed in countries around the globe (Russia, Germany, France, Canada, Italy, Japan, and the U.S.) by the various partner space agencies and their contractors. In all, there are more than 10 million source lines of code resident on these computers (not counting common Microsoft operating systems and applications running on the crew laptops).

Adding to ISS complexity is the fact that the network and array of flight processors grew incrementally as the assembly sequence unfolded. The topology changed, and will continue to change, as elements are added or relocated, requiring reconfiguration of software and databases. The effort's magnitude and changing orbital configuration demanded an underlying architecture and design that would allow the processors predictable access to the interfacing networks, without variability in timing. An ingenious synchronous interface schema was devised, executed, and coordinated among all developers to enable such communication. This critical architectural underpinning allowed development and integration to proceed over the years without further concern for interface stability or correctness.

With most of the initial development complete, focus has shifted toward productivity enhancements, usability, and operational support. Processes have evolved with program maturation, and the flight software team has positioned itself as a world-class organization. This team has been assessed as Capability Maturity Model (CMM) Maturity Level 3 capable, continues to drive productivity higher than predictions, and maintains defect rates significantly below industry standards. This emphasis on process improvement occurred at a time when the software sustaining effort was being consolidated in Houston, with ongoing product deliveries still being made to support on-orbit operations. While execution has been nearly flawless, the occasional on-orbit anomaly has pressed the expert team into service to understand abnormal operational behaviors and to seek their root cause.

TEACHABLE LESSONS IN SUSTAINING SOFTWARE

An example of these abnormal operational behaviors was the December 2013 failure of the Thermal System Pump Module package that controls the ISS heat rejection system and the flow of ammonia through the heat exchangers within the ISS, then out to the external radiators. Heat rejection from this system is critical to ISS operations. This failure put the ISS in a zero-fault-tolerant condition for the thermal systems, leaving the ISS only one failure away from losing heat rejection, without which the ISS subsystem equipment could not remain powered on. This equipment includes the ISS's computer systems, Environmental Control and Life Support System (ECLSS), Communications Tracking and Control (C&T), Guidance Navigation and Control (GN&C) equipment, and payload equipment and science experimentation.

The main pump module control valve failure prevented proper flow of ammonia through the pump. The Flight Software Team, in cooperation with operations personnel, determined that a secondary valve could be used to control ammonia flow; however, the controls available to the operations team only allowed them to turn the valve completely on or completely off. A way to control the valve incrementally was necessary. For that to happen, the flight software team would have to design and build a software patch to provide that control. The team worked around the clock and, within just 48 hours, built, tested, and readied the patch for uplink to the vehicle.
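The difference between the two styles of commanding can be illustrated with a short Python sketch. It is not the ISS flight software or the actual patch: the `Valve` model, the function names, and the step size are all assumptions made for illustration. It shows only the general idea of reaching an intermediate valve position through a series of small, bounded moves rather than a single full open or full close command.

```python
from dataclasses import dataclass


@dataclass
class Valve:
    """Hypothetical valve model: position runs from 0.0 (closed) to 1.0 (open)."""
    position: float = 0.0

    def drive(self, direction: int, fraction: float) -> None:
        """Move by `fraction` of full travel; direction is +1 (open) or -1 (close)."""
        moved = self.position + direction * fraction
        # Clamp at the end stops so the model never leaves its physical range.
        self.position = min(1.0, max(0.0, moved))


def command_full(valve: Valve, open_valve: bool) -> None:
    """All-or-nothing control: drive the valve to one end stop or the other."""
    valve.drive(1 if open_valve else -1, 1.0)


def command_incremental(valve: Valve, target: float, step: float = 0.05) -> int:
    """Approach `target` in bounded steps and return the number of steps issued."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0.0 and 1.0")
    if step <= 0.0:
        raise ValueError("step must be positive")
    steps = 0
    while abs(target - valve.position) > 1e-9:
        error = target - valve.position
        direction = 1 if error > 0 else -1
        # Never move farther than the remaining error, so there is no overshoot.
        valve.drive(direction, min(step, abs(error)))
        steps += 1
    return steps
```

For example, `command_incremental(Valve(), 0.5, step=0.25)` issues two moves and leaves the hypothetical valve half open, a state the full open/close command cannot reach.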
The patch was loaded on-orbit and used until the failed pump module could be replaced during an unplanned Extravehicular Activity (EVA) by the ISS crew.

The ability to deploy quick turnaround patches is especially important for the ISS because remote mission- and safety-critical operations are required to run every day of the year. To be able to react rapidly to anomalies, Boeing and NASA created the Quick Turnaround Process, which features the identification of a Boeing engineer as the Expediter. The Expediter is responsible for timely execution of the time-critical patch life cycle process, coordinates the participation of all the affected teams, and provides management with up-to-date status of progress through the process. This focused effort from all teams significantly reduces the cycle time: a process that usually takes weeks can now be completed within hours.

Other anomalies often require the software team to adapt and undertake complex data analysis, assessing telemetry dumped from on-board computer memory to diagnose the cause of a system failure. In these cases, the results of Built In Test (BIT) and error buffers are examined to isolate where the system failed and the root cause. The computer systems have sophisticated traceback functions which allow the team to isolate software problems to the line of code executing when the failure occurred. These anomalies can happen at any time, so call lists are carefully maintained and employed by the Mission Evaluation Room (MER) engineers to activate support. The expectation is for the team to have all necessary data collected and analyzed within one 8-hour MER shift, to allow the ISS to return to nominal operations. Adding to the challenge of the investigations, the ISS computer systems are susceptible to radiation effects caused by high-speed charged particles striking computer memory and processing units. These Single Event Upsets (SEUs) add another layer of complexity to the analysis, since there is little evidence of their impacts other than a computer failure itself.

The essential work of the flight software team is enabled by the services and facilities of the ISS Software Development and Integration Laboratory (SDIL), located in Houston alongside the gigantic neutral buoyancy pool where astronauts train for their missions. The laboratory provides the capability to test the integrated station hardware and software in an environment as close as possible to the actual one on-orbit. To the maximum extent possible, the laboratory uses flight-equivalent computers, power, and wiring connections to ensure the system timing and interface conditions are realistic. This precision is a necessity, as the software must work correctly from the very first moment it becomes operational on-orbit.

The SDIL complex consists of 11 dedicated labs which occupy approximately 30,000 square feet of facility floor space. These labs embody the entirety of the on-orbit C&DH capability, including all necessary simulators and emulators from each of the international and domestic partners, to replicate on-orbit configuration and activity. Organizational processes supported by these labs include flight software qualification and verification, hardware/software integration, anomaly investigation, mission configuration, and real-time flight following of missions that support program schedules and mission objectives. New software configurations are first qualified in the developer laboratory, then submitted into the SDIL for integration and verification. This carefully orchestrated organizational process not only ensures that the qualified components work correctly with each other, but also demonstrates that crew and ground operational procedures are compatible with new software loads, and that any required changes are correctly made.
The operations community typically supports one major configuration upload per year, and minor uploads as needed to respond to ISS anomalies or support visiting vehicle launches. This community has weekly access to the SDIL to review new operations products and procedures, and any vehicle subsystem can request time to perform tests to resolve on-orbit anomalies, troubleshoot off-nominal conditions, and perform general technical assessments of subsystem performance. Similarly, International Partners and visiting vehicles can use the SDIL to informally integrate new software and procedures with the U.S. segment prior to the more formal verification activity that is prerequisite to software uplink.

The SDIL offers the fidelity and tools to aggressively troubleshoot on-orbit anomalies. The SDIL played a critical role when the ISS experienced a failure scenario affecting the three redundant command and control computers, briefly leaving the station without command and telemetry capabilities. By utilizing the SDIL to replay on-orbit scenarios and investigate anomalous behavior, the Flight Software Team and Operations personnel quickly traced the failure to a hardware design defect in the command and control computer. Carefully analyzed lessons learned from this incident resulted in further software improvements that provide better resilience to primary failure and the ability to transition to lower-level backups that enable low-rate commanding and telemetry. These contingency strategies continue to evolve today in light of the need to respond to the practicalities of station power balance and communication upgrades.

Payload customers are increasingly taking advantage of the SDIL to demonstrate end-to-end capability with their payloads, which are qualified at the respective development sites and with the utilization of simulators in venues such as the Huntsville Payload Verification Facility. The Houston SDIL facility allows payload development centers and the Huntsville Payloads Operations Center to integrate payloads in the SDIL. A capability was recently added to allow users at any remote location to directly command assets in the SDIL, including payloads there, and observe system responses at the controlling site. They can demonstrate end-to-end performance from the payload operations center, through the SDIL C&DH assets, to the payload and back. SDIL connectivity with operations centers in Germany, Japan, and Russia provides vehicle and payload testing remotely, just as if they were communicating directly with the on-orbit vehicle. The value of this capability was demonstrated when recent damaging storms disrupted Huntsville facility operations. Huntsville users connected remotely from their own homes to operate SDIL test assets, and successfully maintained software upgrade schedules.

As the ISS Program addresses the challenge of new space exploration facets and stakeholders, the team is adapting to the added complexity by integrating new U.S. visiting vehicle capabilities into the SDIL. When complete, the same service will be offered to commercial crew vehicle developers to prove interfaces and procedures prior to on-orbit execution.

This world-class facility, now in operation for 18 years, will continue to serve the ISS community for at least another decade. Constant updates continue to add value by reducing the overhead cost curve, enhancing maintainability, and giving developers faster access. Over the last three years, the cost of operations and test was reduced by 11% (against 2012 levels), yet test throughput has increased by 18%, including a doubling of integrated payload testing year over year.
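Taken together, the two figures imply a larger drop in the cost of each unit of testing than either number suggests alone. The Python sketch below works through that arithmetic, assuming (this is not stated in the submission) that the 11% cost reduction and the 18% throughput increase are measured against the same baseline; the function name is hypothetical.

```python
def unit_cost_change(cost_change: float, throughput_change: float) -> float:
    """Return the fractional change in cost per unit of test throughput.

    Both arguments are fractional changes against the same baseline,
    for example -0.11 for an 11% reduction and 0.18 for an 18% increase.
    """
    if throughput_change <= -1.0:
        raise ValueError("throughput cannot fall by 100% or more")
    return (1.0 + cost_change) / (1.0 + throughput_change) - 1.0


# Assumed inputs: 11% lower cost and 18% higher throughput, same baseline.
change = unit_cost_change(cost_change=-0.11, throughput_change=0.18)
print(f"{change:.1%}")  # prints -24.6%
```

Under that assumption, the cost per unit of test throughput falls by roughly a quarter.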
These savings and efficiencies enable program choices about new capabilities that enhance payload support. An alternative payload integration capability is planned to allow users to remotely connect a payload to the SDIL, further reducing end-to-end performance demonstration cost and buying down on-orbit integration risk. Expanded Ethernet capabilities on-orbit came with improvements in the lab, permitting gigabit test data capture and analysis to fully support payload integration and troubleshooting. Boeing recognizes that payload developers are eager to use these new capabilities to remotely integrate and test upgraded software loads in a flight-like environment, with the lower cost and enhanced support of participation from their home site. The promise is a more thorough checkout, without having to physically transport payload hardware.

Only with the above meticulous levels of integration is the ISS successfully sustained as operational 24 hours a day, seven days a week, 365 days a year. This leadership and execution enable the ISS program to excel at ensuring crew safety, on-orbit functionality, and productivity to meet the challenging requirements of this one-of-a-kind, world-class international laboratory situated in the most challenging frontier for humankind: space!