A software issue that forced NASA's Mars rover Curiosity to switch to a redundant main computer has been resolved, enabling the unmanned vehicle to resume its weather, radiation and rock sample monitoring. Curiosity had been running on its "B-side" computer without redundancy for nearly a month following a memory glitch, while a more recent file error prompted it to go into a precautionary standby mode for more than a week.
"We are back to full science operations," said Curiosity Deputy Project Manager Jim Erickson.
The Curiosity rover includes two redundant main computers so that one can take over if the other fails. Each operates its own set of hard-linked instruments, including six engineering cameras, and only the instruments linked to the active computer can be used. The A-side computer was active until February 28, when a memory error prompted a switch to the B-side. NASA engineers then had to confirm that the B-side cameras would work properly. The Curiosity rover is now operating on the B-side computer for the first time since April 2012, when the vehicle was on the way to Mars.
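As a rough illustration of the arrangement described above, the following Python sketch models two computers, each with its own six engineering cameras, where only the active side's cameras are usable. It is a toy model, not Curiosity's flight software: the class names, the camera labels and the `report_fault` method are all hypothetical.

```python
from dataclasses import dataclass, field


def make_cameras(side: str) -> list[str]:
    """Return six hypothetical engineering camera labels for one side."""
    return [f"{side}-engcam-{n}" for n in range(1, 7)]


@dataclass
class Computer:
    """One of two redundant main computers (hypothetical model)."""
    name: str
    cameras: list[str]
    healthy: bool = True


@dataclass
class Rover:
    a_side: Computer
    b_side: Computer
    active: Computer = field(init=False)

    def __post_init__(self) -> None:
        # Assumption for this sketch: the A-side starts out active.
        self.active = self.a_side

    def usable_cameras(self) -> list[str]:
        """Only cameras hard-linked to the active computer can be used."""
        return list(self.active.cameras)

    def report_fault(self) -> None:
        """Mark the active computer faulty and swap to the other side."""
        self.active.healthy = False
        standby = self.b_side if self.active is self.a_side else self.a_side
        if not standby.healthy:
            raise RuntimeError("no healthy computer available")
        self.active = standby


rover = Rover(Computer("A", make_cameras("A")), Computer("B", make_cameras("B")))
rover.report_fault()  # e.g. a memory error on the A-side
```

After the call to `report_fault`, the model exposes only the B-side cameras, which is why a side swap means a different set of cameras has to be checked out before normal operations resume.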
A separate software issue on March 16 that prompted the spacecraft to automatically put itself into a "safe mode" was also diagnosed, and engineers determined how to prevent it from happening again. That issue was due to an error in which some files still in use by the rover were scheduled for deletion, Space.com reported. The fixes arrive just before a moratorium on sending commands to the rover goes into effect. From April 4 to May 1, the sun will be between Earth and Mars, which could cause interference that corrupts commands.
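The file error described above, in which files still in use were scheduled for deletion, belongs to a class of bug that a simple in-use check can guard against. The Python sketch below shows one generic approach, counting open handles and skipping busy files during cleanup. It is not NASA's fix, whose details the reports do not give, and every name in it (`FileManager`, `schedule_delete`, `run_cleanup`) is hypothetical.

```python
class FileManager:
    """Tracks open files so scheduled deletions skip anything still in use."""

    def __init__(self) -> None:
        self._open_counts: dict[str, int] = {}
        self._pending_delete: set[str] = set()
        self.deleted: list[str] = []

    def open(self, path: str) -> None:
        """Record one more user of the file."""
        self._open_counts[path] = self._open_counts.get(path, 0) + 1

    def close(self, path: str) -> None:
        """Record that one user of the file is done with it."""
        count = self._open_counts.get(path, 0)
        if count <= 1:
            self._open_counts.pop(path, None)
        else:
            self._open_counts[path] = count - 1

    def schedule_delete(self, path: str) -> None:
        """Queue a file for deletion at the next cleanup pass."""
        self._pending_delete.add(path)

    def run_cleanup(self) -> list[str]:
        """Delete queued files that nobody has open; keep the rest queued."""
        removable = sorted(
            p for p in self._pending_delete if p not in self._open_counts
        )
        for path in removable:
            self._pending_delete.discard(path)
            self.deleted.append(path)  # stand-in for a real file removal
        return removable
```

In this sketch a file that is still open simply stays in the queue and is removed on a later cleanup pass once its last user closes it, so a deletion request never pulls a file out from under running code.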
Protecting software in space
Ensuring the proper operation of spacecraft software has been an ongoing challenge for NASA, given the high stakes and low margin for error. The Curiosity rover, for instance, is a $2.5 billion investment at the center of the organization's Mars Science Laboratory mission. Despite Curiosity's nearly flawless operation, NASA has had to overcome software bugs that threatened to undermine other rovers in the past.
A recent ITWorld article chronicled the struggles with the Mars Spirit rover in 2004, in which a DOS library design flaw, a bug in third-party software and multiple configuration errors combined to nearly shut the rover down completely by overloading its memory. Engineers were ultimately able to fix the bug and reformat the memory, and Spirit operated until March 22, 2010, far longer than initially planned. It was eventually concluded that the error could have been prevented, but that a compressed development schedule led to the DOS issue being treated as low priority and overlooked.
"This kind of problem sound familiar to anyone?" ITWorld's Phil Johnson wrote. "That is, having to rush the development of something and thereby not fully develop or test the system?"
To improve software quality and minimize the likelihood of glitches, NASA now uses tools such as static analysis software. However, as the recent Curiosity errors showed, there is always room for improvement and more thorough source code analysis.
Software news brought to you by Klocwork Inc., dedicated to helping software developers create better code with every keystroke.