Even a robot can have a memory lapse

On Wednesday February 27th, after some 200 days of near-flawless operations on Mars, Curiosity had its first major malfunction. Up until that point, the rover had been operating using one of its two on-board computers – the so-called “A-side” – to process all command instructions and manage its activities on Mars.

The problem was first noticed by mission planners when the rover failed to send any recorded information during routine uplinks to Earth, instead only sending current status information. On examination, this data revealed the computer had failed to enter its usual “sleep mode” as planned during the overnight period on Mars. Diagnostic work using one of the test rigs at JPL indicated that the problem appeared to be a corruption in the A-side computer’s flash memory module.

As a result of this finding, all science work on the rover – including the analysis of samples obtained from inside the “John Klein” bedrock – was suspended on Thursday February 28th, as the rover was instructed to switch over to the “B-side” computer. This was powered up into a “safe mode” of operation so that the rover’s functions could be maintained while the cause of the corruption on the A-side was investigated further.

“We switched computers to get to a standard state from which to begin restoring routine operations,” Richard Cook, project manager for the Mars Science Laboratory Project at JPL, commented at the time of the switch-over.

Mars Odyssey: swapped computer "sides" in November 2012

Memory corruptions aboard space vehicles are not uncommon, which is why the majority of NASA’s space missions carry redundant computer configurations. Corruptions can have several causes. Recently, for example, the Mars Odyssey orbiter had to switch over from its “A-side” to its “B-side” after 11 years of constant operation finally took their toll on the “A-side” and wore it out. High-energy solar and cosmic ray strikes can also cause problems, even when the vehicle is shielded (as Curiosity is).

What made the problem with the MSL rover critical is that it occurred with the memory module which acts as the “table of contents” for accessing the computer’s memory, preventing data and instructions from being accessed and causing the computer to enter into an “endless loop”.

Also commenting on the switch-over, Magdy Bareh, leader of the mission’s anomaly resolution team at JPL, said, “While we are resuming operations on the B-side, we are also working to determine the best way to restore the A-side as a viable backup.”

Since the switch-over, the JPL team has been working on two tracks with the rover: the anomaly team has been attempting to understand precisely what caused the “A-side” module to become corrupted, while mission personnel have begun a step-by-step process of updating the “B-side” computer with all the data and information required to resume full operations.

A self-portrait: Curiosity images itself using the Mars Hand Lens Imager, mounted on the turret at the end of its robot arm, to generate this mosaic as it sits on the “John Klein” bedrock at “Yellowknife Bay”. The images were captured in early February, before the computer glitch.

“We need to go through a series of steps with the B-side, such as informing the computer about the state of the rover – the position of the arm, the position of the mast, that kind of information,” Richard Cook stated, updating the media on Monday March 4th, after the rover had successfully switched from “safe mode” to “recovery mode” on Saturday March 2nd and had resumed communications with Earth via its high-gain antenna on Sunday March 3rd.

Work on getting the rover back up to full capacity suffered a slight setback after a moderate solar eruption saw a cloud of charged particles ejected from the Sun at more than 3.2 million kilometres an hour (2 million miles an hour) – directly towards Mars.

Unlike Earth, Mars does not have a planet-wide magnetic field to help protect it from solar outbursts, so rather than having geomagnetic storms, Mars tends to experience sudden, sharp spikes in radiation levels.

While this solar eruption wasn’t particularly aggressive, and any resultant radiation spike was likely to be well within Curiosity’s operational tolerances, mission planners nevertheless decided to order the rover to return to its sleep mode on March 6th, purely as a precaution in the wake of the “A-side” glitch, until the worst of any radiation increase had passed. As it turned out, the cloud of charged particles was reported by NASA’s other surface and orbital vehicles to be far less energetic than had been predicted, and Curiosity was once again on the road to recovery by March 8th. Currently, it is anticipated that the rover should reach a “fully operational” status using the “B-side” computer next week.

All images courtesy NASA / JPL