April offers a look at the October 2019 woes

The period from Thursday, October 24th through Sunday, October 27th, 2019 saw Second Life encounter a rolling set of issues which finally came to a head on the Sunday. The issues affected many Second Life users and services, from logging in through to inventory / asset handling.

As has become the case with these matters, April Linden, the Second Life Operations Manager, has provided a post-mortem blog post on the issue and her team’s work in addressing the problems. And as always, her post provides insight into the complexities of keeping a platform such as Second Life running.

In short, the root cause of the weekend’s upsets lay not with any of the Second Life services but with one of the Lab’s network providers. Matters were exacerbated by the fact that the first two times the problem occurred, on Thursday and Friday, it appeared to correct itself before the Lab could fully identify the root cause.

April Linden

On Sunday, the problems started up again, but fortunately April’s team were able to pin down the issue and commence work with their provider – which obviously meant getting Second Life back on an even keel was pretty much in the hands of a third-party rather than being fully under the Lab’s control.

Our stuff was (and still is) working just fine, but we were getting intermittent errors and delays on traffic that was routed through one of our providers. We quickly opened a ticket with the network provider and started engaging with them. That’s never a fun thing to do because these are times when we’re waiting on hold on the phone with a vendor while Second Life isn’t running as well as it usually does.

After several hours trying to troubleshoot with the vendor, we decided to swing a bigger hammer and adjust our Internet routing. It took a few attempts, but we finally got it, and we were able to route around the problematic network. We’re still trying to troubleshoot with the vendor, but Second Life is back to normal again.

– Extract from April Linden’s blog post

As a result of the problems, April’s team is working on moving some of the Lab’s services to make Second Life more resilient to similar incidents.

During the issues, some speculated that the problems were a result of the power outages being experienced in California at the time. As April notes, this was not the case: while Linden Lab’s head office is in San Francisco, the core servers and services are located in Arizona. However, efforts to resolve the issues from California were affected by the outages, again as April notes in her post.

It’s something I’ve noted before, and will likely state again: feedback like this from April, laying out what happened when SL encounters problems, is always an educational / invaluable read, not only explaining the issue itself, but also providing worthwhile insight into the complexities of Second Life.

2019 Simulator User Group week #44

Abrahamstrup, September 2019 – blog post

Simulator Deployments

At the time of writing, no deployment notes had been published. However:

  • There was no deployment to the SLS (main) channel on Tuesday, October 29th, leaving it on server release 2019-10-03T01:12:11.531528.
  • There are two RC deployments planned for Wednesday, October 30th:
    • 2019-10-24T19:07:13.532143, comprising further internal script improvements, internal logging changes and improvements to simulator state saves.
    • 2019-10-26T00:06:48.532192, comprising a previously released hotfix for teleports being 5%-7% less reliable, plus a change that makes the simulator take a little longer to report as “Up” to the Lab’s internal tools, so that the status more accurately reflects when residents can actually access a region.

SL Viewer

The Project Muscadine (Animesh follow-on) project viewer updated to version on October 28th. The update brings it to parity with the release viewer, but contains no project updates.

The rest of the viewer pipelines remain as follows:

  • Current Release version, formerly the Vinsanto Maintenance RC viewer, dated September 17th, promoted October 15th – NEW.
  • Release channel cohorts (please see my notes on manually installing RC viewer versions if you wish to install any release candidate(s) yourself):
  • Project viewers:
    • Copy / Paste viewer, version, October 21st.
    • Legacy Profiles viewer, version, September 17th. Covers the re-integration of Viewer Profiles.
    • 360 Snapshot project viewer, version, July 16th.

Script Event Order

Asked whether the script updates would affect the order in which script events are handled, Rider Linden stated:

Some events have always had priority just by virtue of the order in which they were collected. The order of collection has changed. For instance, sensor events were collected and posted before chat events and then touch events. Chat events are now posted immediately upon processing in the simulator. It should still be FIFO… just don’t bet on what event gets collected when.

In addition, it was noted with regard to event messages:

  • Generally, the order of event handling should not be counted on in any sort of coding, since it may change again in the future.
  • Link messages:
    • If multiple link messages are sent from a single source to a single receiver script, the ordering should be preserved. Similarly, when llMessageLinked is used to send messages from script A to script B in the same prim, they are posted immediately, and the order is maintained.
    • If the same message is sent to link A and then to link B, the order in which the links receive it is not always the same. Similarly, if script A and script C are both using llMessageLinked to post to B, all bets are off as to which gets there first.
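
The ordering notes above can be illustrated with a small model. What follows is a minimal Python sketch, not LSL and not the simulator's actual scheduler: the helper names `interleave` and `per_sender` are hypothetical, and a seeded random choice stands in for the unspecified order in which messages from different senders arrive. It assumes only what is stated above: messages from one sender to one receiver stay in order, while the interleaving between senders is not guaranteed.

```python
import random
from collections import defaultdict


def interleave(streams, rng):
    """Merge per-sender message lists into one delivery order.

    Each sender's own messages stay first-in, first-out. Which sender is
    served next is picked at random, standing in for the unspecified
    cross-sender order (an assumption of this model).
    """
    pending = {sender: list(msgs) for sender, msgs in streams.items()}
    delivered = []
    while any(pending.values()):
        # Only senders that still have queued messages are candidates.
        sender = rng.choice([s for s, queue in pending.items() if queue])
        delivered.append((sender, pending[sender].pop(0)))
    return delivered


def per_sender(delivered):
    """Group a delivery order back into one message list per sender."""
    grouped = defaultdict(list)
    for sender, msg in delivered:
        grouped[sender].append(msg)
    return dict(grouped)


# Scripts A and C both post to receiver B.
streams = {"A": ["a1", "a2", "a3"], "C": ["c1", "c2"]}
run1 = interleave(streams, random.Random(1))
run2 = interleave(streams, random.Random(2))

# Whatever the interleaving, each sender's own order survives.
print(per_sender(run1) == streams)
print(per_sender(run2) == streams)
```

In practice this suggests a receiving script should depend only on the order of messages from a given sender, for example by tracking state per sender, rather than on which sender's message happens to arrive first.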