April Linden explains August 22nd’s Second Life woes

Tuesday, August 22nd was not a particularly good day for Second Life, with an extended period of unscheduled maintenance with log-ins suspended and those in-world advised to refrain from rezzing No Copy objects, or making any LindeX-related transactions, etc.

If these words sound familiar (except the date), it’s because I wrote them a year ago to the day, on August 23rd, 2016, when Second Life experienced some significant issues.

Back then, the problem was the core database. The initial problems on August 22nd, 2017 weren’t software related, nor were they related to the Main (SLS) channel deployment taking place at the time. Instead, they lay with a piece of hardware, as April Linden, writing in the Tools and Technology blog, set out in another concise explanation of the problem, which started:

Early this morning (during the grid roll, but it was just a coincidence) we had a piece of hardware die on our internal network. When this piece of hardware died, it made it very difficult for the servers on the grid to figure out how to convert a human-readable domain name, like www.secondlife.com, into IP addresses, like 216.82.8.56.

Everything was still up and running, but none of the computers could actually find each other on our network, so activity on the grid ground to a halt. The Second Life grid is a huge collection of computers, and if they can’t find each other, things like switching regions, teleports, accessing your inventory, changing outfits, and even chatting fail. This caused a lot of Residents to try to relog.

We quickly rushed to get the hardware that died replaced, but hardware takes time – and in this case, it was a couple of hours. It was very eerie watching our grid monitors. At one point the “Logins Per Minute” metric was reading “1,” and the “Percentage of Successful Teleports” was reading “2%.” I hope to never see numbers like this again.
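As a rough illustration of why a name lookup failure can stall machines that are otherwise up and running, here is a minimal Python sketch. It is not the Lab's code: the function `resolve_hosts`, the stand-in `broken_resolver`, and the host names used with them are all hypothetical, and the only real API assumed is the standard library's `socket.gethostbyname`.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve_hosts(
    hostnames: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Map each host name to an IPv4 address, or None if the lookup fails."""
    results: Dict[str, Optional[str]] = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (a failed name lookup) is a subclass of OSError.
            results[name] = None
    return results


def broken_resolver(hostname: str) -> str:
    """Hypothetical stand-in for a resolver whose upstream hardware has died."""
    raise socket.gaierror(f"cannot resolve {hostname}")
```

Calling `resolve_hosts(["login.example", "sim.example"], resolver=broken_resolver)` yields `None` for every name: the (made-up) servers behind those names may be perfectly healthy, but nothing that depends on the lookup can reach them.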

Unfortunately, as April went on to explain, the problems didn’t end there, as the log-in service got into something of a mismatch once the hardware issue had been resolved. Whilst telling viewers attempting to log in to the grid that their attempts were unsuccessful, the service was telling the simulators the log-ins had been successful. Things didn’t start returning to normal until this issue had been corrected.

There is some good news coming out of this latter situation however, as April goes on to note in the blog post:

We are currently in the middle of testing our next generation login servers, which have been specifically designed to better withstand this type of failure. We’ve had a few of the next generation login servers in the pool for the last few days just to see how they handle actual Resident traffic, and they held up really well! In fact, we think the only reason Residents were able to log in at all during this outage was because they happened to get really lucky and got randomly assigned to one of the next generation login servers that we’re testing.

Testing of the new log-in servers has yet to be completed, but April notes that the hope is they will be ready for deployment soon.

Thanks once again to April for the update on the situation.


SL project updates 2017-7/1: server, viewer, “blue world” bug fix

Anduril, Anduril; Inara Pey, February 2017, on Flickr (Anduril: blog post)

Server Deployments

In short, there are no deployments scheduled for this week. The Main (SLS) channel will remain on release 17#17.01.27.323172.

While there had been an RC release planned, it apparently didn’t clear QA in time, so all three RC channels will remain on 17#17.01.27.323172 as well. However, all three channels will be restarted on Wednesday, February 15th, in keeping with the Lab’s policy of restarting channels every two weeks, whether or not there is an associated deployment.

SL Viewer

The Maintenance RC viewer was updated to version 5.0.2.323567 on Tuesday, February 14th. As reviewed in this blog, this viewer includes a number of updates and new features, including the ability to select your own preferred folders for uploading images, animations, sounds and mesh models.

Outside of this update, the viewer pipelines remain as per the end of week #6:

  • Current Release version: 5.0.1.323027, dated January 25, promoted February 3 – formerly the Maintenance RC viewer.
  • RC viewers:
    • Love Me Render RC viewer, version 5.0.2.323361, dated February 9th – rendering pipeline fixes and improvements
  • Project viewers:
    • Project Alex Ivy (LXIV), 64-bit project viewer, version 5.1.0.501863 for Windows and Mac, released on January 10
    • 360-degree snapshot viewer updated to version 4.1.3.321712 on November 23, 2016 – ability to take 360-degree panoramic images – hands-on review.
  • Obsolete platform viewer version 3.7.28.300847 dated May 8, 2015 – provided for users on Windows XP and OS X versions below 10.7.

Nvidia Driver 64-bit Viewer “Blue World” Bug

As I reported in week #4, Nvidia’s release of their 378.49 driver on January 24th resulted in many 64-bit viewer users (TPVs and the Lab’s own Alex Ivy 64-bit project viewer) seeing their Second Life world view turn decidedly blue when running with Advanced Lighting Model (ALM) disabled.

The Nvidia 378.66 driver should fix the “blue world” issue for those using 64-bit viewers with ALM disabled

On February 14th, Nvidia released the 378.66 driver package, and this reportedly fixes the SL issues.

SL project updates 2017-4/1: Server, camera pre-sets, Nvidia issue

Devin, Devin; Inara Pey, January 2017, on Flickr (Devin: blog post)

Server Deployments

As always, please refer to the server deployment thread for the latest updates.

On Tuesday, January 24th, the Main (SLS) channel was updated with the same server maintenance package deployed to the RC channels during week #3. This includes a partial fix for (non-public) BUG-3286, “Can’t move object” fail notifications (fixes for regions/objects with longer names are pending), together with enhanced server logging and minor internal server enhancements.

There will be no RC deployment on Wednesday, January 25th – but the RC region will be restarted in keeping with the Lab’s new policy of restarting the channels every 2 weeks, regardless of whether or not there is an associated deployment.

The next RC deployment is expected to be week #5 (commencing Monday, 30th January, 2017).

SL Viewer

No changes since my last update. The status of viewers in the pipeline remains thus:

  • Current Release version: 5.0.0.321958, dated December 1st, promoted December 5th, 2016 – formerly the Project Bento RC viewer
  • Release channel cohorts:
    • Maintenance RC viewer, version 5.0.1.322791, dated January 12th
  • Project viewers:
    • Project Alex Ivy (LXIV), 64-bit project viewer, version 5.1.0.501863 for Windows and Mac, dated January 10th
    • 360-degree snapshot viewer, version 4.1.3.321712, dated November 23, 2016 – ability to take 360-degree panoramic images – hands-on review
  • Obsolete platform viewer, version 3.7.28.300847, dated May 8, 2015 – provided for users on Windows XP and OS X versions below 10.7.

Camera Presets

As I noted in a recent TPVD meeting update, Jonathan Yap is working on a code contribution for the official viewer which will allow users to set and save their own preferred camera presets in the viewer.

The idea is that, like the graphics presets functionality Jonathan contributed to the viewer in 2016, users will be able to define their own placements for the SL camera around their avatar (e.g. an over-the-shoulder view, a view from overhead, etc.), which can then be saved and selected / used as required. Jonathan has only recently started on the work – which has an associated feature JIRA, BUG-2145 – but that should hopefully change once various decisions have been made by the Lab.

Nvidia Driver 378.49 + 64-bit Viewer Bug

Nvidia released their 378.49 driver on Tuesday, January 24th, and it can cause an unusual bug / issue with 64-bit viewers. The problem was first noted on Firestorm 5.0.1 (see: FIRE-20774), but I have repro’d it on the Lab’s own 64-bit project viewer (version 5.1.0.501863 at the time of writing) and on Alchemy 4.0.0 (a crash issue had prevented comprehensive testing on Alchemy 5.0.0 at the time of writing).

The Nvidia 378.49 driver bug which can occur with 64-bit viewers when ALM is disabled, as seen on a 64-bit version of Windows

The issue only manifests when Advanced Lighting Model (ALM) is disabled in a 64-bit viewer, and renders the in-world view with an odd blue tinge which almost looks like the blue colour channel is impinging on the red channel. As noted in the Firestorm JIRA, enabling ALM can prevent the issue, as can toggling Glow off when ALM is disabled. See the Firestorm JIRA for workarounds, should you encounter the problem.

How the same scene looks in the same viewer (SL Alex Ivy 64-bit project viewer for Windows, version 5.1.0.501863 at the time of writing)

The issue was raised at the Simulator User Group meeting on Tuesday, January 24th, and a JIRA for the issue on the Lab’s 64-bit project viewer is available as BUG-41294.

 

SL project updates 2017-2/1: 64-bit viewer and Monday Blues

Nagare no Shimajima, Restless Times; Inara Pey, January 2017, on Flickr (Nagare no Shimajima, Restless Times: blog post)

Server Deployments

There are no planned deployments for the week. However, all servers on the three RC regions will be subject to a rolling restart. This is in accordance with the Lab’s new policy of restarting channels every fortnight, whether or not there is an associated deployment. As the Main (SLS) channel underwent a restart on Tuesday, January 3rd, servers on this channel were not restarted this week.

SL Viewer

Project Alex Ivy

The 64-bit versions of the official viewer arrived in project viewer form on Tuesday, January 10th, under the code name Project Alex Ivy – which I take to be a reference to 64-bit (LXIV being 64 in Roman numerals, hence aLeX IVy).
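For anyone wanting to check the arithmetic behind the name, the following is a small Python sketch that converts a Roman numeral to an integer. The function name `roman_to_int` is purely illustrative and has nothing to do with the viewer code; it assumes a well-formed numeral.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a well-formed Roman numeral such as 'LXIV' to an integer."""
    total = 0
    previous = 0
    # Walk right to left: a symbol smaller than the one to its right
    # is subtracted (the I in IV), otherwise it is added.
    for symbol in reversed(numeral.upper()):
        value = ROMAN_VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total
```

Here L (50) + X (10) + IV (4) gives 64, so `roman_to_int("LXIV")` returns 64.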

The viewer, version 5.1.0.501863, has been built using the newly updated and upgraded libraries and build process the Lab has been putting together, which will also be used for 32-bit Windows builds. Thus, the project viewer is available in three flavours:

  • 64-bit Mac
  • 64-bit Windows
  • 32-bit Windows.

There is no Linux viewer as yet, but the Lab has indicated it is their intention to provide one, although TPVs and open-source contributors are likely to still be asked to help with its ongoing support.

Additionally, the following points, as specified in the release notes, should be underlined (although please ensure you read the release notes in full if you intend to try this viewer):

  • The Mac build has several known limitations:
    • There is currently no Mac Havok build, so pathfinding paths cannot be visualised, and it may not be possible to upload mesh assets.
    • Video media using QuickTime does not play.
  • The 64-bit version will not run on Windows 10 systems with Intel HD 2000/3000 GPUs and may not run on other systems that do not have GPUs explicitly supporting Windows 10.

These shortfalls will be addressed as the viewer progresses through the project and release candidate phases to release status in the next weeks / months. Once released, it will signal the end of the 32-bit Mac version of the viewer (and possibly the 32-bit Linux version). The Windows version will continue to be available as a 32-bit build as well as having the new 64-bit build available.

Also, note that this viewer doesn’t include any functional updates / changes to the existing viewer.

Remaining Viewers Pipelines

Outside of the 64-bit project viewer, the various viewer pipelines remain as per my last SL project update:

  • Current Release version: 5.0.0.321958, dated December 1st, promoted December 5th – formerly the Project Bento RC viewer
  • Maintenance RC viewer, version 5.0.1.322513, dated December 21st – some 42 fixes and improvements + Bento support
  • 360-degree snapshot project viewer, version 4.1.3.321712, dated November 23rd – ability to take 360-degree panoramic images – hands-on review
  • Obsolete platform viewer version 3.7.28.300847, dated May 8, 2015 – provided for users on Windows XP and OS X versions below 10.7.

Monday Outage

On Monday, January 9th, many users were hit with significant issues, with many finding themselves unable to log in, or being disconnected from the simulators and unable to log back in. On Tuesday, January 10th, April Linden from the Ops team posted another of her excellent post-mortem blog posts on what happened, and I recommend it as a worthwhile and informative read.

In essence, a failure within a third-party provider used by the Lab did not trigger the expected automatic switch-over of connections for all users accessing Second Life through that provider. As a result, those users were disconnected from the service, and the volume of people trying to re-connect, coupled (I assume) with those simply trying to log in, unaware of the problems, generated a backlog, forcing the Lab to bring additional log-in servers on-line.

Once again, April does an excellent job in explaining things – revealing more of the complexities of SL in the process (which, as I’ve oft said in the past, goes well beyond just the simulator servers), and also offers apologies for the Monday problems.