Project Shining: what it means for the viewer

On the 29th June, Linden Lab announced Project Shining, aimed at improving avatar and object streaming speeds. At the TPV/Developer meeting on Friday 13th July, the project was discussed in terms of how the various elements within it will affect Second Life viewers.

The following is a summary of that discussion, based on the recording of the meeting, and focused primarily on the viewer changes / updates that will be most directly seen / felt by the majority of users.

HTTP Library

Commencing at 22:30 into the recording.

The aim of this project is to improve the underpinning HTTP messaging that is crucial to simulator / simulator and simulator / Viewer communications. Monty Linden is leading this project.

Key points:

  • LL will release a project viewer containing a new “wrapper” implemented around how data is handled and a new texture fetch library (see time frame comments at the end of this article)
  • Providing there are no major problems with the project viewer, the initial code release will move to a release version of the viewer
  • This will be followed by changes to group services and a “more ubiquitous” use of the library in the viewer – which is where Oz’s warning to TPV developers comes into play, as some services and behaviours will start to change in order to improve throughput and reliability – and may even help improve the SL experience for those on older routers.

As a side note, some of this work has involved router testing aimed at determining what router hardware is compatible with Second Life. While it is hoped that work around the HTTP libraries will improve the SL experience for some using older router hardware as noted above, the tests have revealed that certain types of older router – Linksys WRT and Belkin G series routers were specifically named – are not compatible with running Second Life.

Avatar Baking

Bake fail: a familiar problem for many

Commencing at 32:38 into the recording.

The aim of this work (Project Sunshine) is to improve issues around avatar baking and to eliminate bake fail issues. It will primarily focus on moving the emphasis for the baking process from the viewer to a new Texture Compositing server. The viewer will retain some elements involved in avatar baking – the actual baking of the avatar shape (i.e. shape values and IDs) will still take place on the viewer side, for example.

Precisely how this new service will work on the server-side of things is yet to be fully determined by Linden Lab. However, work is progressing on the viewer side of the equation, with the current key points as follows:

  • The new service will use the Current Outfit folder to drive the new baking service
  • TPVs not currently supporting Current Outfits will have to implement it, otherwise they will effectively fail on avatar baking
  • The basic process will be that when it is time to send a rebake request (e.g. after a user has finished editing their appearance) the viewer must send a new message to the baking service which effectively says, “Look at the contents of my Current Outfit folder and give me back a new appearance based on that”
  • Viewers in general will have to support this new message that is sent to the service, and change how they perform the fetching of avatar textures; for the technically inclined, this will be HTTP without UDP fallback.

Currently, the plan is for LL to integrate the new way of doing avatar baking into their viewer code, which will be available for TPVs to integrate – although none of the Linden Lab 1.x code will be updated to support the new process, so this will effectively break their own Viewer 1.23.5, which is currently still in use within SL.

The viewer code will support both the “current” method of avatar baking (within the viewer itself) and the new baking service (using the Texture Compositing server) until the new service is fully rolled out across the grid. This means that if a user is in a region that does not make use of the new baking service, avatar baking will continue to be handled using the viewer-side mechanism we currently have. However, if the user is on a region that utilises the new baking service, avatar baking will be handled through that. The viewer will be able to recognise whether it is connected to a region supporting the “new” method through the region capabilities.
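The capability check described above can be sketched roughly as follows. This is only an illustration: the capability name "ServerSideBake" and both function names are hypothetical, as the meeting did not specify what the capability will be called or how the viewer code will be structured.

```python
# Hypothetical capability name; the article does not say what LL will call it.
SERVER_BAKE_CAP = "ServerSideBake"


def choose_bake_method(region_caps):
    """Return "server" if the region advertises the (assumed) baking
    capability, otherwise "viewer" for the existing viewer-side bake."""
    return "server" if SERVER_BAKE_CAP in region_caps else "viewer"


def on_region_change(previous_caps, new_caps):
    """Pick the bake method for the new region and flag a rebake
    when the method differs from the one used in the previous region."""
    before = choose_bake_method(previous_caps)
    after = choose_bake_method(new_caps)
    return {"method": after, "rebake": before != after}
```

Here region_caps stands in for whatever mapping of capability names to URLs the viewer holds for a region; a region lacking the capability simply falls through to the existing viewer-side bake.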

In order to ensure as smooth a transition to the new baking process as possible, LL are proposing a relatively long lead-in to the new service, making the code available well ahead of the new service being enabled, allowing TPVs to integrate it into experimental builds. The server-side changes will initially be implemented on a number of beta grid regions for testing with viewers there, prior to being scaled-up. The server changes will then be released onto the main grid in a controlled manner and then scaled up from there.

What Does This Mean for Users?

If all goes according to plan, and providing that you keep up-to-date with releases of your preferred viewer, this actually shouldn’t mean very much in real terms. There are however a number of things to be aware of:

  • If you use a viewer that is not updated to use the new code (i.e. the official viewer 1.23.5 or a viewer that is not updated to use Current Outfit folder and / or to support the new bake request message / HTTP texture fetch mechanism) OR you continue to use an old version of a viewer rather than updating, there will come a time when your avatar – and those around you – will not bake correctly
  • There are two issues that may occur during the transitional period when both the “current” and the “new” baking methods are in use:
    • When teleporting or crossing between regions that use the different methodologies, users will experience their avatar rebaking, as the viewer will effectively be using two sets of data for the bake process
    • If there are two adjacent regions, one of which uses the current avatar bake process while the other uses the “new” baking service, viewers in one region will not be able to correctly resolve the textures of avatars in the other region
  • It is hoped that the transitional period where both methods of avatar baking are active will only last for about two weeks.

Object Caching and Interest Lists

Commencing at 57:25 into the recording.

When you enter a region at the moment, your viewer receives a huge amount of information on what requires updating, much of it relating to things you can’t even see from your position in the region. The data is received in no particular order, with the familiar result that things appear to rez in your view in a totally random order – quite often with the thing you actually want to see being one of the last to rez due to the mechanics of Sod’s Law. What’s more, if you have previously visited the region, the chances are that much of the information being sent to your viewer is already cached.

Object caching and interest list changes: easing the pain of random rezzing

The focus of this project is to optimise both the data being sent to the viewer and the way the viewer uses information it already has cached, so that things rez faster and in a more orderly manner than is currently the case.

At this point in time, this work is in a greater state of flux than the HTTP library and avatar bake projects. This is more a process of optimisation both on the server-side of things and within the viewer itself, rather than that of new functionality within the viewer per se. There are no general time frames for this work at present, but there will be updates once things become clearer as to how the optimisation is going to be addressed.
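Purely as an illustration of what “more orderly” could mean, the sketch below drops objects beyond a cut-off distance and sorts the rest nearest-first. This is an assumption on my part: LL have not said how updates will be prioritised, and the function name, the draw_distance parameter and the data layout are all hypothetical.

```python
import math


def order_updates(avatar_pos, objects, draw_distance):
    """Drop objects beyond draw_distance and sort the rest nearest-first.

    objects is an iterable of (object_id, (x, y, z)) pairs. All names here
    are illustrative and are not taken from the viewer code.
    """
    in_range = []
    for object_id, pos in objects:
        dist = math.dist(avatar_pos, pos)  # straight-line distance
        if dist <= draw_distance:
            in_range.append((dist, object_id))
    in_range.sort()
    return [object_id for _, object_id in in_range]
```

With this kind of ordering, the thing standing right in front of you would be requested before something on the far side of the region, rather than leaving the order to chance.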

Time Frames

The precise timeframes for implementing these changes have yet to be properly defined. However, Oz Linden hopes that there will be at least a two-month period between Linden Lab making the code for each of these project elements available for integration by TPV developers into their viewers and the point at which the Lab states the code must be in use.

At the moment it is likely that the HTTP library element of the project will be rolled out first, although this is unlikely to be within the next two months, for the reason given above. Project Sunshine, dealing with avatar baking, will then follow – although how soon after has yet to be determined; as described earlier in this article, this will be a very controlled roll-out. It is possible that the object caching / interest lists part of the project may not be rolled out for another six months. However, timeframes are still in discussion within LL, so any of this may well change.

Expect updates on all three of these project elements as and when more information is supplied by Linden Lab.

Related Links

A Shining announcement: major improvements coming to SL

Yesterday Linden Lab announced a major series of new initiatives aimed at improving the overall SL experience. The announcement came via a Tools and Technology blog post, which covers the initiatives in great detail. These focus on four main areas of activity, one of which is directly related to hardware and infrastructure, and the remaining three are focused on the platform itself and are grouped under the Shining project banner.

The hardware / infrastructure element of the work is described thus:

This year, Linden Lab is making the single largest capital investment in new server hardware upgrades in the history of the company. This new hardware will give residents better performance and more reliability. Additionally, we are converting from three co-locations to two co-locations. This will significantly reduce our inter-co-location latency and further enhance simulator performance.

The Shining project is something that is already known to many SL users – especially those who attend some of the User Group meetings. It is perhaps most famously associated with the Lab’s work on the Viewer rendering code, removing outdated functions and calls no longer supported in modern graphics systems (most notably Nvidia) and improving graphics handling overall. Shining has also been responsible for other incremental improvements to issues around streaming objects and avatars.

Under the new initiative, Shining is split into three core performance projects.

Bake fail: a familiar problem for many

Project Sunshine: One of the biggest complaints from users in SL is related to avatar rezzing. This can appear slow, and usually manifests in avatars remaining grey for periods of time, or in skin and system clothes remaining blurry (see right) – and at its worst, results in a user changing their avatar’s outfit while others see the avatar either still dressed in the previous outfit or naked. Collectively, these issues are known as “bake fail” and are the result of the Viewer having to do all the compositing of avatar textures locally, then sending the results to the SL servers, which then send the information back to the simulator the avatar is in to be accessed by other Viewers in the same simulator.

Under Project Sunshine, to precis the blog post, much of this work is moved server-side, using a new, dedicated server, the Texture Compositing Server, which is separate to the simulator servers. This effectively allows all the “heavy” communications and calculations work relating to avatar texture calculations to be performed within LL’s servers and across their own internal network, removing the reliance upon the Viewer and on Viewer / server communications which are outside of LL’s control.

Object Caching & Interest Lists: This is intended to directly address another common request from users: improving how the Viewer handles local object caching. This effectively means that once the Viewer has information relating to a specific region, and providing the information is still valid (i.e. there have been no changes to objects that the Viewer already has cached), then it will no longer need to re-obtain that information from the server. Only “new” or “changed” data needs to be streamed to the Viewer. This should mean that on entering a previously visited region, the Viewer should immediately be able to start rendering the scene (rather than requesting a download from the server), while simultaneously requesting any “updates” from the server through a comparison of UUID information and timestamps.
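The comparison of UUIDs and timestamps might look something like the sketch below. It is a simplification under my own assumptions: the blog post does not describe the actual data exchanged, so the function name and the idea of a per-region “manifest” mapping each object UUID to a last-modified timestamp are hypothetical.

```python
def plan_region_load(cached, server_manifest):
    """Split a region's objects into cache hits, fetches and stale entries.

    Both arguments map object UUID -> last-modified timestamp. The manifest
    is an assumed stand-in for whatever the server actually reports.
    """
    render_now = []  # cached copy still valid: can be rendered immediately
    fetch = []       # new or changed: must be streamed from the server
    for uuid, server_ts in server_manifest.items():
        if cached.get(uuid) == server_ts:
            render_now.append(uuid)
        else:
            fetch.append(uuid)
    # Cached objects the server no longer lists can be dropped from the cache.
    stale = [uuid for uuid in cached if uuid not in server_manifest]
    return render_now, fetch, stale
```

Only the entries in the fetch list would need to travel over the network; everything in render_now comes straight from the local cache.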

HTTP Library: The final aspect of Shining’s three-phase approach is to improve the underpinning HTTP messaging that is crucial to simulator / simulator and simulator / Viewer communications (and thus key to the other elements of Shining) through the implementation of “modern best practices in messaging, connection management, and error recovery”.

Overall, Shining will be tackling some of the major causes of Viewer-side lag and user frustration in dealing with avatar bake fail and the complexity and wastefulness of scene rendering that is encountered when moving around SL.

No definitive time frames for the improvements have been forthcoming with the announcements – and this is understandable; there’s a lot to be done and matters are complex enough that LL will want to proceed with minimal disruption to the grid and to users. Doubtless, more information will be made available as it becomes known through the LL forums and (possibly more particularly) via the relevant User Groups.

Lagging behind the times

Frank Ambrose – FJ Linden – blogs on the technology side of SL; and for once I have to wag my finger at him. This is a rare occurrence for me, as Frank is one of the most straight-up and openly “honest” (for want of a better word) Lindens who posts on the blog. But this time, part of his post does not reflect the realities of the SL experience.

Before getting to the wagging however, getting feedback on the technology side in SL is always good – and to be sure, Frank has led the charge behind the scenes in making the infrastructure a lot better and more reliable; and some of the news he brings is good. Specifically, it is good to know that Group limits will soon be increased to 40 (albeit with a caveat). Raising the current limit has long been one of the highest-rated requests from residents for as long as I’ve been back in SL. That it has taken so long to get around to is a little inexcusable – but it is very welcome news. I just hope it doesn’t mark the return of a familiar trend of LL opting for “easy” fixes (and yes, I do appreciate more is required under the hood than flicking a switch to achieve this) that amount to throwing crusts to the crowd in appeasement in the hope we’ll overlook the bigger and more painful issues.

HTTP texture loading is also good news…and one hopes that all TPVs will be able to absorb the code sooner rather than later. I’ve already commented on the Mesh beat, so no need to dwell on that; same with Display Names.

Improvements resulting from Project Snowstorm I’ve yet to experience. I use a TPV, and I think it will still take a while for benefits from Snowstorm to flow outward, rather than inward. I’m sure there are a lot of LL-side benefits from the new server deployment process, but the truth is many have yet to see real benefits in terms of their overall SL experience; but we’ll give it time.

No, the “wait, what?!” reaction to FJ’s post can be found in the first of his “update paragraphs”, namely:

Here’s an interesting factoid: there are about two million teleports in Second Life every day. Previous to our recent release of Server 1.42, when an avatar teleported or crossed into a new region, everyone on the destination sim would experience a “lag” event as the simulator stalled while processing the incoming avatar. This was often experienced as “jitter” on the sim, especially evident when many avatars arrived at the same time, such as for a live event. In the new simulation code, this slow point has been moved to a separate thread. Our simulator performance profiling tools show that this lag pain point is almost entirely gone, greatly improving performance for highly trafficked regions.

WUT?? Frank, shame on you. If you really believe that this lag pain point is almost entirely gone, I can only suggest that you and your team need new “performance profiling tools” – or better yet you need to get your little pixelated bums inside SL and try Tping around the grid for more than one or two attempts.

What has happened is that the pain point has simply shifted – not gone; and rather than giving a self-congratulatory pat on the back, you could at least admit that while overall sim freezing *has* improved, lag and tp issues are still prevalent and need further investigation. Issues such as:

  • Avatars universally arriving in mid-air and getting stuck for anything up to 5-10 seconds, unable to land or fly
  • Avatars freezing immediately after landing
  • Nearby avatars *still* experiencing a (albeit momentary) lock-up when someone tp’s in nearby in a crowded sim

Over and above this, and while not the focus of FJ’s article, lag in general remains a major headache within SL, with many residents reporting it to be at least as bad as pre-1.42, if not worse.

The sim-wide freezing – down to a Mono issue – has gone by-and-large; and this is worthy of pointing out. But to use it as a blanket to cover the wider issues is neither fair nor what we expect from you, FJ, and it rather undermines the rest of the positive news contained in your blog.