A Shining announcement: major improvements coming to SL

Yesterday Linden Lab announced a major series of new initiatives aimed at improving the overall SL experience. The announcement came via a Tools and Technology blog post, which covers the initiatives in great detail. They focus on four main areas of activity: one relates directly to hardware and infrastructure, while the remaining three focus on the platform itself and are grouped under the Shining project banner.

The hardware / infrastructure element of the work is described thus:

This year, Linden Lab is making the single largest capital investment in new server hardware upgrades in the history of the company. This new hardware will give residents better performance and more reliability. Additionally, we are converting from three co-locations to two co-locations. This will significantly reduce our inter-co-location latency and further enhance simulator performance.

The Shining project is something already known to many SL users – especially those who attend some of the User Group meetings. It is perhaps most famously associated with the Lab’s work on the Viewer rendering code, removing outdated functions and calls no longer supported by modern graphics hardware and drivers (most notably Nvidia’s) and improving graphics handling overall. Shining has also been responsible for other incremental improvements around the streaming of objects and avatars.

Under the new initiative, Shining is split into three core performance projects.

(Image: bake fail – a familiar problem for many)

Project Sunshine: One of the biggest complaints from SL users relates to avatar rezzing. This can appear slow, and usually manifests in avatars remaining grey for periods of time, or in skin and system clothing remaining blurry – or, at its worst, in a user changing their avatar’s outfit while others continue to see the avatar dressed in the previous outfit, or naked. Collectively, these issues are known as “bake fail”. They arise because the Viewer has to do all the compositing of avatar textures locally, then send the results to the SL servers, which in turn send the information back to the simulator the avatar is in, so that it can be accessed by other Viewers in the same simulator.

Under Project Sunshine, to précis the blog post, much of this work is moved server-side, using a new, dedicated server – the Texture Compositing Server – which is separate from the simulator servers. This effectively allows all the “heavy” communications and calculation work relating to avatar textures to be performed within LL’s servers and across their own internal network, removing the reliance upon the Viewer and upon Viewer / server communications, which lie outside of LL’s control.
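
To make the shift in responsibility a little more concrete, here is a purely illustrative sketch – the function and structure names are my own invention, not LL’s – of roughly where the compositing work sits before and after Project Sunshine:

    # Purely illustrative sketch - none of these names reflect LL's actual code.

    def composite(layers):
        """Stand-in for baking a stack of avatar texture layers into one texture."""
        return "baked(" + "+".join(layers) + ")"

    def current_flow(outfit_layers, simulator):
        # Today: the Viewer composites locally, uploads the result over the open
        # internet to the simulator, and other Viewers then fetch it from there.
        baked = composite(outfit_layers)          # done on the user's PC
        simulator["avatar_bake"] = baked          # Viewer -> simulator upload
        return simulator["avatar_bake"]

    def sunshine_flow(outfit_layers, compositing_server, simulator):
        # Project Sunshine: a dedicated Texture Compositing Server does the work
        # inside LL's own network; the Viewer no longer uploads baked textures.
        compositing_server["avatar_bake"] = composite(outfit_layers)
        simulator["avatar_bake"] = compositing_server["avatar_bake"]
        return simulator["avatar_bake"]

    print(current_flow(["skin", "tattoo", "shirt", "pants"], {}))
    print(sunshine_flow(["skin", "tattoo", "shirt", "pants"], {}, {}))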

Object Caching & Interest Lists: This is intended to directly address another common request from users: improving how the Viewer handles local object caching. This effectively means that once the Viewer has information relating to a specific region, and providing the information is still valid (i.e. there have been no changes to objects that the Viewer already has cached), then it will no longer need to re-obtain that information from the server. Only “new” or “changed” data needs to be streamed to the Viewer. This should mean that on entering a previously visited region, the Viewer should immediately be able to start rendering the scene (rather than requesting a download from the server), while simultaneously requesting any “updates” from the server through a comparison of UUID information and timestamps.

HTTP Library: The final aspect of Shining’s three-phase approach is to improve the underlying HTTP messaging that is crucial to simulator / simulator and simulator / Viewer communications (and thus key to the other elements of Shining), through the implementation of “modern best practices in messaging, connection management, and error recovery”.
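
The blog post doesn’t spell out which practices these are, but one simple example of the kind of error recovery involved would be retrying transient failures with exponential back-off over a persistent connection – a minimal sketch, and in no way a reflection of LL’s actual library:

    # Minimal sketch of retry-with-back-off over a re-used connection;
    # purely illustrative, not LL's HTTP library.

    import http.client
    import time

    def fetch_with_retries(host, path, attempts=4, base_delay=0.5):
        conn = http.client.HTTPSConnection(host)      # persistent (keep-alive) connection
        for attempt in range(attempts):
            try:
                conn.request("GET", path)
                response = conn.getresponse()
                body = response.read()
                if response.status < 500:             # success or client error: stop retrying
                    return response.status, body
            except (http.client.HTTPException, OSError):
                conn.close()                          # transient failure: reset the connection
                conn = http.client.HTTPSConnection(host)
            time.sleep(base_delay * (2 ** attempt))   # exponential back-off before retrying
        raise RuntimeError(f"Gave up on {host}{path} after {attempts} attempts")

    status, _ = fetch_with_retries("example.com", "/")
    print(status)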

Overall, Shining will be tackling some of the major causes of Viewer-side lag and user frustration: avatar bake fail, and the complexity and wastefulness of scene rendering encountered when moving around SL.

No definitive time frames for the improvements have been given with the announcement – and this is understandable; there is a lot to be done, and matters are complex enough that LL will want to proceed with minimal disruption to the grid and to users. Doubtless more information will be made available as it becomes known, through the LL forums and (perhaps more particularly) via the relevant User Groups.

33 thoughts on “A Shining announcement: major improvements coming to SL”

  1. Third version of this news I’ve read this morning (after Tateru and the LL forum itself), and I still read “compositing” as “composting”…

    1. Now, now!

      I actually missed the blog post coming out last night, despite being aware Friday is the day LL seem to announce news. Was a bit of a wet-fish slap when I scrolled through the blog listing this morning. Should’ve checked again last night before bed! :).

  2. And once many of the viewer-side issues are moved to the servers, the web based interface can be accomplished.

    1. Excellent point 🙂

      Although LL said they’d be deploying Havok on the client-side as well, to offload some server-side issues to the clients, so I wonder how they’ll tackle that on the Web…

      BTW there is a web-based interface for SL: http://www.builtbuy.me/ It works nicely.

        1. Hmm. Their system is now over quota (it usually is in the GMT late afternoon), so I cannot check it out… Tipodean’s viewer is built upon Unity3D, which will require installing the Unity3D plugin for your web browser. Is that the “irrelevant product” you’re mentioning?

          It was working fine in the morning, though 🙂

  3. Is there any indication if or when LL will address the issue of lag at busy events and perhaps raise the limit of the number of avatars a sim can support?

    1. In 2004, when the limit was merely 40 avatars, we used to say: “in about 2-3 years”.

      It took LL a decade to go from 40 to 100 avatars, and we all know 100 avatars is really just wishful thinking.

      My best guess now is “in about 20-30 years” unless LL gets some sense in their bright but stubborn minds and asks Intel to implement Distributed Scene Graph on top of LL’s technology which deals magnificently with no lag and 800+ avatars on a single region. Intel’s software is free.

      What are the chances of that happening in our lifetimes? About as high as a snowball in Hell…

      1. Hmmm. That’s really interesting, Gwyneth. The Intel link mentioned OpenSimulator, so does that not imply that could be easily back-ported to Second Life?

        1. No! 🙂 The two technologies are utterly incompatible on the simulator side! And LL is not going to “give” access to their own technology to anyone else but IBM (and probably not even that, since I presume their agreement with IBM has expired…), so that’s why I think it’s “extremely unlikely” that LL will ever use Intel’s amazing technology on their own servers…

          1. While I totally agree that it might have been the major reason for (mainstream) business and media to have lost interest in Second Life, I cannot agree with you on the education and/or training/simulation aspects: there are at least two reputable academic journals devoted (almost) entirely to publishing academic articles related to SL, and who knows how many more that regularly publish work done by academic researchers in SL. During the “golden years” of SL, it was only with a lot of reluctance that academia accepted a paper about SL for publication; these days, indexing databases and things like Google Scholar are flooded with academic articles on SL to the point that it’s hard to find anything, there is so much variety…

            Not to mention that, these days, hardly any conference related to education fails to have at least one presentation about work done in SL. It might be more surprising to see fields like history and archaeology also including speakers presenting their work in SL (well, and OpenSim too). There are still SL-only conferences around. To be honest, except for conferences on games, you don’t see any conferences about virtual worlds flooded with presentations about SL (games have other platforms to address).

            “IMVU”, as an example, gets 1200 hits on Google Scholar — many of which merely mention IMVU side-by-side with Second Life — and OpenSim/OpenSimulator don’t fare much better. “Second Life”, by contrast, gets 43k hits! — three times as many as World of Warcraft.

            This coming from someone who, in the same week, submitted a book for publication, an article for a journal, and a presentation for a conference… all about SL. Granted, I might be a tad biased…

            So, no, on the education front, SL is anything but forgotten. In fact, you could argue that pretty much everything has been forgotten except SL, where research thrives, and way more than a few years ago…

        2. Perhaps you’re right, Gwyneth – you’re clearly better informed than me. However, I did mean SL as a platform for Education, rather than people writing about SL.
          I have a developer friend who is actively involved in using SL technology as a platform for education (as in virtual classrooms and the like), and they have moved completely to OpenSim, as using their own server in a closed system just makes far more sense for a classroom than SL itself.

          1. Those people “writing about SL” are mostly writing about the work they do in SL 🙂 When they talk about using SL for education, they describe their experiences (or the experiences they’ve registered) in doing education classes in SL.

            Nevertheless it’s quite true that, these days, a lot more is being done on OpenSim as well, but for only one reason: cost. These days it’s next to impossible to get enough funding to support a long-term research facility in SL just for doing classes. There are obvious exceptions, but far fewer than before. Still, from discussing issues with educators, it’s really just the cost that has moved them to OpenSim, and rarely any other issue (with some marginal cases where the ability to have control over registrations is important, or where adults need to teach younger students in a less “bureaucratic” environment). Most educators are actually sorry to have to forfeit the vast amount of available content in SL and work with the little that can be found on OpenSim. And, of course, a very large educational project which intends to attract a vast audience will have to deal with the complex issue of either hosting it on a commercial OpenSim grid, where they will have stability and good support, but which is closed to external visitors and will only be able to reach a few scattered thousands; or hosting it on a HyperGrid-enabled grid, which will rarely have the same kind of performance and stability offered by commercial OpenSim grids (not to mention SL!). It’s a tricky decision, but it’s clear that the main reason is really, really, just the high cost of tier in SL.

            In my country, for example, for the cost of two SL islands you can give a grant to a full-time graduate student to set up a campus grid with as many islands as you need for your project. For the cost of four islands you can get a full-time PhD student! When evaluating the funding needs, it’s far, far easier to get support to pay for highly qualified human resources to work on your project than to get a handful of islands leased from LL…

        3. … and now I should have read your comment before replying!! hahahaa I’m so sorry, I’ve just made a complete fool of myself 🙂 And I totally forgot about that article… which is actually still rather pertinent. There is just one thing that has changed in the past two years: I was over-optimistic about the adoption of HyperGrid to interconnect all OpenSim grids. In fact, the larger a commercial OpenSim operator grows, the more likely they are to isolate their grid, just like LL does (and for the same reason). So while all islands on OpenSim, added together, actually represent an equivalent size to LL’s SL Grid, this number is misleading, as 1) the number of active users is about 1/30th of SL’s and 2) all those islands are scattered among literally hundreds or even thousands of tiny grid operators, most of which are not accessible via HyperGrid. That means no content transfer among them, the need to register the same avatar over and over on all those tiny grids, and the struggle of trying to bring your inventory with you as grid operators fail and you have to move elsewhere. In 2010 I naively expected that OS grid operators would realise that to “compete” with LL they had no choice but to join forces and offer a “metaverse” — a federation of several interconnected grids, where a single login and a single inventory would be able to visit myriad islands on hundreds or thousands of small operators.

          This is not going to happen.

          Thus, when visitors are important for an educational project, the only real choice is LL’s SL Grid, in spite of its prohibitive costs…

  4. Although LL has said “the biggest investment this year”, they were careful not to give any precise dates on when we could expect some of these changes to actually be deployed… which is consistent with their policies 🙂

    I guess that the best news is that at least for the next half-year we have some roadmap, which is better than “no news at all” or “no commitments, no comments”.

    The new hardware and higher density of sims-per-server (hence the reduction in co-location facilities from 3 to 2: more sims-per-server means lower server rental costs and less bandwidth allocation, which means running the same grid for a lower cost) might lead to enough cost savings that LL might consider changing the pricing structure — probably not an outright tier decrease, but a more complex calculation involving Land Impact, which might be confusing at first. We’ll see. It’s no coincidence that this “announcement” comes with Cloud Party’s open beta getting so much exposure in the SL media (although I had just logged in and saw that just a dozen people were around… not exactly exciting for a weekend); they might wait for Cloud Party to announce their prices first to see how they will compete with them.

    Making the cache work properly is most definitely a very welcome change. It’s stupid, I know, but I had seriously suspected that objects were not really “cached” in the sense we usually employ the word. This announcement tends to confirm my suspicions. Of course, textures are a bigger problem than objects (the new cache might make a bigger difference for meshes!), but I noticed earlier today that the new setting to compress textures in the graphics card actually produces surprising results on my ancient hardware (while I was expecting rather the reverse) — at least during GMT morning & afternoon.

  5. Gotta love how Linden Lab makes bug fixing their product sound like new features. Very glad they at least decided to communicate these bug fixes; let’s get some more transparency and more updates on the progress of this – they are way too slow when it comes to implementation…

  6. As with any announcement, I’m giving this the “Acta, non verba” treatment.

    Would be nice, but until I see it happen, not going to get all SQUEEEEE! or GRRRRRR!

    -ls/cm

    1. Best course of action to take.

      Will be doing the same and attempting to watch progress over coming weeks / months.

  7. I’m a bit confused. I always assumed the point was to reduce load on the simulator by letting the viewer (client) do it (bake your avatar). It seemed like a simple way to offload to the client a task that really is NOT that complicated, even for the slowest PC. The real problem with bake fail is the utter mess their congested internal network is, and what a failure HTTP texture GETs are. But at least they are addressing this too.

    However, reducing the co-locations down to two would seem to make for an even more congested networking nightmare than it already is. It’s not really about distance and how far away each co-location is from the master asset server; it’s the narrow-bandwidth, cumbersome, un-optimized cluster F** within their infrastructure. If anything, dropping one co-location and consolidating into the remaining two is simply a cost-cutting measure.

    1. I was a bit confused too, but this might make some sense. Currently, to bake all textures on an avatar, the viewer has to grab a dozen textures or so, bake them locally, upload to the sim it’s connected to, and hope that everybody sees what they see. So all avatars on the sim need to get that texture first, which means that on a lagged sim it will be happily trying to send baked avatar textures to everybody else, failing, trying again, failing, etc…

      Using this new model, if I understand correctly, the avatar texture gets baked on the server and sent to Amazon’s cloud to be downloaded by everybody else. Until you change clothes again, everybody will be retrieving your texture from Amazon and not from the sim. Of course, when you do change clothes, the sim you’re on will need to bake the textures again and send a new copy to Amazon. So the trick will be to make sure you look great at home before hitting a laggy party 🙂 (well, isn’t that what we all do iRL as well??).

      This might go a long way towards sparing very laggy sims all the trouble of sending baked textures to everybody (100 avatars on a sim means that at least 10,000 textures have to be sent).

      I agree that consolidating on just two co-location facilities is mostly a cost-saving measure 🙂 But LL has always had problems with multiple co-location facilities, it’s been a plague… HTTP textures & objects & map & who knows what else, however, all run from Amazon, where there is no congestion. So I can imagine that at some point LL will reduce everything to a single co-location and push whatever they can to the cloud — making sims track only avatar position and local chat and little else. While I’m sure that cost reduction is the primary goal, the more things are pushed to Amazon and to the viewer (think Havok and handling physics locally), the less burden will be placed on the sims…

      Still, if I were LL, I would run the sims on the cloud too 🙂 But that’s another story…

      1. “Until you change clothes again, everybody will be retrieving your texture from Amazon and not from the sim. Of course, when you do change clothes, the sim you’re on will need to bake the textures again and send a new copy to Amazon.”

        Not necessarily.

        The Compositing Server will apparently have access to a database of generated textures which can be re-used, rather than any new server-side calculations having to be performed. What’s more, this will apparently be done regardless of the avatar that originally wore the outfit.

        Thus, if you change into “outfit X”, and the Compositing Server has data for that outfit in its database – even if previously worn by “Avatar Y” – it can simply pull the information from its database and send it to the server.

        I assume this operates on some form of UUID look-up based on system layers, rather than simply recording entire outfits, in order to avoid any unnecessary calculations – otherwise how would it handle people’s preference to wear, say, a shirt layer from one outfit, a pants layer from another, and so on?

        As an aside, and *if* I’m understanding the process correctly, I do wonder if this might not also be – and while it might be a long time in seeing the light of day – the first step in overhauling the asset management system? This is obviously complete, non-technical speculation on my part, but the implementation of the Compositing Server does seem to raise further ancillary questions.

        1. Oh… aye, I might have not understood how the Compositing Server works. I thought it would mostly bake a texture for a specific avatar and put it into the cloud.

          But I don’t understand one thing. If “outfit X” has been worn by “avatar Y”, then how can “avatar Z” benefit from a pre-baked texture, since the skin is part of the baking process and Y and Z wear different skins?

          1. I agree that skins have had me scratching my head, as they form the “base” layer for baking, and so there must be some re-calculation going on – hence I agree with some commentators that this is a “black box” (but not necessarily with the contention that it is merely “maintenance”).

            Hence my speculation regarding the asset Db. In reference to the Compositing Server, I’m pretty much assuming that by referring to it as “separate from the sims servers”, LL may mean it is “sitting between the viewer / simulator and the asset database.” If you follow that.

        2. From what I can glean from the still very unfinished and still-changing sunshine repo, the compositing server will be entirely separate from the sim: “set the location of the Agent Appearance service, from which we can request avatar baked textures if they are supported by the current region”.

          The sim will tell the viewer it has ‘useServerTextureBaking’ implemented and the viewer will then fetch the bakes from the agent appearance service rather than the requestLayerSetUploads() service sims use now.
