SL project news week 50/2: server and viewer

Server Deployments Update

The RC channel deployments went ahead as scheduled, and included the promised fix for offline IMs from scripted objects failing to reach people’s e-mail (BUG-1002). A further issue (BUG-1027) with group owners receiving garbled messages on ejecting group members was reported over the weekend of the 9/10th December, and this also received a fix which formed a part of the deployments.

As reported in part 1, Magnum received code to double the server-side memory allocation from 60KB to 120KB. Animations within SL have two core limits: loop time (30 seconds) and memory allocation (60KB). Apparently, some complex animations which ran within the 30-second time frame have been hitting the memory allocation limit. This initial change should help to ease that issue when encountered. A future change on the viewer-side of things should eventually increase the animation run-time as well, allowing for animations longer than 30 seconds to be uploaded and used.
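As a rough illustration of how the two limits interact, the sketch below checks a made-up animation against the 30-second loop limit and the old and new memory limits. The names, the sample animation and the assumption that 1KB is 1024 bytes are all mine, not anything taken from LL's code.

```python
from dataclasses import dataclass

# Limits as described above; constant names are hypothetical.
MAX_LOOP_SECONDS = 30
OLD_MAX_ASSET_BYTES = 60 * 1024    # assumes 1KB = 1024 bytes
NEW_MAX_ASSET_BYTES = 120 * 1024


@dataclass
class Animation:
    name: str
    duration_seconds: float
    asset_bytes: int


def fits_limits(anim, max_bytes, max_seconds=MAX_LOOP_SECONDS):
    """Return True if the animation is within both the time and memory limits."""
    return anim.duration_seconds <= max_seconds and anim.asset_bytes <= max_bytes


# An invented complex animation: short enough, but too big for the old limit.
dance = Animation("complex dance", duration_seconds=28.0, asset_bytes=90 * 1024)

print(fits_limits(dance, OLD_MAX_ASSET_BYTES))  # False
print(fits_limits(dance, NEW_MAX_ASSET_BYTES))  # True
```

An animation like this is the case the Magnum change should help: it runs inside the 30-second time frame, but only fits once the memory limit is doubled.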

Server Deployments Week 51

Given Magnum has all the same changes as both BlueSteel and Le Tigre, plus the additional stability improvements and memory leak fixes, it now looks likely that this code will be promoted to the Main Channel in week 51 (week commencing Monday 17th December). No news was provided during the Server Beta meeting on any proposed RC releases during week 51.

Again, as a reminder, there will be no server code releases during week 52 (commencing Monday 24th December) or week 1, 2013 (commencing Monday 31st December). There will also be no further Server Beta meetings until Thursday January 3rd, 2013.

Viewer Updates

Linden Lab continue to work on the beta and development viewers in order to clear the backlog of releases resulting from the memory leak / crash issues. Currently, they are “almost” at the end of catching-up on the release schedule. Some of the focus at present is on the Mac side of things, with Oz Linden reporting that there should be “A bunch of changes for the Mac build and the Mac implementation coming into viewer development” over the next few days.

The beta viewer has seen a further 3.4.3 code release (3.4.3.268139 on December 14th). This should mark the last of the 3.4.3 code releases for the beta viewer prior to that code moving to the release viewer, possibly in week 51. After this, the beta viewer will move to the 3.4.4 code base, which will include the changes for the Mac side of things as well. This should then see the viewer branches all more-or-less back to a normal pace of development and update, with fresh releases on the order of every three weeks or so, including more HTTP service updates and improvements.

Tcmalloc has been set off to one side in order to clear the backlog, but “has not been forgotten”. Currently it is still enabled in the beta release, but appears to be disabled in upcoming viewer development versions.

Avatar Baking

The biggest news of the week came with the announcement that for Avatar Baking, the countdown has commenced. This is going to take the next few months to implement, and requires both changes on the server-side of things and significant changes to the viewer. An update had actually been promised at the Content Creation User Group on Monday 17th December, but given the large impact the changes have on viewers, Nyx Linden rightly announced the news relating to the project at the TPV Developer meeting on Friday 14th December.

Nyx Linden discusses server-side baking at the TPV Developer meeting, Friday 14th December

Threaded Region Crossings

The work on multi-threaded region crossings is still with the LL QA team. In the meantime, further regions have been added to the simulator version (server code  DRTSIM-184) running the new code. Four of the latter are GC Test 9, GC Test 10, GC Test 15 and GC Test 16, which form a block of four regions which may assist with testing the capability (remember these SLurls are all to Aditi!). Caleb Linden has been testing the capability and reports that he has encountered some issues himself, with crashes during “automated horde testing” and with repeated crossing with heavy scripts. He’s interested in hearing constructive feedback from anyone willing to carry out informal tests on the code.

Avatar Baking: “and the clock has started!”

Update, April 6th, 2013: Please also see my updated status report.

The new avatar baking project took a step closer on December 14th, as LL started to release more in the way of technical details on the project and launched a project viewer.

Avatar bake fail

Code-named Project Sunshine, and a part of the Shining Project, this work is aimed at improving avatar baking and at eliminating bake fail issues.

The project represents a sizeable change in how Second Life works, and as such will take time to fully implement. It requires extensive changes to the viewer itself – something which Nyx Linden has previously referred to as, “Some pretty scary viewer re-architecting” – as well as to a good part of the back-end services, which is why it has taken so long for the project to mature. Because of the degree of changes taking place, Linden Lab have consistently promised, via Oz Linden, that TPV developers would receive some eight weeks’ notice prior to any initial deployment of the new service, so that they can integrate the required viewer-side code changes, test them, and ensure they can support the new service.

Speaking at the TPV Developer meeting on Friday 14th December, Oz reiterated the 8-week lead time before adding, “Today begins the clock! … You get at least two months from now before we begin rolling server-side baking out to the main grid, at least beyond a test region or two.” So while the precise timescale as to when the new baking service will start to appear on the main (Agni) grid remains open, TPVs can now start to engage in the project, a step which itself brings it one step closer to reaching the grid.

A Quick Recap: How It Is and How It Will Be

Currently, avatar baking is essentially driven from the viewer. In summary (and without drilling too much into detail), this means that when a system layer outfit or item of clothing is changed (including alpha layers), the updates are applied locally in the user’s viewer. They are then uploaded to the server the user is connected to, which then passes the updates out to the other viewers connected to it, so that other users get to see the change as well. This process has several points of potential failure. Communications between the viewer and the server may be interrupted, for example, so that the server doesn’t receive all the information pertaining to an outfit change. The result, again as just one example, is that the user sees their avatar perfectly fine, but others see the avatar as blurred / grey. In some instances, the process can fail such that while the user sees their avatar wearing the desired outfit, others see the same avatar still wearing the “old” outfit.

The new service will hopefully eliminate these issues by moving much of the emphasis for the baking process from the viewer to a new “Texture Compositing Service”. The viewer will retain some elements involved in avatar baking – the actual baking of the avatar shape (i.e. shape values and IDs) will still take place on the viewer side, for example. However, the new compositing service will take over most of the donkey-work and handle the majority of avatar baking data and communications (excluding prim-based attachments).

As with many of the new services being introduced into Second Life by LL, the new baking service will be HTTP driven (the current system is UDP protocol based) which should have an additional benefit of speeding up the entire avatar load process when logging-in to SL and in fetching textures.

How the entire process should work can be summarised as follows:

  • The new service will use the Current Outfit folder as its viewer-side driver. This means that in order to use the service a viewer must have the Current Outfit folder properly implemented
  • When a rebake request is due (e.g. after a user has finished editing their appearance) the viewer sends a message to the baking service essentially asking it to look at the contents of the viewer’s Current Outfit folder and then return an updated appearance based on the contents of that folder
  • At the same time as the data is returned to the user’s viewer, it is also sent to the simulator to which the user’s viewer is connected, so that the simulator can send the appearance information to all other viewers connected to it.
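The three steps above can be sketched in code. What follows is purely a toy model under my own assumptions: every class and method name is invented, the “bake” is reduced to a sorted list of item names rather than any real texture compositing, and nothing here reflects the actual HTTP messages LL will use.

```python
from dataclasses import dataclass, field


@dataclass
class Viewer:
    avatar_id: str
    current_outfit: set = field(default_factory=set)       # stands in for the Current Outfit folder
    own_appearance: tuple = ()                              # what this user sees on their own avatar
    seen_appearances: dict = field(default_factory=dict)   # what this user sees on other avatars

    def finish_editing_appearance(self, service):
        # Step 2: after editing, the viewer asks the service for a rebake.
        service.rebake(self)


@dataclass
class Simulator:
    connected_viewers: list = field(default_factory=list)

    def broadcast_appearance(self, avatar_id, appearance, exclude):
        # Step 3: pass the result to every other viewer connected to the simulator.
        for viewer in self.connected_viewers:
            if viewer is not exclude:
                viewer.seen_appearances[avatar_id] = appearance


class BakingService:
    """Hypothetical stand-in for the new compositing service."""

    def __init__(self, simulator):
        self.simulator = simulator

    def rebake(self, viewer):
        # Step 1: the Current Outfit folder contents drive the result.
        appearance = tuple(sorted(viewer.current_outfit))   # placeholder for real compositing
        viewer.own_appearance = appearance                  # returned to the requesting viewer
        self.simulator.broadcast_appearance(viewer.avatar_id, appearance, exclude=viewer)
        return appearance


sim = Simulator()
alice = Viewer("alice", {"shirt", "jeans"})
bob = Viewer("bob")
sim.connected_viewers.extend([alice, bob])

alice.finish_editing_appearance(BakingService(sim))
print(alice.own_appearance)            # ('jeans', 'shirt')
print(bob.seen_appearances["alice"])   # ('jeans', 'shirt')
```

The point of the sketch is that the user and everyone else receive the same result from a single source, rather than the viewer baking locally and hoping the upload arrives intact.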

To help TPV developers understand the new system and to answer their questions, Nyx Linden dropped by the TPV developer meeting on Friday 14th December. Note that what follows is an overview of Nyx’s discussion from the point-of-view of providing digestible information on the new service for “general” users, rather than an in-depth review of the full technicalities of the system and Q&A session.

Nyx Linden discusses server-side baking at the TPV Developer meeting, Friday 14th December


SL project news: week 50/1: Server, JIRA, mesh and Shining

Server Deployments

Due to the offline e-mail issue involving scripted objects, as reported in my last news update, there has been no Main Channel deployment this week. Two RC deployments are currently planned for Wednesday 12th December, however. These are:

  • BlueSteel and LeTigre: should receive the same maint-server project that rolled to Magnum in week 49, with bug fixes arising from that deployment. The release notes are available for review
  • Magnum should receive a superset of the changes scheduled for BlueSteel and LeTigre, which includes extra bug fixes, including stability improvements and a memory leak fix.  
    • The only feature new to Magnum is an increase in the allowed animation asset size – the 60KB size limit on animation assets has been raised to 120KB. This change is to allow for longer and more complex animations to be made in the future, once a viewer-side update to allow 60-second animation loops has been implemented. Magnum’s release notes can be read here.
So be sure to read them :-) (with thanks to Whirly Fizzle for the link)

Update on Key Region Issues

Physics Memory / Region Performance

As reported last time, the physics memory issues affecting some regions, which I reported in week 47, had been tracked down by Simon Linden to a Havok issue related to navmesh rebakes. His fix for this problem cleared QA and forms a part of the RC deployments for the 12th December, together with a fix for a low-level threading problem within the simulator code which has also been causing region crashes.

Offline IMs from In-world Objects Failing to Forward to E-mail

This issue, linked to calls of the form llInstantMessage(llGetOwner(), …), caused the RC deployments in week 49 to be rolled back on Thursday 6th December. A fix has been developed and tested and is included in all three RC deployments planned for Wednesday 12th December.

Code Freeze / No Change Windows

Again, to re-iterate from my last report, there will be no server-side code changes over the holiday period as follows:

  • Week 52  – commencing Monday December 24th
  • Week 1, 2013 – commencing Monday December 31st

Simon Linden still hopes that one of the code packages being deployed to the RC channels this week can be rolled to the Main Channel in week 51. There will likely be a further update on this following the Thursday Server Beta UG.

JIRA / Bug Tracker Update

Linden Lab are still mulling the September closure of the old public JIRA system. Since the initial shut-down, things have opened up a little. Additional JIRAs have been left open as read only beyond the initial triage, while others have been opened and have had their comments enabled in order to allow feedback – such as the CHUI JIRA, which is being very constructively used for comments and feedback and shows how, in an ideal world, the system might work.

Currently, it appears that “nothing definitive” has been decided on the change, although it has been under internal discussion.

Feedback from those in the two JIRA support groups (developers who have significantly contributed code and those who have in the past supplied significant support in handling JIRAs) has been interesting. It appears that the number of duplicate issues has been a lot smaller than had been feared. The overall quality of input given using the new form also appears to have significantly improved since it was introduced.


SL project news week 49

SL Viewer Updates

Things have been relatively quiet viewer-wise with only two updates to the official viewer branches. The development viewer rolled to 3.4.4.267614 on December 4th, while on December 5th, the beta viewer rolled to 3.4.3.267755. The latter included a good crop of updates, including a number of graphics and GPU support related changes, and the long-awaited snapshot tiling fix.

The MAINT-628 fix in action: an image taken with the latest beta viewer running in deferred mode and at a resolution of 3500×2154 pixels, well above the 1400×900 of my monitor – and no tiling! There are limits to how well the fix works at ultra-high resolutions, as noted in the JIRA comments included in my report on the release.

Server Deployments

Following-on from last week’s RC deployment issues, there was no main channel deployment on Tuesday December 4th, although a number of regions were restarted during the course of the day.

Wednesday December 5th saw the same maintenance release rolled to all three RC channels. This comprised the release originally aimed at Magnum in week 48, and included all the bug fixes for the problems which required the roll-back on Thursday 29th November. Initial statistics for this update during the brief time it was available last week showed a clear improvement in stability, and this seems to have continued with this week’s release, although one major issue has come to light and is under investigation.

This relates to IMs sent by scripted objects failing to trigger e-mails to the object’s owner if they are offline. The problem appears to be related to the use of llInstantMessage(llGetOwner(), …) and to affect regions on all three RC channels, although not every case where llInstantMessage(llGetOwner(), …) is used appears to be affected.

Currently, it is thought that a fix will be available for deployment during week 50, and should reach the RC channels on Wednesday December 12th.

Details of the week’s RC release can be found in the release notes and in the forum discussion thread (including some discussion on the current scripted object / e-mail issue).


SL project news: week 48/3: Interest List update

Andrew Linden continues to forge ahead with his initial work on Interest Lists, which forms a part of the Shining Project. As reported last time around, the issue of people’s HUDs appearing on other people’s screens has been fixed, and the code is currently with LL’s QA for testing.

Given the progress made, Andrew re-capped on the project and gave further insight into this initial phase of the work at the Server Beta User Group meeting on Thursday November 29th.

Object Updates

The first phase of the new interest list code is aimed at reducing the amount of information being sent to the viewer by the server. This is done through the code only sending required updates to the viewer for objects which are within the camera’s line-of-sight. Essentially, updates on objects take three forms regardless of which Interest List code is being used:

  • A “full” update – required when an object is “seen” for the first time or which is constantly updating position / appearance
  • A “terse” update – required only when an object is changing appearance / position relative to the viewer’s in-world view
  • A “please delete” update when the object has been removed from the viewer’s in-world view (e.g. it has been deleted or taken back to inventory).

The new updates are aimed specifically at the number of “terse” updates being sent to the viewer. Under the current Interest List code, these updates are continuously sent out by the server to all viewers in range of an object in motion or undergoing change, regardless of whether what is being updated (in terms of movement or appearance) is actually visible in the window of the viewer. With the new Interest List code, updates are only sent to your viewer based upon what is actually in your world view.

This means, for example, that if you can see a bouncing ball on your screen and then turn your avatar or camera so it is no longer visible to you, under the old system, data packets relative to the ball’s motion continue to be sent to your viewer, even though they are no longer required. With the new system, the updates cease shortly after the ball moves off your screen, and only resume as the ball moves back on to your screen once more (no actual content is broken by this change, it is simply a change in the amount of updates being sent from server to viewer).
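The bouncing ball example can be boiled down to a small sketch. This is a simplified 2D model of my own devising: the field of view, draw distance and function names are all assumptions, and it ignores the short delay before updates actually cease. It only shows the difference between “in range” (old code) and “in view” (new code) as the test for sending terse updates.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    x: float
    y: float
    heading_deg: float             # direction the camera is facing
    half_fov_deg: float = 45.0     # hypothetical half field of view
    draw_distance: float = 128.0   # hypothetical range


def in_range(cam, ox, oy):
    """Old behaviour: the object only needs to be within range of the viewer."""
    return math.hypot(ox - cam.x, oy - cam.y) <= cam.draw_distance


def in_view(cam, ox, oy):
    """New behaviour: the object must also fall inside the camera's field of view."""
    if not in_range(cam, ox, oy):
        return False
    bearing = math.degrees(math.atan2(oy - cam.y, ox - cam.x))
    # Smallest signed angle between the bearing to the object and the camera heading.
    diff = (bearing - cam.heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= cam.half_fov_deg


def should_send_terse(cam, ox, oy, new_code):
    return in_view(cam, ox, oy) if new_code else in_range(cam, ox, oy)


ball = (10.0, 0.0)
facing_ball = Camera(0.0, 0.0, heading_deg=0.0)
turned_away = Camera(0.0, 0.0, heading_deg=180.0)

print(should_send_terse(facing_ball, *ball, new_code=True))    # True
print(should_send_terse(turned_away, *ball, new_code=False))   # True  (old code keeps sending)
print(should_send_terse(turned_away, *ball, new_code=True))    # False (new code stops)
```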

The net result of this is to reduce the amount of data the server is sending to the viewer, thus helping improve performance. As the new code applies to both objects and avatars, this can amount to a substantial improvement, as Andrew commented during the meeting, “I think we’re currently seeing a 30% improvement (in the time spent in “Agents” in the stats) for the case of about 30 avatars running around in a region with 12k prims”.

However, there is a side issue with these changes, which Andrew and the devs are currently looking into. If you turn away from a moving object for a length of time such that the terse updates from the server are no longer sent, then suddenly pan the camera / turn so the object is once again in view, there might be a brief delay (one second, currently) before the object correctly updates. This is because the viewer is rendering the object based on its “old” data received from the server before getting the latest updates from the simulator. The time delay can potentially be reduced, but doing so can negatively impact the overall performance gains made. Because of this, Andrew is holding it at one second in order to ascertain how much of an impact the delay actually has as the code is tested.

Camera Follow

Currently, everything you see in-world is almost exclusively based on its position relative to your avatar, regardless of where you move the camera. This is why, as you cam further away from your avatar, objects may appear with less and less detail, or may not render at all (particularly smaller objects) – their respective level of detail is being calculated based on their distance from your avatar, not on their distance from your camera.

With the new Interest List code, what you see in-world is now based upon the position of your camera. The benefit of this is that as you cam around, objects should render at the correct level of detail relative to your camera (so no more sculpts which appear to be stuck at a lower LOD despite your camera hovering a few metres away from them, for example). The difference between the two approaches can be seen in the image below.

Interest List in action: In both images, I’m standing 100m from a .5-cube with a 0.001 black cube on top of it. When I cam out to the cubes using the existing Interest List code (top image), the black cube fails to render, because the level of detail sent to the viewer is based on my AVATAR’S position, despite my camera only being a metre or so away from the small cube. However, under the NEW Interest List code (bottom image), the small black cube is rendered, because the level of detail being sent to my viewer is now based on my CAMERA’S position (click to enlarge)
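The cube test can be mimicked with a few lines of code. Please treat this as a sketch only: the “render if size divided by distance is above a threshold” rule and the threshold value are my own invention to illustrate the idea, not how the simulator or viewer actually calculates level of detail. The only thing that changes between the two cases is the reference point used for the distance.

```python
import math


def is_rendered(object_size, object_pos, reference_pos, min_apparent_size=0.0005):
    """Hypothetical rule: render an object when size / distance meets a threshold."""
    distance = max(math.dist(object_pos, reference_pos), 1e-6)  # avoid dividing by zero
    return object_size / distance >= min_apparent_size


avatar_pos = (0.0, 0.0, 0.0)
camera_pos = (99.0, 0.0, 0.0)    # cammed out to about a metre from the cubes
cubes_pos = (100.0, 0.0, 0.0)

# Existing code: distance is measured from the avatar.
print(is_rendered(0.5, cubes_pos, avatar_pos))     # True  (the .5 cube shows)
print(is_rendered(0.001, cubes_pos, avatar_pos))   # False (the small cube is missing)

# New code: distance is measured from the camera.
print(is_rendered(0.001, cubes_pos, camera_pos))   # True  (the small cube renders)
```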


SL project news: week 48/2: RC issues, region performance and Aditi issues

Server Deployments Week 48

After a smooth deployment to the main channel on Tuesday 27th November, things got a little unsettled on Wednesday 28th November with the deployments to the RC channels. As noted in part 1 of this report, these were supposed to comprise a maint-server release to BlueSteel and LeTigre, with the same package and a few extras going to Magnum.

The problems started during the actual deployment on Wednesday, wherein after successfully updating Magnum and BlueSteel, the deployment team started noticing issues unrelated to the deployment which caused Coyot Linden to call off the LeTigre roll-out until things were sorted. However, as Maestro Linden takes up the story:

Then there were reports in the forums about offline IM emails from objects being broken; if an object sent you an IM and you were offline, the offline email would contain all the usual details *except* for the message. This bug affected both BlueSteel and Magnum since they both shared the responsible change. Then, after digging into offline emails a bit more, we noticed that the ‘To’ field of offline emails would show the object owner’s name instead of the recipient’s name … which was a little confusing … Anyway, these bugs were kind of bad, but we weren’t sure that they were worth the trauma and downtime of an emergency rollback …But then this morning, we became aware of a 3rd bug, from support. It turned out that deeding parcels to groups was failing.

It was this third bug which was deemed sufficiently serious to warrant a roll-back of the RC deployments, which took place on Thursday 29th November, with the result that all four channels are now running on the same release – 12.11.09.266804. Kelly Linden has fixes for all three issues, but they are currently in testing, and the plan is to try again next week with the RC deployments.

Region Performance / Memory Issues

The physics memory issues which I reported in week 47, and then provided an update on in Part 1 of this report, have received further attention from Linden Lab. The problem has been with some regions experiencing severe physics memory bloat within a short time of being restarted, with the result that they rapidly reach a threshold of memory use (~230MB for homesteads, ~920MB for full regions) which prevents rezzing of any objects, in-world or attached.

In investigating the issue, Simon Linden located a memory leak related to the Havok system which may be the cause, and is hopeful he has found a fix. Commenting on the matter at the Server Beta meeting on 29th November, he said, “I’m in the midst of hacking a special test mode for a region … I’m going to make it continuously re-bake the navmesh and terrain data. I’ll let that run overnight and see what happens … I’m not going to claim victory quite yet … this is like that point where you hit the zombie hard, but you have to see if it comes back again.”

Aditi Grid Log-in Issues

It had been hoped that the Aditi situation, wherein problems with inventory-related data are preventing people from being able to log-in to the beta grid, would be discussed at the Server Beta meeting, but this was somewhat side-lined by discussions on other projects, most notably Interest Lists (work on which prevented Andrew Linden from digging into the matter following the Simulator User Group meeting on Tuesday 27th November).

While Maestro was able to confirm that a member of Linden Lab is, “Working on a script to help the Aditi inventory situation”, little more information was provided at the meeting due to other ongoing conversations. Hopefully, this matter will be picked up next week.

PATHBUG-183

In working on Interest Lists, Andrew Linden had hoped he’d sorted a fix for PATHBUG-183, which relates to offscreen physical objects flying across your in-world view. The code for the fix has been available on Ahern, on the Aditi grid, but there are reports that while the problem may have been somewhat reduced in scale, it is still very apparent. Andrew hopes to look into this during week 49.

Region “Flicker” and  Content “Warping”

There have been some issues related to viewing neighbouring regions and crossing region boundaries. These issues take a number of forms, and are related to both server and viewer issues:

  • The entire neighbouring region “flickers” if you are near its edge
  • After crossing between regions, the content of the region you have just left seems to appear in the region you’ve just entered, usually somewhat warped / deformed
  • “Cacheable” objects (e.g. unscripted and non-physical objects) vanish from your view of a region on leaving it
  • A region appears to “reset” (re-renders) itself shortly after leaving it
Sometimes on crossing between regions, objects from the region you’ve just left seem to appear warped and deformed in the region you’ve just entered…
…only to disappear after a few seconds / as you approach them

The first two of these issues are considered to be viewer-related bugs which are thought to have been resolved in code currently in viewer development. The final item in the list is related to a server issue which Andrew Linden believes to have been fixed as a part of his work on Interest Lists. The “cacheable” objects problem was also considered to be a viewer-side bug which has been addressed in viewer development, but in considering the problem at the Server Beta meeting, Andrew Linden thought there might actually be more than one bug responsible, and indicated he would be looking into this some more.