SL projects update week 18 (2): server releases

Deployments for Week 18

The week 18 deployments make for interesting reading.

SLS Main Channel

On Tuesday April 30th, the Main channel was rolled back to release 13.04.05.273580, as a result of a widespread performance issue.

Release Candidate Channels

  • BlueSteel and LeTigre: both of these channels should remain on the Experience Keys project, but will also be reverting some changes, due to the same performance issue which is affecting the Main channel – release notes
  • Magnum: should remain on the same server maintenance project as week 17.  This project brings some new minor features to LSL, and fixes some crash modes.  This update fixes the grid performance issue, and fixes an issue in which llDialog() messages sent to the object owner could be incorrectly throttled – release notes.

The performance issue which caused the Main channel to see the re-deployment of an older release was described by Simon Linden as being related to problems with regions locating their neighbours. “The performance problem was really showing up between any one region trying to locate another on the grid … the system was actually working, but too close to the cliff for comfort.” The re-deployment means that the new LSL AO capabilities can no longer be compiled / run on any Main, BlueSteel or LeTigre regions, until the supporting code is rolled-out once more.

There is still no further detail on the Experience Keys project and whether this may / may not be more than a deployment of the Advanced Creation Tools permission system.

Interest List Update

Andrew Linden has been working on fixing a bug related to Meeroos (but which I’ve seen affecting other animals as well), which he describes as, “If you turn your camera away from a crowd of Meeroos, wait several seconds, then turn back around… the Meeroos will be updated, but not quite in the right order. So sometimes you’ll see a head move to the new position, then a fraction of a second later the rest of the body.  So I have a theoretical fix that doesn’t crash the simulator (anymore).” The fix in question has yet to be tested and QA’d, so there is no time frame for any release.

“Missing” Prims

While talking about the interest list work, Andrew answered a question on missing prims / linksets, again acknowledging it to be a viewer-side issue, before going on to say:

We think maybe it is fixed in a new viewer. But this new viewer I mention happens to be very crashy, so we haven’t opened up the source code for it yet nor have we submitted it to our QA team since they’ll just crash …  This is the viewer that goes with our new interest list changes which I mentioned a few weeks ago and people were wondering when the code would be put up on a public repo.

"Missing" prims - viewer-side fix possibly on the way?

So a viewer-side fix, along with viewer-side interest list updates, looks to be somewhere on the horizon.

Region Crossings

There have been a number of reports of region crossings worsening again after seeing a significant improvement with the release of the fix for BUG-1814. A common issue has been avatars becoming “snagged” at a region boundary while the vehicle they were travelling in continues on its way, with the vehicle sometimes being returned to their Lost and Found folder from a location two or three regions beyond where the avatar became stuck. Both prim/sculpt and mesh vehicles are affected when the problem occurs, and it is an issue which had been encountered prior to the widespread deployment of the BUG-1814 fix.

Getting “snagged” at a region crossing while my aircraft flew on was a problem I encountered several times over Blake Sea early in April. The problem has again manifested itself to many, and I’ve again encountered it while flying my mesh Spitfire Mark IX

I’d actually encountered the problem on April 4th during a series of region crossing tests, but the problem no longer appeared to be occurring by the middle of the month.

The issue of region crossings was raised at the Simulator Meeting on Tuesday April 30th, but the discussion was dominated by the problems being encountered by one particular type of train. In commenting more generally on region crossings, Andrew Linden said, “I agree that region crossing on vehicles needs more work. I can’t promise that I’ll be working on that as soon as I’m done with this interest list project, but I’ll try to bring it up in the next ‘what do we work on next’ brainstorm that we have.”

SL projects update 16 (2): Server releases; region crossings

Second Life Server (Main) Channel Week 16 Deployment

On Tuesday April 16th, the SLS Main channel received Monty Linden’s HTTP updates, which were deployed to BlueSteel and LeTigre in week 15, after having previously been on Magnum for testing.  These updates can be briefly summarised as:

  • More complete and more correct headers on texture and mesh fetches – these should ensure the viewer is better able to handle objects as they are downloaded to it
  • Keepalive connections for some HTTP-based services

For more details on the project, please refer to both the deployment release notes and to my overview of Monty’s work.

Second Life Release Candidate Week 16 Deployment

On Wednesday 17th April, all three RC channels should receive the same update package. This comprises the server-side LSL Animation Override capabilities, this time complete with a fix for BUG 2164, wherein the new capabilities could conflict with built-in animation poses in chairs, etc., as discussed in my week 15 updates. This deployment additionally includes the slight region performance improvement when there are no pathfinding characters present. Release notes are available.

Originally, a separate package had been in preparation for deployment to BlueSteel  / LeTigre, but this has had to be postponed due to “last minute scheduling issues”, according to Simon Linden when speaking at the Simulator User Group meeting on Tuesday April 16th. While attempts were apparently being made to get an alternative project into RC, it was “down to the wire to complete testing” at the time of the Simulator UG meeting, and an announcement confirming BlueSteel and LeTigre would receive the same package as Magnum was posted to the deployment thread not long after the meeting finished.

Object Return from Region Edge

A further update which should reach all three RC channels on Wednesday April 17th is the fix for https://jira.secondlife.com/browse/BUG-313 (estate tools do not return objects between 255 and 256m) / https://jira.secondlife.com/browse/BUG-2021 (Auto-return not affecting objects at 256m), which saw objects right on the region edge sometimes slipping into a “limbo” which prevented them from being returned either under Auto-return or when using estate tools.

There is some concern that the fix, once deployed, may not correct all issues. However, until it is deployed, there’s no actual way of knowing – so further updates may well be following.

Region Crossings

Since the deployment of the fix for BUG-1814, making region crossings in vehicles has been seen as noticeably better by many people. However, some have noted problems which appeared to be linked to crossings between regions running on different simulator versions, and this was discussed at length at a recent Simulator User Group meeting.

Kitto Flora suggested the problem was not so much with different simulator versions, but due to network traffic, commenting, “It’s directly related to your Net traffic rate when you cross. If its 500k – fail maybe 20% of time … If its 50k it rarely fails.”

While I have been flying extensively over the past week, particularly over Blake Sea and the south-lying regions and over parts of Nautilus, I’ve not been monitoring net traffic during my flights – although I do reduce Draw Distance when flying and tend to shunt graphics quality down to medium-low – so cannot comment on Kitto’s observations. I can however state that when I did encounter problems beyond the expected temporary loss-of-control / rubber-banding – such as my camera skewing off to one side of my aircraft as shown below – it always coincided with a move between one simulator version and another, and never between regions on the same simulator version. So I guess more tests and observations are due on my part after this week’s deployments!

Flight testing region crossings: when moving between regions running on different simulator versions, I invariably encountered greater issues (such as the camera being shunted, as shown above) than when crossing between regions on the same simulator version (note the chat console reports, lower left, and notifications, top right).

The discussion on region crossings raised additional questions. One of these was whether or not the speed one crosses between regions made any difference. Simon Linden replied:

Your speed in-world shouldn’t have any effect on actually making it or not, but faster crossings will show the errors in predicting where objects will be more. Such as the rubber band effect when crossing … your viewer sees you going a certain speed, and keeps moving you that way, while you hit the crossing, get some lag as the data is transferred to the new region, and you’re stuck into the world, then sling-shot back to the new position. 
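Simon’s description of the rubber-band effect can be put into rough numbers. The sketch below is purely illustrative and is not how the viewer or simulator actually work: it assumes the viewer simply keeps extrapolating the vehicle at constant speed while the simulator holds it at the border during the hand-off, and every name and value in it (simulate_crossing, handoff_delay, the 256m border position) is a hypothetical chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CrossingResult:
    predicted_x: float  # where the viewer has drawn the vehicle (metres)
    actual_x: float     # where the vehicle really is once the hand-off ends
    snap_back: float    # size of the visible correction (metres)


def simulate_crossing(speed: float, handoff_delay: float,
                      start_x: float = 250.0,
                      border_x: float = 256.0) -> CrossingResult:
    """Toy model: the viewer extrapolates at constant speed, while the
    vehicle is actually held at the border for handoff_delay seconds."""
    time_to_border = (border_x - start_x) / speed
    total_time = time_to_border + handoff_delay
    predicted_x = start_x + speed * total_time  # viewer's extrapolation
    actual_x = border_x                         # vehicle was waiting here
    return CrossingResult(predicted_x, actual_x, predicted_x - actual_x)


if __name__ == "__main__":
    for speed in (5.0, 20.0):
        result = simulate_crossing(speed, handoff_delay=0.5)
        print(f"{speed} m/s: snap-back of about {result.snap_back:.1f} m")
```

In this toy model the hand-off delay is the same at any speed, so the chance of “making it” does not change; only the size of the visible correction grows with speed, which is the point Simon makes.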

Questions / comments were also raised around the subject of region crossings and idle regions: specifically whether crossing into an idle region was subject to additional delay as the region “woke up” and that some have experienced issues with regions which are apparently idling being unresponsive to new child avies, and people “bounce” off the border prior to being able to cross. Responding to both the question and the comments, Simon said:

You actually shouldn’t ever be able to do that. It won’t be idling if you can see into it … Also, remember idle regions are not dead, they [are] just are running at a slower frame rate, just like loaded down regions do.

Missing Prims

There are currently no updates on the “missing prims” situation which has been previously reported in this blog, and which has grown markedly more apparent since the last set of interest list updates.

Andrew Linden was not at the Simulator User Group meeting on Tuesday April 16th to discuss either, but is almost certain to be asked at the Beta Server meeting on Thursday 18th April, if he attends.


SL project news 8 (2): servers and issues

Update February 24th: Metabolt 0.9.64.0 (Beta) was released on February 23rd to address the issue of nearby objects not being recorded following the server-side interest list updates.

Update February 22nd: Radegast 2.8 has been released, which both provides support for server-side baking and resolves the interest list related issue of failing to correctly report nearby objects noted in the first part of this report.

Week 8 Main (Second Life Server / SLS) Channel Deployment

The expected Main channel (SLS) deployment took place on Wednesday 20th February, as anticipated. The SLS channel deployment came a day late as a result of the Presidents Day holiday in the US – Main channel deployments will be back to their normal Tuesday slot from week 9 (week commencing Monday 25th February).

Issues

  • There is a report that both the Radegast and Metabolt text clients are either incorrectly seeing local objects, or failing to see them at all, possibly as a result of the interest list code deployment.
  • There is a bug in the  interest list code which means that child prims within a linkset do not always render from a distance – you need to cam in towards them for them to appear, and they can then vanish again on camming away. A fix for this is expected in the “next round” of interest list updates.

Week 8 Release Channel Deployments

There have been significant changes to the RC channel deployments occurring on Thursday 21st February (again a day late due to Presidents Day).

Due to a last-minute bug being found in Baker Linden’s large object rezzing code updates, the package intended for both BlueSteel and LeTigre has been pulled. Instead, these channels will receive the code targeted for Magnum, together with the updates made to the Main channel.

Diagonal Region Rendering Issues

I’ve reported on this on a number of occasions recently, and JIRA SVC-8130 is still open on the matter. The incidence of occurrences seems to be on the increase, with more people having anecdotal evidence of problems. Following his comments in the second part of my week 7 report, Simon believes a fix is in the works, reporting at the Simulator User Group meeting on Tuesday 19th February that it is, “With QA now.”

Missing regions: bug fix on the way …

Region Crossing Issues

Numerous problems are being reported with region crossings, some of which appear to be on the increase of late.

Content “Warping”

I first reported on this in week 48 of 2012, as one of a number of region crossing issues being reported. When crossing between regions, the contents of the region you’ve just left suddenly “warp” into the region you’ve just entered, appearing for a time and perhaps somewhat deformed.

When the matter was first raised, Andrew Linden thought it might be more a server-side issue than a viewer-side problem – although all the issues raised at that time seemed to be split between possible issues within the server code and the viewer code.

Sometimes on crossing between regions, objects from the region you’ve just left seem to “warp” into the region you’ve just entered, and may appear deformed…

In looking at the problem as a server-side issue, Andrew had hoped that the matter would be resolved via the interest list code roll-out. However, it has persisted despite the interest list code having been deployed to an RC channel, and is now largely regarded as a viewer-side issue. The status of any fix for the problem is unclear.

…only to disappear after a few seconds / as you approach them

Camera Position Lost on Region Crossings

There has been a return / increase in issues of the camera position getting “messed up” when crossing between some regions in a vehicle. This can take a variety of forms, most notably with the camera suddenly shifting to show a side view of the vehicle while driving, rather than the usual forward-looking view. SVC-7684 provides the details, complete with a video demonstrating the issue. It appears that fixes for this problem have thus far been on a per-region basis, as there is no clear indicator as to the underlying issue – although it is recognised enough such that Motor Loon now incorporates a scripted workaround for the problem in lieu of a fix, which forces the correct camera positioning to be re-applied on a region crossing. The Lab have been trying to investigate the issue, but are having a hard time reproducing it in a consistent manner. Those routinely experiencing the problem on given region crossings should consider filing a JIRA specifying the issue and region crossings where it occurs and reference SVC-7684.

Increase in Crossing Disconnects and Other Issues

The Simulator User Group meeting on Tuesday 19th February saw an increase in reports of other region crossing issues, ranging from a disconnect on attempting a region crossing through to an increase in teleport failures when moving between regions, and getting stuck in the corner of a region and spinning during a region crossing. Again, it is unclear what may be causing these issues, some of which have been reported as increasing somewhat sharply over the last 2-4 weeks.

Sudden Region Lag

It has been a while since I last updated on this issue, as reported in both a forum thread by Toysoldier Thor and others and in a JIRA BUG (BUG-355). In short, the problem remains, with live event venues reporting issues, and the Lab uncertain as to the cause. When the issue was raised at the Simulator User Group meeting, Simon Linden replied, “Hmm, that sounds like it needs a fresh look. It seems like there’s a spike in networking traffic, performance goes bad and thus all sorts of things don’t work well.”

Interestingly, the four regions of the recent OBR in SL event did not suffer any such issues, despite a constant and high volume of traffic, although these had apparently received additional attention from the Lab to help ensure problems might be minimised.

Server-side Baking Load Test

A final reminder that there will be a Server-side Baking load test on Aditi on Thursday February 21st, following the Server Beta meeting. For details, please see my original announcement.

SL project news week 6 (2): server deployment updates

Server Deployments – Week 6

The planned server deployments for week 6 occurred as anticipated:

  • On Tuesday 5th February, the Main channel received the server maintenance project deployed to LeTigre in week 5. This has miscellaneous minor bug fixes and new features – release notes
  • On Wednesday 6th February, the RC channels received the following:
    • BlueSteel: code for materials processing (project viewer still pending) – release notes
    • LeTigre: a new maint-server project to fix miscellaneous crash modes, and with minor performance improvements – release notes
    • Magnum: interest list code update to specifically address the bot / bandwidth problem reported on in last week’s update and also support for materials processing – release notes

Server Deployments – week 7

There is no advance news on potential deployments for the week commencing Monday 11th February, 2013.

SL Viewer Updates

The beta viewer was updated on February 6th with the release of 3.4.5.270034. Please refer to the release notes for details of all changes and updates. The CHUI project development viewer also updated to 3.4.6.270114 on February 6th.

Updates – Issues and Other Bits

Bot / Bandwidth Issues

Speaking at the Server Beta user group meeting on Thursday February 7th, Maestro Linden indicated the ongoing bot / bandwidth issue related to the interest list code, as pointed to by Latif Khalifa and confirmed by Andrew Linden (reported in more detail here), appears to have been resolved. Commenting on the bug fix in the server deployment thread, Triple Peccable, who was one of those being badly impacted by the problem, said:

Maestro and Andrew,

I wanted to report on the bot’s usage. Fixed!

Before this incident the bot’s “normal” usage was 5 MB / hr. That is so normal no one would suspect anything.

But now it is 1 MB / hr! It has never been that low before, ever.

The improvement might be from the interest list changes, but since the bot is parked 3300m up with a very limited draw distance, I think it is from this UDP bug fix, and will help with more than just bots.

Estate Ban Issues

Two issues have been reported in relation to estate bans recently.

One is the use of LSL commands for estate moderation, as mentioned in the second part of my report for week 5. While it is not clear how widespread the issue is (the reports received so far appear to relate to four regions), it had been hoped that the code deployment to LeTigre might have fixed the problem, but tests with an affected region moved to LeTigre showed this was not the case. However, Maestro Linden believes LL may have a match between the issue and a bug that was filed internally after crash report fingerprints were browsed, so investigations are liable to continue.

In the second, Whirly Fizzle has reported an issue with the “GTFO” ban feature in Phoenix. While this adds the banned individual’s name to the banlist for an estate, the individual isn’t actually barred from accessing the estate. As such, it is thought that this issue might contribute to recent problems in people apparently circumventing estate bans, and is something which will not be rectified by the estate ban improvements currently being deployed by LL, as it is an issue within the Phoenix viewer code itself.

Region Crashes on Restarts

In addition to the restart performance issues related to physics memory use previously reported and updated in part 1 of this report, some regions are experiencing issues with the physics engine during a restart, with all scripting capabilities being disabled as the physics engine is overloaded. Scripting must then be re-enabled by the region owner / estate managers. A fix for this is being worked on, and should be available soon.

Vanishing Regions

Following the week 5 deployments, Alvid Majestic contacted me concerning issues with regions diagonally opposite Brocade, on the Mainland, failing to render in the viewer’s world view. These regions would not render until such time as a person moved into one of the regions immediately adjacent to them, or moved into them.

Missing regions: Mullein and (beyond it) Ear fail to render from Brocade, which sits diagonally opposite them

This is not a new issue, having previously been reported in SVC-8130, although there was some confusion as to whether or not it had been resolved. Commenting on it in general at the Server Beta User Group meeting, Maestro Linden informed me, “It’s somewhat rare, but it was never officially fixed.” As the JIRA is closed to comment, Shug Maitland has raised a forum thread on the matter, so if you are witnessing the same issue on an ongoing basis, consider adding your comments there as well as raising reports.

Region Crossings

There has been mixed feedback to the results of the deployment of the new region crossing code across Agni.

Regular commentator on this blog, Wolf Baginski Bearsfoot has put together a report on his findings in the SL Server sub-forum, which builds on his initial impressions posted in this blog.

Some feedback given through the User Groups suggests that in some instances region crossings – such as with sailing – are improved, and at the Simulator User Group meeting on Tuesday 5th February, Simon Linden indicated LL were seeing fewer instances of stuck teleports. However, there have also been reports passed through the Server Beta group of automated cars on the Mainland encountering problems at region crossings while following Linden Roads and piling-up at the boundaries of regions such as Furness to Ravenglass, although instances appear to have calmed down. More updates on this as they come.

SL project news: week 2 (3): server, mesh and materials

Server Deployments – week 2

As noted in the update to part 2 of this week’s report, the planned deployment of two new releases to the RC channels didn’t go as anticipated. Originally, it had been intended that BlueSteel and LeTigre would receive the new threaded region crossing code while Magnum would receive Andrew Linden’s interest list code improvements.

Interest List Deployment Cancellation

The interest list deployment was cancelled after the 11th hour discovery of some bugs with the code. Speaking at the Server Beta meeting on Thursday 10th January, Maestro Linden described the main issues as – ironically – being connected with region crossings, and with object updates.

In the first, anyone crossing between regions several times in a vehicle would experience all of their non-rigged attachments disappearing from their world view, with the viewer itself eventually physically detaching them. Not only did this cause confusion as to what was happening with attachments for those experiencing the issue, it also resulted in some avatars ending up naked following a relog.

Interest list code: bugs led to deployment cancellation on Wednesday 9th January

The second problem was slightly more complicated, if potentially more rare. In it, if User A had an object on the ground and User B looked at it, then turned their camera away such that the object was no longer on their screen, User A could then wear the object as an attachment and teleport away; however, when User B subsequently turned their camera back to where the object had been, they would still see it on the ground despite the fact it had been taken away. What happened if User A (now wearing the object) teleported back to User B wasn’t actually tested.

As a result of both of these issues and the cancellation of the interest list deployment, Magnum received the same region crossing package as intended for BlueSteel and LeTigre.

Region Crossing Code Issues

However, the bad news did not end there, as Maestro Linden explained, “After a few hours, we saw that the [region/sim] crash rate was way too high.” As a result, the threaded region crossing code was disabled via a configuration change to the servers without the need to roll back the release. Once this had happened, region crash rates returned to “normal” levels. In all, the new region crossing code was active for around five hours before being disabled once more.
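Disabling a feature through configuration rather than a rollback is a common pattern, and a minimal sketch of the general idea looks something like the code below. This is not the Lab’s code: the SimulatorConfig class, the flag name and the two crossing paths are all hypothetical, used only to show how a runtime switch lets new code be turned off while the release itself stays deployed.

```python
class SimulatorConfig:
    """Hypothetical runtime configuration holding feature switches."""

    def __init__(self) -> None:
        self._flags = {"threaded_region_crossing": True}

    def set_flag(self, name: str, value: bool) -> None:
        self._flags[name] = value

    def enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def handle_crossing(config: SimulatorConfig, avatar: str) -> str:
    # The flag is checked on every crossing, so flipping it takes effect
    # without redeploying or rolling back the simulator release.
    if config.enabled("threaded_region_crossing"):
        return f"{avatar}: threaded hand-off"
    return f"{avatar}: legacy hand-off"


if __name__ == "__main__":
    config = SimulatorConfig()
    print(handle_crossing(config, "Avatar A"))   # new code path
    config.set_flag("threaded_region_crossing", False)
    print(handle_crossing(config, "Avatar A"))   # old code path, same release
```

The trade-off is that both code paths have to ship in the release, but it means a problem found after deployment can be contained quickly, as happened here.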

Analysis of the crash rate revealed it to be linked to avatars crossing to / from heavily scripted regions. While the new code was extensively tested on Aditi, the regions there were not excessively loaded with scripts during testing, and so the problem did not manifest. However, subsequent testing with the test regions running heavy script loads did result in them also crashing, confirming the problem.

At the time of writing, Kelly Linden believes he has a fix for the issue; if so, it should hopefully find its way to the RC channels in week 3, commencing Monday 14th January.


SL project news: week 1, 2013: forthcoming RC releases, viewer, and new work

RC Deployments for Week 2

The Lab is still getting back up to speed following the Christmas / New Year break, so expect further information to be forthcoming on Main and RC releases for week 2, 2013 via the Server topic of the Technology Forum.

However, as it stands, there are two projects which it is hoped will reach RC channel release in the week commencing Monday 7th January, 2013. These are Caleb Linden’s threaded region crossing code and Andrew Linden’s interest list code.

As I’ve previously reported, the threaded region crossing code was subjected to a pile-on test on Aditi towards the end of 2012. The results were, on the whole, a little disappointing for those taking part – although expectations may have been set a little high. While there were some improvements noted – particularly when travelling between regions on foot and with a heavy script load – overall, there were still issues with crossing between regions in vehicles (particularly ground vehicles).

Airborne antics: vehicles still exhibited region crossing issues during the threaded region pile-on test in December 2012

Issues arising from the pile-on test are still being looked at, and Caleb repeated his request that anyone noting specific issues should raise a JIRA directed for his attention. For those wishing to try out the code, the GC Test regions are still available on Aditi.

The Interest List code is still subject to receiving an OK from the Lab’s QA team. There will doubtless be an update on this – and on the planned RC releases in general – at the Simulator User Group meeting on Tuesday 8th January, 2013.

SL Viewer News

Not a lot to report on here at present. The Beta viewer reached the 3.4.4 code base just before Christmas 2012 (3.4.4.268497, December 20, 2012). There may be a rendering issue which may require addressing and might lead to a slight delay in releases; apparently, not all tests are giving the same results, so LL are still investigating the matter. Work is continuing to update the GPU tables for the viewer; further cards have been added to the table, and several blanket entries have been removed (such as all unrecognised nVidia cards being detected as nVidia Ion GPUs).

As reported over Christmas, CHUI rolled through a number of rapid releases in its development version, with the main project version rolling to 3.4.3.268587 on December 22nd. Both the development and project versions of the viewer are on the 3.4.3 codebase, and the most recent development release was made on January 4th (3.4.3.268703). Both versions are available from the Alternative Viewers download page.

While the core of the Mac version of the viewer is built using OSX 10.7 (with Xcode 4.3.3), work is progressing in moving the viewer to OSX 10.8 Mountain Lion, which is expected to happen “very soon” according to Oz Linden, although no date is available as to when.

New Pathfinding Capability

VoidPointer Linden is working on a new flag for pathfinding characters. STAY_WITHIN_PARCEL is designed so that when set, pathfinding characters will only set goal points during wander, evade, pursuit, etc., that are within the parcel they get created in. If the parcel is a non-regular shape, it is still possible a character will cross between it and neighbouring parcels (unless the navmesh is cut through the use of an exclusion volume), but goal points will only be set within the originating parcel. The code is still in development, and so the constraints on where a character can wander when it comes to irregular parcel shapes may yet change, but VoidPointer is not making any promises on this.

He’s completely batty! – Voidpointer Linden’s avatar at the Server Beta UG meeting, Jan 3rd, 2013

There is no stated delivery time for this new feature, other than it is currently being worked on.
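To illustrate the STAY_WITHIN_PARCEL idea of constraining goal points rather than the path itself, here is a rough sketch. It is not VoidPointer’s implementation: the parcel is modelled as a set of grid cells (the 4m cell size is an assumption made for the example), and in_parcel, pick_goal and their parameters are hypothetical names.

```python
import random

CELL = 4.0  # assumed size of one parcel grid cell, in metres


def in_parcel(point, parcel_cells):
    """True if the (x, y) point falls inside one of the parcel's cells."""
    x, y = point
    return (int(x // CELL), int(y // CELL)) in parcel_cells


def pick_goal(origin, radius, parcel_cells, stay_within_parcel, rng,
              max_tries=100):
    """Pick a random wander goal near origin.

    With stay_within_parcel set, candidates outside the parcel are
    rejected; only the goal is constrained, not the route taken to it.
    """
    for _ in range(max_tries):
        candidate = (origin[0] + rng.uniform(-radius, radius),
                     origin[1] + rng.uniform(-radius, radius))
        if not stay_within_parcel or in_parcel(candidate, parcel_cells):
            return candidate
    return origin  # no valid goal found: stay put


if __name__ == "__main__":
    # An L-shaped parcel made of three cells.
    parcel = {(0, 0), (1, 0), (0, 1)}
    rng = random.Random(1)
    goal = pick_goal((2.0, 2.0), 6.0, parcel, True, rng)
    print(goal, in_parcel(goal, parcel))
```

With an L-shaped parcel like the one in the example, a straight line between two valid goal points can still cut across the missing corner, which matches the caveat above about irregular parcels and exclusion volumes.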

Server Object Rezzing Code

Baker Linden has been looking to improve how objects with large file sizes are handled by the simulator software when being rezzed. He describes the work thus, “What I’ve been working on is hopefully significantly decreasing lag spikes when rezzing large, complex objects. Large does not necessarily imply size, but size of the files being read. When an object is rezzing, we have to parse the object / mesh files and create our in-world objects with that data.”

Until now, reading and parsing of any files related to objects which require rezzing has been on the main thread. When several such objects require rezzing at the same time, the simulator stalls. Baker has been moving the reading / parsing operation to a background thread in the expectation that rezzing multiple “large” (again, in terms of file size, not the size of the object itself) objects will not choke the simulator.

The key point about this work is that it is specifically aimed at preventing the simulator processes from choking and a region stalling when there are a number of large object files being read / parsed, not at actually “speeding up” the physical rezzing process. As such, it is unlikely that objects will appear any faster in people’s in-world view as a result of this work. However, what it does mean is that the simulator code will be better able to handle rezzing multiple “large file” objects without the attendant region lagging which can occur as a result of the simulator being unable to process messages from viewers and other simulators, etc.
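The general technique Baker describes, moving slow file reading / parsing off the main thread, can be sketched as below. This is a toy model rather than the simulator code: the sleep calls stand in for parsing and for per-frame work, and parse_object_file, rez_worker and run_simulator are hypothetical names.

```python
import queue
import threading
import time


def parse_object_file(name, size_kb):
    """Stand-in for reading and parsing a large object / mesh file."""
    time.sleep(size_kb / 10000.0)  # bigger files take longer to parse
    return f"{name} parsed"


def rez_worker(pending, ready):
    """Background thread: parse queued files, hand results back."""
    while True:
        item = pending.get()
        if item is None:  # sentinel: no more work
            break
        ready.put(parse_object_file(*item))


def run_simulator(objects):
    """Main loop keeps running frames while parsing happens elsewhere."""
    pending, ready = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=rez_worker, args=(pending, ready),
                              daemon=True)
    worker.start()
    for obj in objects:
        pending.put(obj)
    pending.put(None)

    rezzed, frames = [], 0
    while len(rezzed) < len(objects):
        frames += 1
        time.sleep(0.001)  # stand-in for normal per-frame work
        while True:        # collect finished parses without blocking
            try:
                rezzed.append(ready.get_nowait())
            except queue.Empty:
                break
    worker.join()
    return frames, rezzed


if __name__ == "__main__":
    frames, rezzed = run_simulator([("house", 200), ("tree", 100)])
    print(f"{len(rezzed)} objects rezzed while {frames} frames ran")
```

As in the description above, the objects do not appear any sooner; the gain is that the main loop carries on running frames while the parsing takes place.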

Information Sources

  • Opensource Developer meeting, Wednesday 2nd January, 2013
  • Beta Server meeting, Thursday 3rd January, 2013.
