SL projects updates week 34/2: server, texture / mesh CDN, group chat

Santaurio, Cala del Barronal (Flickr) – blog post

Server Deployments week 34 – Recap

There was no deployment on the Main (SLS) channel on Tuesday August 19th. All three RC channels received the same server maintenance project on Wednesday August 20th, aimed at fixing a crash mode.

There may be news on the crash mode fixed in the RCs once it has deployed to the Main (SLS) channel in week 35.

Upcoming Server Deployments

There will be a new server maintenance package deployed to the RCs in week 35, which includes a couple of visible changes:

  • SVC-2262 – “Incorrect height value in postcard which sent from above 256m” (a postcard being a snapshot sent to e-mail)
  • A “re-fix” for BUG-6466 – “Numbers expressed in scientific notation and include a plus sign in the exponent are not parsed as JSON numbers by LSL”. This was thought to have been fixed a while ago, but the fix in fact resulted in BUG-6657 – “Valid JSON numbers like 0e0 no longer valid after 14.06.26.291532”, prompting the original fix to be rolled back.
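
As a rough illustration of what the two bug reports are about: the JSON number grammar (RFC 8259) permits an optional plus or minus sign in the exponent, and also permits a zero mantissa with an exponent, such as 0e0. The Python sketch below checks strings against that grammar. It is not the LSL parser, and the helper name is_json_number is made up for this example.

```python
import re

# Number grammar from RFC 8259: optional minus, integer part without
# leading zeros, optional fraction, optional exponent with optional sign.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if text is a valid JSON number literal (hypothetical helper)."""
    return JSON_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    # "1e+5" is the BUG-6466 style of value, "0e0" the BUG-6657 one.
    for sample in ["1e+5", "0e0", "-2.5E-3", "+1", "01", "1."]:
        print(sample, is_json_number(sample))
```

A parser that satisfies both reports needs to accept the first three samples and reject the last three.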

SL Viewer News

TPV developer NiranV Dean has been working on a new unified snapshot floater which encompasses the “standard” floater, plus the Flickr, Twitter and Facebook upload options, under STORM-2040. Those who use the most recent releases of Niran’s Black Dragon viewer will be familiar with the approach, as he initially worked on the idea in that viewer.

NiranV Dean has been working on a more unified snapshot floater, shown here in the viewer window, which is currently undergoing testing. It will hopefully appear soon

The work is now progressing to a point where we should be seeing the fruits of his labour in the near future. In the meantime, I’ve previewed the work as it stands at the moment.

Texture and Mesh Fetching CDN

As reported in my last TPV developer meeting update, the Lab is looking to move texture and mesh fetching to a content delivery network (CDN). If successful, this approach will see texture and mesh fetching bypass the simulator entirely, being routed instead directly between the viewer and asset servers via the CDN, which should see improvements in the speed and reliability of such transfers. Explaining the new approach further at the Server Beta meeting on Thursday August 21st, Maestro Linden said:

The viewer currently fetches all meshes and textures through the sim, which gets them from the asset server, and the sim gives the viewer a ‘capability’ URL, which the viewer uses for fetching. However, with this change to use the CDN, the sim instead gives the viewer the URL of the CDN, and the CDN has hosts all over the world. 

It’s good for two reasons: 1) the sim isn’t burdened with texture/mesh transfers; 2) you’ll often have lower latency to the CDN than the sim, which means more speed.

Allowing for the fact the Lab is accessing the service through their own network (although the CDN is a commercial service), both Maestro and April Linden report it as being a noticeable improvement on things, with texture and mesh fetching having double the performance compared to the current means of fetching via the simulator. Initial results of testing from Europe show similar improvements.

The Server Beta meeting agenda has further information on the new method for mesh and texture fetching, including details on the Aditi stress test regions (one for textures, one for meshes), and those wishing to try them out are invited to do so. No special viewer is required in order to carry out testing at present, and the agenda includes notes on what to do. Note that the test regions are set to no-build so that people don’t rez extra things that would skew results; they are also likely to be limited in terms of the maximum number of avatars able to access them at one time.

If / when the new approach is more broadly rolled-out, people will be able to see which service (CDN or via the simulator) they are using by using Develop > Consoles > Capabilities Info To Debug Console. Those using the CDN will see that the GetMesh, GetMesh2, and GetTexture URLs are all http://asset-cdn.aditi.lindenlab.com/, while those using the current method will see a URL pointing at the sim host. However, we’re still some way from seeing the new service deployed further than the Aditi test regions.
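
For those who prefer to check this mechanically, here is a minimal Python sketch. It assumes the capability names and URLs have been copied out of the debug console into a simple name-to-URL mapping. The function name fetch_route and that mapping shape are assumptions made for illustration; only the CDN host name comes from the Aditi test URL above.

```python
from urllib.parse import urlparse

# Host name taken from the Aditi test URL quoted above.
CDN_HOST = "asset-cdn.aditi.lindenlab.com"
FETCH_CAPS = ("GetMesh", "GetMesh2", "GetTexture")


def fetch_route(capabilities: dict) -> dict:
    """Map each fetch capability to 'cdn', 'simulator' or 'missing'."""
    routes = {}
    for name in FETCH_CAPS:
        url = capabilities.get(name)
        if url is None:
            routes[name] = "missing"
        elif urlparse(url).hostname == CDN_HOST:
            routes[name] = "cdn"
        else:
            # Anything not served by the CDN host is treated as a sim URL.
            routes[name] = "simulator"
    return routes


if __name__ == "__main__":
    # The simulator URL below is a made-up placeholder, not a real host.
    caps = {
        "GetMesh": "http://asset-cdn.aditi.lindenlab.com/",
        "GetMesh2": "http://asset-cdn.aditi.lindenlab.com/",
        "GetTexture": "http://sim-host.example.invalid:12046/cap/0000",
    }
    print(fetch_route(caps))
```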

Should this work prove successful, and once it and other HTTP work such as pipelining, as being developed by Monty Linden, is completed, the Lab hopes that they’ll have a fast, robust series of HTTP services such that they can look to retire UDP texture fetching – although this will be some way down the road, and in the interim, UDP will offer people something of a fallback for texture fetching should they have issues with HTTP as the various new services are deployed.

Skill Gaming Regions

Simon Linden reported that the first of the Skill Gaming regions has arrived on Agni (the main grid). Called Crunchy, it doesn’t have any gaming parlours or anything on it, but appears to be set-up for testing (such as accessibility). There are a few things going on there, most under the control of Gecko Linden. Also, and as pointed-out by Simon, the first skill games and operators have started to appear on the Lab’s Skill Gaming Participants wiki page.

Group Chat

Work has resumed on group chat after a brief pause, and the Server Beta meeting saw a very brief test take place. The aim of this was to test delays that have been introduced into the members list updates sent by the chat server.

As I’ve previously reported, one of the biggest causes of delays in group chat sessions is the number of updates the chat server has to send as people join / leave sessions and log-in / out of SL, changing their online status within the groups they’re a member of. Recent changes to the code are intended to queue these updates and reduce the load they are placing on the servers, which interrupts the flow of text messages.

The test was brief, but appeared to give Simon Linden enough information to be able to go back and poke at things some more.

Group Chat Server Issues

There have been further reports of group chat servers at times becoming non-responsive. This issue was initially raised in week 33, after the server supporting all group chats with a key starting with “b” became non-responsive. A further issue was identified at the start of week 34 affecting the server supporting all group chats with a key starting with “d”. While the Lab is aware of ongoing problems, there is also a request for JIRAs to be submitted on specific issues.

SL projects update week 31/2: viewer, group chat

Matoluta Sanctuary, Sartre, July 2013 (Flickr)

Server Deployments Week 31 – Recap

  • On Tuesday July 29th, the Main channel was updated with remaining recent feature changes and bug fixes previously deployed to the RC channels – release notes
  • There were no RC deployments.

SL Viewer

The Zipper viewer, offering a faster install, reappeared on Wednesday July 30th, after vanishing from the Alternate Viewers wiki page in May. There was apparently an issue with the XUI Preview Tool being broken, which has now been resolved.

The new version of the viewer – 3.7.13.292263 – appeared as a release candidate in the release viewer channel, rather than as a project viewer. There it resides with the group ban viewer and the library refresh viewer, both of which were updated in week 30 and are likely the strongest candidates for promotion as the next de facto release viewer.

Group Chat

Testing of on-going group chat updates took place during the Server Beta meeting on Thursday July 31st.

Simon Linden is once more digging into the group chat code

As noted in a previous report, one of the major causes of issues with group chat lies not with the actual messages being sent back and forth, but rather with how the chat server tracks who is online or not. The server maintains a list of who is online and in the group chat at a given moment, and is constantly updating the list as people join / leave the session; these updates are then sent to everyone else still active in the group, which interferes with the sending / receiving of actual messages.

“Imagine a popular group with, say 120 people online,” Simon said during the meeting. “Let’s guess the average online time is an hour … and that number varies widely, as there are a LOT of people who are connected for only a minute or two, maybe just checking IMs, see who’s online, or trying to fix something. But with 120 people … that’s very roughly an update every 30 seconds [14,400 updates an hour], sent to the whole group.”
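
Simon’s rough figures can be reproduced with a few lines of Python. This is only a back-of-envelope model: it assumes a steady state in which each online member is replaced once per average session length, counts one update per arrival, and sends every update to everyone in the session. The function name and return keys are invented for the example.

```python
def member_list_traffic(online_members: int, avg_session_hours: float) -> dict:
    """Estimate member-list update traffic for one group chat session.

    Assumes each online member is replaced once per average session length
    and counts one update per arrival (departures are ignored, in line with
    the rough figure quoted above).
    """
    updates_per_hour = online_members / avg_session_hours
    seconds_between_updates = 3600 / updates_per_hour
    # Every update is sent to every member currently in the session.
    messages_per_hour = updates_per_hour * online_members
    return {
        "updates_per_hour": updates_per_hour,
        "seconds_between_updates": seconds_between_updates,
        "messages_per_hour": messages_per_hour,
    }


if __name__ == "__main__":
    for members in (120, 240):
        print(members, member_list_traffic(members, 1.0))
```

With 120 people and a one-hour average, this gives an update every 30 seconds and 14,400 messages an hour. Under the same assumptions, doubling the group to 240 people quadruples the messages to 57,600 an hour.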

Not only does this impact the sending / receiving of chat messages within the group, it can also impact other group chat sessions which are running on the same back-end server, as they are being starved of resources.

The code being tested on July 31st had been set to delay the sending of these updates from the server in order to see if it improves the throughput of actual messages. The downside of this is that the member list updates are somewhat delayed; however, this would seem to be a small price to pay in exchange for an increase in the reliability of messages actually getting through the system. As it is, the delay is configurable, so Simon was gathering data to see how the updated code works in terms of people joining / leaving chat sessions and sending messages. The results are liable to be known next week.
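
The general idea of delaying member list updates can be sketched as below. This is purely illustrative and not the Lab’s code: the class name, the use of a single delay window, and the choice to keep only the latest event per member are all assumptions. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class MemberListBatcher:
    """Queue join / leave events and release them as one combined update."""

    def __init__(self, delay_seconds: float):
        self.delay_seconds = delay_seconds  # the configurable delay
        self._pending = {}                  # member -> latest event
        self._first_queued_at = None        # time the current batch started

    def record(self, member: str, event: str, now: float) -> None:
        if event not in ("join", "leave"):
            raise ValueError("event must be 'join' or 'leave'")
        if self._first_queued_at is None:
            self._first_queued_at = now
        # A later event for the same member replaces the earlier one, so a
        # quick join-then-leave costs one entry rather than two updates.
        self._pending[member] = event

    def flush_if_due(self, now: float):
        """Return the combined update once the delay has elapsed, else None."""
        if self._first_queued_at is None:
            return None
        if now - self._first_queued_at < self.delay_seconds:
            return None
        update = dict(self._pending)
        self._pending.clear()
        self._first_queued_at = None
        return update


if __name__ == "__main__":
    batcher = MemberListBatcher(delay_seconds=5.0)
    batcher.record("resident_a", "join", now=0.0)
    batcher.record("resident_b", "join", now=1.0)
    batcher.record("resident_a", "leave", now=2.0)
    print(batcher.flush_if_due(now=4.0))  # still inside the delay window
    print(batcher.flush_if_due(now=5.0))  # one combined update
```

Three events end up as a single update to the group, at the cost of the member list being up to five seconds out of date in this example.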

One possible future option for group chat is for people to be offered the ability to opt-in or out of receiving group chats until such time as they join a group chat (some TPVs already have an option to disable group chats until such time as you opt to join them). While this may help with the “I’m here!” messages sent to all groups on log-in, and which exacerbate the problem somewhat (again as described in the update linked-to above), such an approach is not seen as optimal, as it is possible users won’t change their behaviour, but will simply opt-in to all group chat sessions anyway.

Simon has also been tracking down an odd bug with joining a group and being able to open a chat session with it. “It’s really an odd one where opening the group is very slow or times out,” he said at the July 31st meeting, “and then can be immediate the next time you try. From what I can tell the chat server isn’t getting the messages … so somewhere between the viewer, simulator and chat server it gets lost.”

Other Items

BUG-6736 is a feature request for the updating or removal of the current limit on the distance at which objects can be linked (see linkability rules). The advantage in increasing the limit is that it could allow for bigger builds (in terms of footprint) without having to rely on scripted rezzing systems. A problem here is that, if the limit is increased, there is a risk that the ability to link objects over greater distances might cause issues where said distances are close to or exceed the draw distance / interest list distance.

“You will get some really funky update issues if the link size is larger,” Simon said at the Server Beta meeting. “As soon as it gets close to your draw distance, things go bad, as in … you stumble against something you can see.”

Commenting on this, Lucia Nightfire added, “I noticed a selection bug/gripe with multi selection and one prim being out of interest range on rez, you deselect everything by clicking on something else then if you pull your cam back you magically select stuff that was out of your interest range.”

Responding to this, Simon said, “Yeah, that’s the kind of thing that can get confusing, you won’t see what you expect because the root might be farther away than your draw distance … That said, I understand the builder desire to make larger parts, but those limits are there because it can conflict with the interest list logic about your updates.” As such, it would seem unlikely that there will be much in the way of change to the linkability limit.

SL projects update 30/2: server, viewer, group chat

The Bayou, April 2014 (Flickr)

Server Deployments – Week 30 Recap

  • On Tuesday July 22nd, the Main channel was updated with the infrastructure project deployed to the Magnum RC in week 29, which adds support for the upcoming changes to the Skill Gaming policy and includes the updates previously on LeTigre and BlueSteel – release notes
  • On Thursday July 24th, all three RC channels were updated with the infrastructure support for the upcoming changes to the Skill Gaming policy, and the updates previously deployed to LeTigre and BlueSteel – release notes.

SL Viewer

The Library Refresh viewer was updated to release candidate status with the release of version 3.7.13.292194 on July 23rd. This viewer contains an update to a large set of the libraries used by the viewer to provide security, stability and consistency improvements to this and future viewers.

Group Chat

Simon Linden: continuing to work on group chat issues

The anticipated group chat test didn’t materialise at the Server Beta meeting on Thursday July 24th as a result of Simon Linden coming across a last-minute issue which needed to be resolved ahead of further tests. He and Oz did, however, explain some recent discoveries within the chat system.

“While the earlier update to group chat didn’t give us any significant performance boost, we got a lot more information out of the servers,” Simon said. “And what we found was a big part of the group chat system load is not the chat messages you care about, but the updates to who is in the session or not.”

Oz added, “Those updates happen whether or not you’re displaying who’s in the session, in every group you’re in.”

Simon continued, “You can actually see this in the viewer if you add a line of code to log something whenever an update comes in to tell you who’s in the group chat … you’ll be surprised how many you get. The load goes up as the group size goes up … with a larger group, people are joining and leaving more often, and there are more people to update.”

People joining / leaving a session are recorded by the chat server. “It has a list of who’s online and in the group chat at that moment,” Simon explained, “it’s adding and removing from that list, and [generating] the resulting updates, that are the problem, [causing them] to be sent to everyone else still active in the group as they do so.”

The growth curve of these updates is described as exponential, and there is a knock-on effect with them as well; as group chat sessions share server resources, it is possible that a large group chat session, with multiple users joining / leaving it and thus causing it to generate lots of updates, can affect other group chat sessions hosted on the same group chat server, slowing them down as well.

While the chat servers are due for a hardware change, which it is hoped will improve performance to a degree, simply adding more hardware to the chat service back-end isn’t seen as a solution, as it’s the exponential manner in which the updates grow which needs to be reduced and controlled. The testing Simon had hoped to run during the Server Beta meeting was to test some improvements he had been making to the queuing of the updates and in combining messages to hopefully reduce the load. However, in running over the code, he encountered a glitch that he needs to resolve before the testing can proceed.

Another issue with the group chat system is that when users log-in to a Second Life session, they automatically join all 42 of their groups, sending an “I’m here!” message to all 42 groups so that they can start receiving messages from active groups. This has obviously been exacerbated each time the limit on how many groups a person can join has been raised, so as Oz pointed-out during the meeting, “upping it again would make it even worse, so until this is dealt with, don’t even ask… 🙂 .”

HTTP Updates

As indicated by Monty Linden at the last TPV meeting, there are further server-side updates which should further assist with improvements to texture and mesh asset downloads. These are now nearing the point where they are likely to be surfacing (although quite when isn’t clear), prompting Oz to comment, “We’re setting up some experiments with server side changes that will complement the pipeline viewer, but are not strictly speaking dependent on it. When we’re confident that our test setup is ready, including how to measure the results, we’ll invite you folks to help us test.”

 

SL project updates week 26/2: group chat

Server Deployments Week 26 – Recap

  • On Tuesday June 24th, the Main channel was updated with the inventory / AIS v3 project, previously deployed to BlueSteel, which requires the current release viewer. See the release notes for more.
  • On Wednesday June 25th, the three RC channels were updated as follows:
      • BlueSteel and LeTigre received a new server maintenance project with the new LSL functions to view and modify materials (see my notes) – release notes
      • Magnum remains on the Experience Tools project, but should additionally receive the inventory / AISv3 update deployed to the Main channel – release notes

Group Chat

Simon Linden is again working on improving group chat, with further tests being carried out during the Server Beta meeting on Thursday June 26th. Currently, the emphasis is on further improving reliability when engaged in group chat and moving between regions (either via teleport or directly by crossing between regions). A couple of people reported their chat windows appeared to freeze a lot less when switching between group chat sessions or following a teleport. Whether this was actually the case or a placebo effect is unclear, as Simon indicated he couldn’t see why it might be any different at this point in proceedings.

Other Items

Magnum llAttachToAvatarTemp Bug

An odd bug has been discovered on Magnum, which may be related to the Experience Tools code. It is defined in BUG-6438, “Objects attached via llAttachToAvatarTemp to object owner detach when script is removed from prim inventory”.

Essentially, when a script uses llAttachToAvatarTemp to attach the object containing it to the creator of the object, and then uses llRemoveInventory(llGetScriptName()) to remove itself from the object, the object itself detaches and is deleted. If the object is used by someone other than its creator, it will attach and the script will be correctly removed without detaching the object as well. It’s not clear if this happens with objects with multiple scripts in them or not, as it has only been tested against objects with just the temp attach script in them.

Investigations are continuing into a fix, but in the meantime, it is believed that the Magnum code won’t be moving to the Main channel in week 27.

ALM and Viewer Log Spamming

There is a viewer rendering issue, which can make itself particularly known when using the LSL functions for materials, where the face of an object will not be rendered, and the viewer will receive a lot of log spam (see BUG-6187). While things got sidetracked so he couldn’t expand on things, Maestro Linden did indicate at the Server Beta meeting that the issue is continuing to be looked at.

SL projects updates 19/1: SL viewer, group chat and miscellaneous things

Server Deployments

There are no scheduled simulator deployments this week to either the Main or RC channels, and so no associated rolling restarts are expected.

SL Viewer

The Interest List RC finally made it to the de facto release viewer with its promotion on Tuesday May 5th (version 3.7.7.289461). This leaves just three RC viewers in the release channel at present: SL Share 2 project viewer version 3.7.7.289497; Sunshine / AIS v3 RC version 3.7.7.289441; and the Maintenance RC viewer version 3.7.7.289405. Please refer to my Current Viewer Releases page for up-to-date information on all viewer releases.

Group Chat

Simon Linden’s optimisation work for group chat was deployed across all of the back-end chat servers on Monday May 5th. While these should see some improvements in group chat (particularly in sending / receiving chat and moving between regions), Simon does warn that these optimisations are not expected to “fix” all of group chat. However, he will continue to work on further improvements as well.

Other Items

New Starter Avatars

Ebbe Altberg used one of the upcoming new starter avatars at the VWBPE conference in April (image: Strawberry Singh)

During his appearance at the VWBPE conference in mid-April, Ebbe Altberg appeared using one of the new starter avatars. At the time he did, it was hinted that the new avatars would be appearing relatively imminently. However, almost a month on and they have yet to officially appear, although there is some speculation they’ll do so in May.

These new avatars are said to take advantage of some of the latest features in SL, which is being taken to mean that some / all are full or partial mesh. This has in turn raised questions as to whether it is wise giving new starters full mesh avatars, given they may not work with freebie items often offered to or picked-up by new starters.

LSL Functions for Materials

While there is no confirmation any work is being carried out on this (except, as Simon quipped, “perhaps in a parallel universe or something”), the Lab is still sounding out how and where such calls would likely be used, and the frequency with which such calls would be made.

The option of having scripted control of materials has been debated often, and still remains a desired item among builders and scripters. However, some of the concerns still remain – notably, that such capabilities might end up causing performance issues, deliberately or otherwise. Much has already been written on how rapid map flipping on multiple objects could deliberately impact performance and potentially result in viewer crashes. There are also already animated mesh elements available which can have a significant impact on viewer performance (some types of animated mesh tail can reportedly overload a viewer on a 32-bit system with out-of-memory errors in a matter of seconds), so there are concerns that were this to be combined with the ability to change textures via script, they could (even unintentionally) have further dramatic impacts on performance.

One way around this would be to throttle the rate at which material maps can be changed via scripted command. What is interesting for the moment is that the Lab appears to have not completely closed the door on scripted control of materials, but is considering options and informally seeking feedback on potential use cases.
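
To make the throttling idea concrete, here is a small Python sketch of one possible approach: a minimum interval between accepted changes, tracked per object. Nothing here reflects an actual Lab design. The class name, the per-object granularity and the interval value are all assumptions for illustration.

```python
class MaterialChangeThrottle:
    """Allow at most one material change per object every min_interval seconds."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last_change = {}  # object id -> time of last accepted change

    def allow(self, object_id: str, now: float) -> bool:
        """Return True and record the change if it is outside the throttle window."""
        last = self._last_change.get(object_id)
        if last is not None and now - last < self.min_interval:
            return False  # too soon after the previous change: rejected
        self._last_change[object_id] = now
        return True


if __name__ == "__main__":
    throttle = MaterialChangeThrottle(min_interval=1.0)
    print(throttle.allow("prim-1", now=0.0))  # first change is accepted
    print(throttle.allow("prim-1", now=0.2))  # rapid flip is rejected
    print(throttle.allow("prim-2", now=0.2))  # other objects are unaffected
    print(throttle.allow("prim-1", now=1.5))  # accepted again after the interval
```

A limit like this would cap rapid map flipping on any one object, although it would not by itself limit the combined effect of many objects changing at once.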

 

SL projects updates 18/2: group chat; group bans

Server Deployments Week 18 – Recap

  • On Tuesday April 29th, the Main channel received the server maintenance project that was on the Magnum RC in week 17, comprising a fix for BUG-5533 and a crash mode fix.
  • On Wednesday April 30th, the Magnum RC had the server-side Sunshine / AIS v3 code re-enabled (this code requires the use of the Sunshine RC viewer), and all three RCs were updated with the bug fixes deployed to the Main channel.

SL Viewer

There have been no updates to the RC viewers in the release channel during week 18 and no further releases, either RC or project viewers, so the SL viewer releases remain as per the last update to my Current Viewer Releases page.

Group Chat Optimisation

Simon Linden dancing at a Server Beta User Group meeting

The code Simon Linden has been working on to improve group chat was deployed to a single group chat server, where it has been running for all groups starting with group_id “b”. Commenting on the work at the Server Beta meeting on Thursday May 1st, Maestro Linden said:

Simon’s been looking at the performance of that group chat server, and it seems to be running fine. So there are plans to update the rest of the group chat servers to the new version early next week. We won’t go so far as to say that group chat has been totally fixed, though – Simon has identified some other changes which could improve performance further.

A recent fix was made to IM sessions to correct the issue where it is possible to see “typing…” in an IM window when the other person isn’t actually typing (see STORM-1975), and questions were raised on whether this fix might be adding a load to group chat sessions, as the viewer-side code appeared to send the message during group chat sessions as well as person-to-person IMs. However, both Simon and Maestro Linden indicated that the notifications are simply ignored by the chat servers during a group chat session, so no additional load is created, although Maestro admitted it would be nice if the viewer didn’t send meaningless messages.

Aside from the back-end load, the biggest issue which occurs in group chat is when someone using it changes regions. When this happens, the chat service has to figure out where you are.

“The region you are on, your viewer and the back-end database all know where you are, and keep updated very fast. The chat servers, however, aren’t kept in perfect sync because that would be very hard to do with 50000+ people moving around who are all in 42 groups,” Simon said, in explaining the problem. He added, “It [the chat service] doesn’t track missing messages … but if it can’t send one to your agent, it then has to ask where you really are and then it sends there,” all of which takes time, delaying the receipt of group chat messages.
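
The behaviour Simon describes, where the chat service sends to the last known location and only asks where you really are when that fails, can be sketched roughly as follows. This is a toy model rather than the real chat server: the class, its methods and the in-memory dictionaries standing in for the back-end database are all hypothetical.

```python
class ChatRouter:
    """Toy chat server that keeps a possibly stale view of agent locations."""

    def __init__(self, authoritative_locations: dict):
        # Stand-in for the back-end database that always knows the real region.
        self._authoritative = dict(authoritative_locations)
        # The chat server's own copy, which is not kept in perfect sync.
        self._cached = dict(authoritative_locations)
        self.lookups = 0  # how many times the slow lookup was needed

    def agent_moved(self, agent: str, new_region: str) -> None:
        """The agent changes region; only the authoritative record is updated."""
        self._authoritative[agent] = new_region

    def _deliver(self, agent: str, region: str, message: str) -> bool:
        """Pretend delivery: succeeds only if the agent really is in region."""
        return self._authoritative.get(agent) == region

    def send(self, agent: str, message: str) -> str:
        """Deliver a message and return the region it finally reached."""
        region = self._cached.get(agent)
        if region is not None and self._deliver(agent, region, message):
            return region
        # Delivery failed: ask where the agent really is, then send there.
        self.lookups += 1
        region = self._authoritative[agent]
        self._cached[agent] = region
        self._deliver(agent, region, message)
        return region


if __name__ == "__main__":
    router = ChatRouter({"resident_a": "Region One"})
    print(router.send("resident_a", "hello"), router.lookups)
    router.agent_moved("resident_a", "Region Two")
    print(router.send("resident_a", "hello again"), router.lookups)
```

The extra lookup only happens on the first message after a region change, which matches the delay people tend to notice just after a teleport or region crossing.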

In describing the changes made, Simon concluded:

The new code we have out now is a bit more efficient, but more importantly it has more metrics and it showed me that the performance problems are in a few other areas … it turns out the updates to keep the list of people in the group chat updated are really significant. It gets worse, of course, in large groups – more people coming and going, and more people who need the updates.

Group Ban Lists

It appears a server-side deployment of the code required to manage the new group ban list functionality is drawing close. There is still work to be done viewer-side, but recent testing on Aditi resulted in a number of JIRAs being filed, and the associated server bugs have been stomped on by Baker Linden, with help from the likes of Caleb Linden.