Friday 19 December 2008
MUC update successful
The MUC service for muc1.virtual-presence.org has been updated successfully and works fine.
MUC update
We're updating the muc1.virtual-presence.org service. During the update, most of the virtual world will be offline for a few seconds.
Thursday 18 December 2008
Points migration finished
The points migration has finished, but the Toplist page will stay offline for a few days.
Tuesday 16 December 2008
MUC server issue
The muc1.virtual-presence.org server has a connection issue, so most of the virtual world is offline.
We're working on it.
Wednesday 10 December 2008
Points migration
We are currently transferring the points to another code base, so users may notice differences in their points while the migration is running.
Tuesday 9 December 2008
Load balancer problems
Our load balancers have a problem with too many connections. Our website is not available at the moment, so several services are unavailable too, like login, wousle, contact list etc.
We're working on it.
Update: All services are unavailable...
2nd Update: All services are back online but there's still work to do with the load balancers.
Thursday 27 November 2008
Contactlist online
Today, we've reactivated the contact list. For new contacts the new infrastructure is already running.
At the moment, we're migrating the old contacts to the new infrastructure. This will take some hours.
Wednesday 26 November 2008
XMPP Failure
One of our XMPP services failed and needed to be restarted. A part of the virtual world was unavailable for about 4 minutes.
Wednesday 19 November 2008
Moving Users Between Clusters
We're moving some users between the clusters again to balance the load between them. Users will notice a normal, brief reconnect when they are moved.
Thursday 13 November 2008
Moving Users Between Clusters
We're currently moving some users between the XMPP clusters to balance the load between them. Users will notice a normal, brief reconnect when they are moved.
Wednesday 12 November 2008
Chat Server Lag
A room server has very high lag. We are investigating.
Update: only the connection between XMPP cluster 1 and one chat server lags.
Update: during the investigation the room server crashed due to strange memory conditions. After the restart everything is back to normal.
Scheduled XMPP downtime
We will have a scheduled XMPP maintenance window today from 16:15 to 16:20 UTC to add performance updates.
Tuesday 11 November 2008
XMPP failure in Cluster 1, restarted
XMPP cluster 1 failed. Restarted multiple times to reestablish all connections.
XMPP failure in Cluster 1, restarted
The XMPP service in Cluster 1 failed. It has been restarted, and all affected users will reconnect automatically.
Saturday 8 November 2008
XMPP restart of Cluster 1
We experienced a memory shortage on XMPP cluster 1 and restarted it to free some leaked memory.
Friday 7 November 2008
New XMPP Cluster
A second XMPP cluster has been installed. New users will be registered on the new cluster. This will reduce load on the first cluster.
Thursday 6 November 2008
Preparation for XMPP Cluster Changes
We are preparing for XMPP cluster config changes in the next 24 hours. We hope that the preparation does not affect operation. In the worst case we expect an XMPP reboot, but we hope to avoid it.
Tuesday 4 November 2008
Monday 3 November 2008
XMPP maintenance restart, new database service
The XMPP servers got their own database service. A restart is necessary to enable the new configuration.
Sunday 2 November 2008
XMPP Maintenance restart
Due to a high load the XMPP cluster will be restarted as a safety measure at 8:30pm UTC.
XMPP Maintenance restart
Due to a high load a small part of the virtual world failed and the rest of the world has been restarted as a safety measure.
Saturday 1 November 2008
XMPP Failure and maintenance restart
Because of high load on the XMPP cluster, a small part of the virtual world was unavailable. To avoid side effects, some other parts have been restarted as a safety measure.
Friday 31 October 2008
Router Crash
The router failed. The secondary took over, but not all services are reachable from the outside world.
Update: the problem is solved. The resolution took very long. The network problem interfered with alert procedures. We are working to improve alerting in these cases.
Saturday 25 October 2008
New registrations will be delayed.
New Weblin registrations will be delayed for a while. All other users can continue.
Update: Registrations should arrive soon. Email delivery is back to normal.
Thursday 23 October 2008
Network Problems
A network component is showing errors, but not enough to trigger the failover. This causes random connection losses. We are working to identify the component.
Update: networking has been repaired.
Wednesday 22 October 2008
XMPP Server Failure [Update]
The XMPP cluster failed because of high load. The clients are trying to reconnect so the operation continues when the cluster is back.
Update: The cluster is back and running after 4 minutes.
MUC Server Failure
The Multi User Chat cluster partly failed and about half of the layered virtual world was offline.
Server has been restarted and the world is online again.
Tuesday 21 October 2008
Upgrading XMPP Cluster
A new XMPP cluster is being installed to cope with the growing load. The cluster will be rebooted (possibly several times) during the installation. Connections to chat servers might be interrupted until everything is up and running.
The cluster has been prepared in advance. But the switch might still be bumpy, because of size and version upgrades at the same time.
Friday 17 October 2008
Thursday 16 October 2008
Monday 13 October 2008
XMPP Maintenance Shutdown [Update]
To introduce a new database the XMPP service will be restarted at about 11:10pm UTC.
Update: The XMPP services are back and running.
Forum Down [Update]
Due to a server malfunction, the forum is currently down. We are working to repair the server.
Update: The forum service is back and running.
Saturday 11 October 2008
Partial XMPP Maintenance Restart
Part of the XMPP cluster has been restarted as a safety measure.
Comment: weblin is experiencing very high load, which is a good thing. Due to the recent events, we are very closely watching the XMPP condition and restart occasionally to avoid dangerous conditions. New (more and larger) servers will be online soon.
Friday 10 October 2008
Wednesday 1 October 2008
XMPP Service Restart
We needed to restart the XMPP service to start a new module to improve our services.
Monday 29 September 2008
Parameter change in XMPP-Servers
We changed a parameter in our XMPP-Servers.
Main operation continues, clients automatically reconnect to other cluster hosts.
Saturday 27 September 2008
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues, clients automatically reconnect to other cluster hosts.
Monday 22 September 2008
Scheduled XMPP downtime [Update]
We will have a scheduled XMPP maintenance window tonight from 22:00 to 22:05 UTC to add performance updates.
Update: The XMPP service is up and running again.
Friday 19 September 2008
Buddylist updates off
Buddylist updates remain deactivated.
Comment: there seems to be an instability of the XMPP cluster under the condition of a combined high load of Web and XMPP. This is a special case, but we prefer to keep buddy list updates disabled until the condition is fixed, rather than risking general XMPP operation.
Thursday 18 September 2008
Load Problems
We are experiencing high load. This results in very slow Web access and even XMPP cluster failure.
Stopping some sub systems, e.g. buddylist updates, topcloud.
Comment: most developers are working on optimizations. We are getting new all time highs every day and we are trying to keep up with the growth. This is an ongoing process.
Tuesday 16 September 2008
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues, clients automatically reconnect to other cluster hosts.
Operation completely restored after 3 minutes.
Sunday 14 September 2008
Topcloud briefly activated
Topcloud has been activated for 1 h to check for high traffic sites in order to add more random rooms (see the LMS Operation Log for more). It is now disabled again until the re-write is completed.
Tuesday 9 September 2008
TopCloud Offline
Topcloud updates have been disabled because the processing might affect chat operation.
Comment: Topcloud processing will be changed. It is expected to resume early next week.
Sunday 7 September 2008
Chat Server Failure
location.virtual-presence.org failed. Most of the layered virtual world is offline.
Update: Server restart at 17:15
Comment: chat operation will partially be moved to other chat servers to reduce the load on location.virtual-presence.org
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues, clients automatically reconnect to other cluster hosts. But follow-up errors affect client notifications, i.e. nickname changes and messages are not propagated to clients.
Update: Operation completely restored at 14:15
Wednesday 20 August 2008
Communication Problem
There is a s2s communication problem between the XMPP cluster and other servers.
Update 16:10: solved by XMPP server restart.
Tuesday 19 August 2008
New Primary Database
We have a new, even more powerful primary database. The old primary, with its hardware repaired, is now the first secondary. A second secondary will also be available.
Monday 18 August 2008
Hardware Failure
The current problem seems to be a hardware error on the main server and another hardware problem on the secondary.
We are aware that this is virtually impossible. Nevertheless, the primary failed due to a hardware error and the secondary did not take over because of a different hardware-related problem.
The hardware on the main server has been changed. Recovery is under way. We expect that one of the DB servers will resume operation soon.
Website and client login down
There is a serious database problem. The failover did not come up automatically. We are working to recover.
Saturday 16 August 2008
Database Issues
The database is suffering under high load from many client connections.
We are trying to reduce the load by disabling services temporarily. Buddylist and others may be affected.
Update 11:00: still optimizing components. The situation improves gradually, but will nevertheless take time.
Update 13:00: Operation resumed, but some services disabled. Buddylist status updates, points for the weekend (big sorry), Toplist, and some less visible components. Most important: chat works and people can meet each other. The world is back online.
Wednesday 13 August 2008
Database down for Maintenance
The DB server will be locked for some time (expected: 1h) for maintenance. The websites will be affected and no new logins will be possible. Users who stay logged in and do not navigate will be able to continue their chats.
Update@02:12: Back online.
Tuesday 12 August 2008
Sunday 10 August 2008
About the XMPP Authentication Refused Problem
Obviously the improved DB connection did not help as expected. There was a 2 hour outage on one of the cluster nodes.
This is a growth problem as not only server load grows, but also the effective coupling of sub systems by way of their increasingly loaded interfaces. Events once isolated begin to propagate between sub systems.
Investigation is under way. In addition, an alternate solution will be implemented today.
Friday 8 August 2008
XMPP Server Reboot
Reboot to add an improved DB connection module. Let's see if this makes the connections more stable.
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
Restarted the XMPP server. A solution is in the works. The client release soon to come will also be part of the solution.
Wednesday 6 August 2008
Release Overload
A portal software release has resulted in unexpectedly high load. Please be patient and try not to overload the web site.
Update: normal operation restored.
Update: Topcloud intentionally offline
Friday 1 August 2008
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
(Restarting XMPP server + Taking actions to prevent the annoying client dialog box in the upcoming version. Improving DB connection.)
Update: Operation restored. Clients reconnect.
Thursday 31 July 2008
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
(Restarting XMPP server)
Update: Clients reconnect, but initially affected clients show a dialog box.
Thursday 19 June 2008
Server restart
Database and XMPP servers have been restarted for maintenance.
(Comment: we are thinking about a regular scheduled downtime, but no decision yet)
Sunday 15 June 2008
Cache malfunction
A caching server is offline. The problem affects a part of the population.
Update: problems solved, but startup is still (too) slow while the cache warms up.
Tuesday 27 May 2008
System malfunction
Monday 19 May 2008
Reconnect test
The XMPP server cluster will be rebooted under max load to test automatic reconnect of all clients.
Thursday 8 May 2008