Friday, 19 December 2008
MUC update successful
The MUC service for muc1.virtual-presence.org has been updated successfully and works fine.
MUC update
We're updating the muc1.virtual-presence.org service. During the update, most of the virtual world will be offline for a few seconds.
Thursday, 18 December 2008
Points migration finished
The points migration is finished, but the Toplins page will stay offline for a few days.
Tuesday, 16 December 2008
MUC server issue
The muc1.virtual-presence.org server has a connection issue, so most of the virtual world is offline.
We're working on it.
Wednesday, 10 December 2008
Points migration
We are currently transferring the points to another code base, so users may notice differences in their points while the migration is running.
Tuesday, 9 December 2008
Load balancer problems
Our load balancers have a problem with too many connections. Our website is not available at the moment, so several services that depend on it are unavailable too, such as login, wousle, and the contact list.
We're working on it.
Update: All services are unavailable...
2nd Update: All services are back online but there's still work to do with the load balancers.
Thursday, 27 November 2008
Contactlist online
Today, we've reactivated the contact list. For new contacts the new infrastructure is already running.
At the moment, we're migrating the old contacts to the new infrastructure. This will take some hours.
Wednesday, 26 November 2008
XMPP Failure
One of our XMPP services failed and needed to be restarted. A part of the virtual world was unavailable for about 4 minutes.
Wednesday, 19 November 2008
Moving Users Between Clusters
We're moving some users between the clusters again to balance the load between them. Users will notice a short, normal reconnect when they are moved.
Thursday, 13 November 2008
Moving Users Between Clusters
We're currently moving some users between the XMPP clusters to balance the load between them. Users will notice a short, normal reconnect when they are moved.
Wednesday, 12 November 2008
Chat Server Lag
A room server has very high lag. We are investigating.
Update: only the connection between XMPP cluster 1 and one chat server lags.
Update: during the investigation the room server crashed due to strange memory conditions. After the restart everything is back to normal.
Scheduled XMPP downtime
We will have a scheduled XMPP maintenance window today from 16:15 to 16:20 UTC to apply performance updates.
Tuesday, 11 November 2008
XMPP failure in Cluster 1, restarted
XMPP cluster 1 failed. Restarted multiple times to reestablish all connections.
XMPP failure in Cluster 1, restarted
The XMPP service in cluster 1 failed. It has been restarted; all affected users will reconnect automatically.
Saturday, 8 November 2008
XMPP restart of Cluster 1
We experienced a memory shortage on XMPP cluster 1 and restarted it to free leaked memory.
Friday, 7 November 2008
New XMPP Cluster
A second XMPP cluster has been installed. New users will be registered on the new cluster. This will reduce load on the first cluster.
Thursday, 6 November 2008
Preparation for XMPP Cluster Changes
We are preparing for XMPP cluster configuration changes in the next 24 hours. We hope that the preparation does not affect operation. In the worst case we expect an XMPP reboot, but we hope to avoid it.
Tuesday, 4 November 2008
Monday, 3 November 2008
XMPP maintenance restart, new database service
The XMPP servers got their own database service. A restart is necessary to enable the new configuration.
Sunday, 2 November 2008
XMPP Maintenance restart
Due to high load, the XMPP cluster will be restarted as a safety measure at 8:30pm UTC.
XMPP Maintenance restart
Due to high load, a small part of the virtual world failed, and the rest of the world has been restarted as a safety measure.
Saturday, 1 November 2008
XMPP Failure and maintenance restart
Because of high load on the XMPP cluster, a small part of the virtual world was unavailable. To avoid side effects, some other parts have been restarted as a safety measure.
Friday, 31 October 2008
Router Crash
The router failed. The secondary took over, but not all services are reachable from the outside world.
Update: the problem is solved. The resolution took very long. The network problem interfered with alert procedures. We are working to improve alerting in these cases.
Saturday, 25 October 2008
New registrations will be delayed.
New Weblin registrations will be delayed for a while. All other users can continue.
Update: Registrations should arrive soon now. Email delivery is back to normal.
Thursday, 23 October 2008
Network Problems
A network component shows errors, but not enough of them to trigger the failover. This causes random connection losses. We are working to identify the component.
Update: networking has been repaired.
Wednesday, 22 October 2008
XMPP Server Failure [Update]
The XMPP cluster failed because of high load. Clients keep trying to reconnect, so operation will resume once the cluster is back.
Update: The cluster is back and running after 4 minutes.
MUC Server Failure
The Multi User Chat cluster partly failed and about half of the layered virtual world was offline.
The server has been restarted and the world is online again.
Tuesday, 21 October 2008
Upgrading XMPP Cluster
A new XMPP cluster is being installed to cope with the growing load. The cluster will be rebooted (possibly several times) during the installation. Connections to chat servers might be interrupted until everything is up and running.
The cluster has been prepared in advance. But the switch might still be bumpy, because of size and version upgrades at the same time.
Friday, 17 October 2008
Thursday, 16 October 2008
Monday, 13 October 2008
XMPP Maintenance Shutdown [Update]
To introduce a new database the XMPP service will be restarted at about 11:10pm UTC.
Update: The XMPP services are back and running.
Forum Down [Update]
Due to a server malfunction, the forum is currently down. We are working to repair the server.
Update: The forum service is back and running.
Saturday, 11 October 2008
Partial XMPP Maintenance Restart
Part of the XMPP cluster has been restarted as a safety measure.
Comment: weblin is experiencing very high load, which is a good thing. Due to the recent events, we are very closely watching the XMPP condition and restart occasionally to avoid dangerous conditions. New (more and larger) servers will be online soon.
Friday, 10 October 2008
Wednesday, 1 October 2008
XMPP Service Restart
We needed to restart the XMPP service to start a new module to improve our services.
Monday, 29 September 2008
Parameter change in XMPP-Servers
We changed a parameter in our XMPP-Servers.
Main operation continues, clients automatically reconnect to other cluster hosts.
Saturday, 27 September 2008
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues, clients automatically reconnect to other cluster hosts.
Monday, 22 September 2008
Scheduled XMPP downtime [Update]
We will have a scheduled XMPP maintenance window tonight from 22:00 to 22:05 UTC to apply performance updates.
Update: The XMPP service is up and running again.
Friday, 19 September 2008
Buddylist updates off
Buddylist updates remain deactivated.
Comment: there seems to be an instability of the XMPP cluster under the condition of a combined high load of Web and XMPP. This is a special case, but we prefer to keep buddy list updates disabled until the condition is fixed, rather than risking general XMPP operation.
Thursday, 18 September 2008
Load Problems
We are experiencing high load. This results in very slow Web access and even XMPP cluster failure.
Stopping some sub systems, e.g. buddylist updates, topcloud.
Comment: most developers are working on optimizations. We are reaching new all-time highs every day and we are trying to keep up with the growth. This is an ongoing process.
Tuesday, 16 September 2008
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues, clients automatically reconnect to other cluster hosts.
Operation completely restored after 3 minutes.
Sunday, 14 September 2008
Topcloud briefly activated
Topcloud has been activated for 1 h to check for high traffic sites in order to add more random rooms (see the LMS Operation Log for more). It is now disabled again until the re-write is completed.
Tuesday, 9 September 2008
TopCloud Offline
Topcloud updates have been disabled because the processing might affect chat operation.
Comment: Topcloud processing will be changed. It is expected to resume early next week.
Sunday, 7 September 2008
Chat Server Failure
location.virtual-presence.org failed. Most of the layered virtual world is offline.
Update: Server restart at 17:15
Comment: chat operation will partially be moved to other chat servers to reduce the load on location.virtual-presence.org
XMPP Server Failure
A part of the XMPP cluster failed.
Main operation continues; clients automatically reconnect to other cluster hosts. However, follow-up errors affect client notifications, i.e. nickname changes and messages are not propagated to clients.
Update: Operation completely restored at 14:15
Wednesday, 20 August 2008
Communication Problem
There is an s2s (server-to-server) communication problem between the XMPP cluster and other servers.
Update 16:10: solved by XMPP server restart.
Tuesday, 19 August 2008
New Primary Database
We have a new, even more powerful primary database. The old primary, with its hardware repaired, is now the first secondary. A second secondary will also be available.
Monday, 18 August 2008
Hardware Failure
The current problem seems to be a hardware error on the main server and another hardware problem on the secondary.
We are aware that this is virtually impossible. Nevertheless, the primary failed due to a hardware error and the secondary did not take over because of a different hardware-related problem.
The hardware on the main server has been replaced. Recovery is under way. We expect that one of the DB servers will resume operation soon.
Website and client login down
There is a serious database problem. The failover did not come up automatically. We are working to recover.
Saturday, 16 August 2008
Database Issues
The database is suffering under high load from many client connections.
We are trying to reduce the load by disabling services temporarily. Buddylist and others may be affected.
Update 11:00: still optimizing components. The situation improves gradually, but will nevertheless take time.
Update 13:00: Operation resumed, but some services disabled. Buddylist status updates, points for the weekend (big sorry), Toplist, and some less visible components. Most important: chat works and people can meet each other. The world is back online.
Wednesday, 13 August 2008
Database down for Maintenance
The DB server will be locked for some time (expected: 1h) for maintenance. The websites will be affected and no new logins will be possible. Users who stay logged in and do not navigate will be able to continue their chats.
Update@02:12: Back online.
Tuesday, 12 August 2008
Sunday, 10 August 2008
About the XMPP Authentication Refused Problem
Obviously the improved DB connection did not help as expected. There was a 2 hour outage on one of the cluster nodes.
This is a growth problem as not only server load grows, but also the effective coupling of sub systems by way of their increasingly loaded interfaces. Events once isolated begin to propagate between sub systems.
Investigation is under way. In addition, an alternate solution will be implemented today.
Friday, 8 August 2008
XMPP Server Reboot
Reboot to add an improved DB connection module. Let's see if this makes the connections more stable.
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
Restarted the XMPP server. A solution is in the works. The upcoming client release will also be part of the solution.
Wednesday, 6 August 2008
Release Overload
A portal software release has resulted in unexpectedly high load. Please be patient and try not to overload the web site.
Update: normal operation restored.
Update: Topcloud intentionally offline
Friday, 1 August 2008
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
(Restarting XMPP server + Taking actions to prevent the annoying client dialog box in the upcoming version. Improving DB connection.)
Update: Operation restored. Clients reconnect.
Thursday, 31 July 2008
XMPP Authentication Refused
One of the XMPP servers refuses to authenticate clients. DB connection problem.
(Restarting XMPP server)
Update: Clients reconnect, but initially affected clients show a dialog box.
Thursday, 19 June 2008
Server restart
Database and XMPP servers have been restarted for maintenance.
(Comment: we are thinking about a regular scheduled downtime, but no decision yet)
Sunday, 15 June 2008
Cache malfunction
A caching server is offline. The problem affects a part of the population.
Update: problems solved, but startup is still (too) slow while the cache warms up.
Tuesday, 27 May 2008
System malfunction
Monday, 19 May 2008
Reconnect test
The XMPP server cluster will be rebooted under max load to test automatic reconnect of all clients.
Thursday, 8 May 2008