Saturday 11 August 2012

Portal and Client Login Offline

On Friday at 8 pm (20:00), the portal, including the client login script, went offline. Operation has been restored.

Clients that were already online continued to run, but new client logins were not possible. In other words, you could not start Weblin.

Sorry for not noticing and for not reading email in the last 12 hours. Thanks to the people who notified me by email and on social networks.

Analysis:

There were too many Apache processes running. I increased the maximum number of Apache processes, because there is plenty of memory available.

The main question, however, is why some Apache processes do not terminate. The graph shows that the behavior started about two months ago, in June. Until then, only a few processes were running. There were no (known) configuration changes in June. I will observe the behavior and try to check what these processes were doing last. Unfortunately, it is not possible to find out after the fact what they were serving before things stopped.
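
One way to observe this over time is to record regularly which Apache processes have been alive unusually long, so the data exists before the next outage. The following is only a rough sketch, not what actually runs on the server. It assumes a Linux `ps` that supports the `etimes` (elapsed seconds) column, that the processes are named `apache2` or `httpd`, and that one hour counts as "too long"; the function names are made up for illustration.

```python
import subprocess


def parse_ps(ps_output):
    """Parse lines of 'PID ELAPSED_SECONDS COMMAND' into (pid, age, cmd) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip header lines and anything that is not 'number number name'.
        if len(parts) != 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def long_running(rows, names=("apache2", "httpd"), min_age=3600):
    """Return processes with a matching name that are at least min_age seconds old.

    The process names and the one-hour threshold are assumptions.
    """
    return [(pid, age, cmd) for pid, age, cmd in rows
            if cmd in names and age >= min_age]


def snapshot():
    """Take a process snapshot (assumes procps `ps` with the `etimes` column)."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,etimes=,comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    for pid, age, cmd in long_running(snapshot()):
        print(f"{cmd} pid={pid} has been running for {age} s")
```

Run from cron every few minutes with the output appended to a log file, this would show when processes start to pile up. Note that the Apache parent process is long-lived by design and will always appear in the list. The sketch also does not tell what a process was serving; for that, something like Apache's status page would have to be recorded as well.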

Monday 21 May 2012

Unexpected Outage

The location mapping server seems to be down. Weblin and the Weblin servers work, but it is not possible to enter a page. Trying to fix the problem by moving the location mapping server to a different host.

UPDATE: Looks like the service has been restored. Reconnecting all clients to force them to enter their rooms again.