Identity / Login

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Web Interface API

99.39% uptime
Apr 2024: 99.78% uptime
May 2024: 98.44% uptime
Jun 2024: 100.0% uptime

Backend Data Collection

99.98% uptime
Apr 2024: 99.97% uptime
May 2024: 100.0% uptime
Jun 2024: 99.96% uptime

Agent Service

100.0% uptime
Apr 2024: 99.99% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime
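
To put the percentages above in perspective, the sketch below converts a monthly uptime figure into minutes of downtime and combines monthly figures into a quarterly one. It assumes each percentage covers the full calendar month and that the quarterly figure is a day-weighted average; the page does not state how its numbers are computed, and the function names are illustrative only.

```python
import calendar

def downtime_minutes(uptime_percent, year, month):
    """Minutes of downtime implied by a monthly uptime percentage.

    Assumption: the percentage is measured over the whole calendar month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0

def weighted_uptime(monthly):
    """Day-weighted average of (year, month, uptime_percent) entries.

    Assumption: longer months count proportionally more.
    """
    total_days = 0
    weighted_sum = 0.0
    for year, month, uptime_percent in monthly:
        days = calendar.monthrange(year, month)[1]
        total_days += days
        weighted_sum += days * uptime_percent
    return weighted_sum / total_days

# Web Interface API figures from the summary above.
web_api = [(2024, 4, 99.78), (2024, 5, 98.44), (2024, 6, 100.0)]
may_downtime = downtime_minutes(98.44, 2024, 5)
quarter = weighted_uptime(web_api)
print(f"May 2024 downtime: about {may_downtime:.0f} minutes")
print(f"Quarterly uptime: about {quarter:.3f}%")
```

Under these assumptions, 98.44% for May 2024 works out to roughly 696 minutes (about 11.6 hours) of downtime, and the day-weighted quarterly figure lands close to the 99.39% shown for Web Interface API.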

Notice history

Jun 2024

May 2024

Performance Issue
  • Resolved
    Update

    Some final details about this incident:

    We found an insidious "lock" triggered when new data arrived and was analyzed by an alert while a user was scrolling through a time graph for the same target. In rare, timing-dependent cases this caused a deadlock, which backed up the data analysis server involved (we run many of these in parallel). The deadlock could take several minutes to release (though it didn't always), and while it lasted, further actions on that server queued up behind it. Depending on how busy the server was, it could be unusable for many minutes, which affected access to any sessions being processed by that server. Whoa, complicated!

    This bug happened twice on Friday morning (May 3rd) and has not happened since. Once we found a way to recognize it, we kept a close eye on it and remediated before it could create an issue.

    We rolled out a fix for the bug yesterday and feel confident that it is squashed.

    Thanks for your understanding and patience!

  • Resolved

    Our team has successfully identified the root cause of the issue. While a comprehensive fix is still in progress, we have implemented effective remediation steps that have stabilized the system. At this time, the issue should no longer affect your user experience. We are diligently working on a permanent solution to ensure this issue does not recur.

  • Identified

    It looks like there's another issue we haven't yet identified, since the problem came back on a different server. We're remediating and investigating.

  • Resolved

    One of the back end "computation" servers was overloaded through an unexpected path. This caused other servers to misreport their statuses too, so the problem looked more widespread than it really was. Because of this, we restarted more than was necessary.

    Everything is fully operational - all reports, views, summaries and quality monitor views should be accurate and up to date. No data was lost. We are investigating ways to reduce the likelihood of a repeat event.

    Thanks for your patience and understanding!

  • Identified

    Found an issue with one of the back end data collection servers. Restarting it. Most targets are working, and more coming back online.

  • Investigating

    There's an issue with viewing summaries and agent target lists. We're investigating.
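
The deadlock described in the final update above (new data being analyzed by an alert while a user scrolls a graph of the same target) resembles a classic lock-ordering problem. The sketch below is an illustration only: it assumes two locks taken in opposite orders by the two code paths, which the notice does not confirm, and every name in it is hypothetical. It shows one common remedy, acquiring locks in a single global order, and is not a description of the fix that was actually rolled out.

```python
import threading

# Hypothetical stand-ins for the two contended resources; the notice
# does not name them.
target_data_lock = threading.Lock()
graph_view_lock = threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks in one global order (by id) so no wait cycle can form."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(held):
    """Release locks in the reverse of the order they were acquired."""
    for lock in reversed(held):
        lock.release()

def analyze_new_data(results):
    # Alert path: asks for the data lock first, then the view lock.
    held = acquire_in_order(target_data_lock, graph_view_lock)
    try:
        results.append("alert analyzed")
    finally:
        release_all(held)

def scroll_time_graph(results):
    # User path: asks in the opposite order. Without the global ordering
    # in acquire_in_order, this pairing could deadlock under bad timing.
    held = acquire_in_order(graph_view_lock, target_data_lock)
    try:
        results.append("graph scrolled")
    finally:
        release_all(held)

def run_both(rounds=50):
    """Run both paths concurrently and report whether every thread finished."""
    results = []
    threads = []
    for _ in range(rounds):
        threads.append(threading.Thread(target=analyze_new_data, args=(results,)))
        threads.append(threading.Thread(target=scroll_time_graph, args=(results,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    finished = all(not t.is_alive() for t in threads)
    return results, finished
```

If each path instead took its locks directly in the order it asked for them, one thread could hold the data lock while waiting for the view lock and the other the reverse, which matches the "rare, but blocks everything behind it" behavior the update describes.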

Apr 2024

Unplanned issue with database connectivity
  • Resolved

    This incident has been resolved.

    At 17:57 MDT, errors were reported while connecting to a database used by pingplotter.cloud and its login process. Our database was impacted by a regional issue, which cloud infrastructure engineers resolved at 20:15 MDT.

  • Identified

    Database connectivity has improved and is now nearly fully functional. We continue to see improvements and will provide updates as available.

  • Investigating

    At 17:57 MDT, our operations team was notified of an issue connecting to a database used by pingplotter.cloud and its login process. We are investigating and will provide updates as available.

Apr 2024 to Jun 2024