Previous incidents

February 2024
Feb 01, 2024
2 incidents

Delay in model metrics

Degraded

Resolved Feb 01 at 02:01am UTC

New metrics are being populated again.

1 previous update

January 2024
Jan 25, 2024
2 incidents

Metrics are degraded for some models

Degraded

Resolved Jan 25 at 02:03am UTC

We have resolved the issue, and metrics are available again.

3 previous updates

Jan 23, 2024
2 incidents

Logs are not showing up in the UI for some models

Degraded

Resolved Jan 23 at 09:02pm UTC

This incident has been resolved.

2 previous updates

Jan 08, 2024
2 incidents

Degraded performance for some models using A100s

Degraded

Resolved Jan 08 at 06:27am UTC

A100s are fully operational again.

1 previous update

December 2023
Dec 08, 2023
2 incidents

Increased delay in starting replicas using A100 GPUs

Degraded

Resolved Dec 08 at 04:24pm UTC

This incident has been resolved. Start time for replicas using A100 GPUs is back to normal.

3 previous updates
