sv123 5 years ago

Alternate status page, of course showing all green across the board in typical Azure fashion. https://status2.azure.com/

  • dgritsko 5 years ago

    That link gives me a DNS error. Edit: Working now as of 6:08 PM EDT

  • Spivak 5 years ago

    Does this have to be posted every time some cloud service has issues?

    We, as in people in this forum, know that status pages are worthless. They’re tools with the explicit purpose of reducing the burden on tier one customer support. That’s it. They are not a public monitoring platform.

    • NicoJuicy 5 years ago

      Their health page was down too. This was the first place where I saw something.

      A colleague saw it on twitter.

      So yes, it's useful.

    • slaymaker1907 5 years ago

      I was on support until a few minutes ago (SQL DB, not general Azure), and the status page was ironically my first indication something was up since said status page was having DNS issues.

    • niij 5 years ago

      Having a fully up to date status page is what prevents useless repetitive cases during degradation or an outage. If I'm having issues but you show all green, I'm submitting a case.

      • wahern 5 years ago

        For large incumbents, publishing a complete and accurate status page might not only be a recipe for bad press, but also lawsuits. There's significant downside and not much upside to telling the whole truth. It's entirely unsurprising that cloud providers like AWS or Azure would play definitional games w/ what constitutes an outage. Rolling 5m, 1% outage across your entire customer base? That's just a hiccup! If your staff has to field more confused questions and complaints, it's a relatively small price to pay.

    • nijave 5 years ago

      As an alternate example, GitHub tends to have a pretty good status page in my experience. It'll usually be up-to-date within minutes of people chatting about issues on work Slack and gets updated at a regular cadence with details.

      AWS on the other hand... We usually just reach out to our TAMs and say "Hey, our application monitoring is showing tons of errors interacting with service X--can you check your super secret internal dashboards and see what the deal is?" It's nice to at least have a "Yeah the service is completely hosed and in a bad place" or a "Yeah, some changes went wrong and are being rolled back". The former usually requires some sort of mitigation while the latter can largely be ignored.

  • frompdx 5 years ago

    That page says Azure DNS is all good. Seems wrong.

jrochkind1 5 years ago

GitHub is also having problems. https://www.githubstatus.com/

Coincidence, or have they gotten around to moving some of their infrastructure to Azure since the MS acquisition?

  • dgritsko 5 years ago

    My guess is the latter.

  • nijave 5 years ago

    AFAIK Actions (maybe Packages, too?) was always built on Azure. I think GitHub core is on bare metal.

    I'm guessing they get the elasticity of the cloud while paying wholesale or at-cost prices for the infrastructure (surely they get some discount over the advertised price, at least).

NikolaeVarius 5 years ago

https://azure.microsoft.com/en-us/updates/azuredns100sla/?cd...

> Azure DNS is now being offered at a 100% availability SLA that's backed by our diverse, geo-redundant DNS infrastructure.

> With this update, Azure DNS guarantees that valid DNS requests will receive a response from at least one name server 100% of the time. For details, see the SLA definition.

This hasn't aged well

  • xyzzy_plugh 5 years ago

    If I understand this correctly, everything unavailable is eligible for a 25% credit, and if the downtime exceeds ~4 hours then it's a free month.

    Neat!

    • justizin 5 years ago

      A free month of Azure DNS, or a free month of everything that you can't reach because DNS is down? ;)
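For anyone who wants to check the credit math in xyzzy_plugh's comment, here is a rough Python sketch. The tiers are assumptions read off that comment (25% credit for anything under 100%, full credit past "~4 hours"), not the SLA text. The 99.5% floor in particular is a guess that happens to line up with roughly 4 hours in a 30-day month, and all function names are made up for illustration.

```python
def monthly_uptime_percent(downtime_minutes: float, days_in_month: int = 30) -> float:
    """Uptime for the month as a percentage, given total minutes of downtime."""
    total_minutes = days_in_month * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def service_credit_percent(uptime_percent: float, free_month_floor: float = 99.5) -> int:
    """Credit tiers as described in the thread (assumed, not quoted from the SLA)."""
    if uptime_percent < free_month_floor:
        return 100  # "free month"
    if uptime_percent < 100.0:
        return 25   # any downtime at all against a 100% SLA
    return 0


def free_month_threshold_minutes(days_in_month: int = 30, free_month_floor: float = 99.5) -> float:
    """Minutes of downtime at which uptime drops to the assumed free-month floor."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - free_month_floor) / 100.0


if __name__ == "__main__":
    for minutes in (0, 60, 240):
        uptime = monthly_uptime_percent(minutes)
        print(minutes, round(uptime, 3), service_credit_percent(uptime))
```

Under those assumptions the free-month line sits at 216 minutes (3.6 hours) of downtime in a 30-day month, which is where the "~4 hours" figure would come from. Per justizin's reply, the credit presumably applies to the DNS bill only, so the absolute amount is likely small either way.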

duncanawoods 5 years ago

Wow - I can't even install dotnet on a linux machine because the packages repo is down.

EMM_386 5 years ago

Noticeable spikes across the board ...

https://downdetector.com/

  • colpabar 5 years ago

    I have seen nothing to indicate this is the case, but is this what a DDoS attack looks like?

  • collectedparts 5 years ago

    Coinbase is ~completely down (?)

    I'm now seeing all outbound DNS lookups from my Heroku instances failing.

    What is going on?

reasonabl_human 5 years ago

Thank goodness I’m not on call!

  • NicoJuicy 5 years ago

    How do you know? Together with low TTL it seems plausible. But I don't understand how you can know it this fast.

    • mdeeks 5 years ago

      IcM is an internal tracking tool at Microsoft. It sounds like they looked at the tool and noticed this. I used to work there and use it as well.

semicolon_storm 5 years ago

Azure is not having a good year. Two major Active Directory outages, a major CosmosDB outage, and now this.

  • colpabar 5 years ago

    The sad part is that even though that feels like a lot given the time span, it doesn't feel that bad given it's Azure. These network issues have been plaguing us ever since we started using AD.

  • Trisell 5 years ago

    Apparently AD doesn’t scale. Who knew? /s

  • yellowyacht 5 years ago

    Is that all between January and April?

partiallypro 5 years ago

Let's Encrypt is also down and can't issue certs

dvfjsdhgfv 5 years ago

These problems are actually good in that more organizations - especially big ones - realize that reliability is not something that you can outsource to a single provider and the problem magically disappears. Literally any service you depend on, from DNS to email, should be using redundancy so that when your basket gets squashed, you still have some eggs. Neither Amazon nor Microsoft will tell you that because vendor lock-in is in their best interest. You need to take care of it yourself otherwise you are completely at their mercy.

x3n0ph3n3 5 years ago

There have been some internal debates about setting up a secondary DNS in case Route53 somehow went down. My reasoning has thus far been that if Route53 is down, there are probably other AWS services that we depend on that would also be down.

What do you guys think? Is secondary DNS in this case worth it?

  • aarmenaa 5 years ago

    We're moving towards dual DNS providers. We've been bitten by DNS hosting failures too many times, and they're always painful because everything, including monitoring and control systems, ends up dead or inaccessible. Not to mention the entire network being down is absurdly expensive if you're paying SLAs.
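To make the dual-provider idea in this subthread concrete, here is a minimal Python sketch of the failover behavior. In a real setup the redundancy usually lives in the zone's delegation (NS records pointing at both providers) and recursive resolvers do the retrying; this only mimics that "try the next provider" order in application code. The resolver functions, the hostname, and the address are hypothetical stand-ins, not any provider's API.

```python
from typing import Callable, Iterable, List

# A resolver is any callable that maps a hostname to a list of addresses
# and raises OSError (or a subclass such as TimeoutError) on failure.
Resolver = Callable[[str], List[str]]


class AllProvidersFailed(Exception):
    """Raised when no provider returned an answer."""


def resolve_with_fallback(hostname: str, resolvers: Iterable[Resolver]) -> List[str]:
    """Try each provider in order and return the first non-empty answer."""
    failures = 0
    for resolve in resolvers:
        try:
            addresses = resolve(hostname)
        except OSError:
            failures += 1
            continue
        if addresses:
            return addresses
        failures += 1
    raise AllProvidersFailed(f"{hostname}: {failures} provider(s) gave no answer")


# Hypothetical stand-ins for two DNS providers.
def primary_down(hostname: str) -> List[str]:
    raise TimeoutError("primary provider not answering")


def secondary_up(hostname: str) -> List[str]:
    return ["203.0.113.10"]  # documentation-range address


if __name__ == "__main__":
    print(resolve_with_fallback("app.example.com", [primary_down, secondary_up]))
```

This only shows the mechanics. It does not settle x3n0ph3n3's question: if the services behind the name are down along with the primary DNS provider, a successful lookup through the second provider does not buy much.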

PerfectElement 5 years ago

I'm seeing some errors in my applications, but most requests are still working somehow.

  • NicoJuicy 5 years ago

    Perhaps the errors are only partially logged, since it's a DNS issue (and potentially a DDoS).

TrealTwan 5 years ago

Can't get to the status page even. Seeing issues with Microsoft Teams also.

ficklepickle 5 years ago

DNS responses for our app servers in Azure were failing and now they are taking ~4000ms. They have a really short TTL too, which really exacerbates the issue.
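A back-of-the-envelope Python sketch of why a short TTL hurts here. The ~4000 ms lookup time comes from the comment above; the TTL and outage values in the example are made up, and the function names are purely illustrative. It models a single cache holding a single name.

```python
def best_case_cache_coverage(ttl_seconds: float, outage_seconds: float) -> float:
    """Fraction of an outage that one cached answer can cover, assuming the
    record was refreshed right before the outage began (the best case)."""
    if outage_seconds <= 0:
        return 1.0
    return min(ttl_seconds, outage_seconds) / outage_seconds


def refreshes_per_hour(ttl_seconds: float) -> float:
    """How often one cache has to go back to the authoritative servers."""
    return 3600.0 / ttl_seconds


def slow_lookup_seconds_per_hour(ttl_seconds: float, lookup_seconds: float = 4.0) -> float:
    """Time per hour spent waiting on slow lookups for a single cached name."""
    return refreshes_per_hour(ttl_seconds) * lookup_seconds


if __name__ == "__main__":
    for ttl in (60, 300, 3600):  # hypothetical TTLs, in seconds
        print(ttl, best_case_cache_coverage(ttl, 3600), slow_lookup_seconds_per_hour(ttl))
```

With a 60-second TTL, a cached answer covers at most the first minute of an hour-long outage, and once lookups come back at ~4 s each, that cache pays the 4 s penalty about 60 times an hour per name. A one-hour TTL would pay it once.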

PlanetLotus 5 years ago

I haven't been able to get to the status page. I noticed the problem because I currently can't connect to Service Bus or Storage.

Nelkins 5 years ago

Hm...I'm not able to access the status page, but some of my coworkers are (as of 5:35 PM EST).

lwansbrough 5 years ago

portal.azure.com and status.azure.com are still down in NA Pacific.

nickdothutton 5 years ago

They just can't seem to keep the lights on.

Narkov 5 years ago

It's always DNS!

balaziks 5 years ago

I guess the Protocol Police is doing its first major raid.