32. Working with Service Bus, Event Hubs, Event Grid, CloudEvents and Relay Service - with Clemens Vasters

Betatalks the podcast

Welcome to Betatalks the podcast, in which we talk with friends from the development community. We chat not only about technology, but also about what drives them, inspires them and makes them unique. Rick (Cloud Solution Architect at Microsoft) and Oscar (CTO at Virtual Vaults) invite developers, makers, Open Source maintainers and other amazing people from the .NET and Azure development community. Looking for more content? Have a look at our Betatalks videos.

June 13, 2022 • Rick & Oscar with Clemens Vasters • Episode 32

In this episode, we talk to Clemens Vasters. He works at Microsoft on the Azure product engineering team. Specifically, he is a Principal Architect for Service Bus, Event Hubs, Event Grid and Relay. We dive into these services, what they can do and how they differ. Clemens explains why and when to use events or messages, and what types of scenarios they fit into; from discrete events and event series to jobs, commands and processing. And we discuss working with CloudEvents in Event Grid. Furthermore, we talk about the origins of the term Red Dog, how Azure emerged, easter eggs, and the Relay Service, which even predates Azure, and the evolution of the service.

About this episode, and Clemens in particular: you can find @clemensv on Twitter or check out some of his blogs.

About Betatalks: have a look at our videos and join us on our Betatalks Discord channel

00:00 - Introduction
01:43 - Friend of the day
02:50 - The relay service
08:16 - When to use events or messages
14:44 - What is a workflow and how is it different from a queue
16:30 - The idea behind cloud events
25:44 - Using cloud events
33:12 - Totally random question
35:07 - The origins of Red Dog and how it came about
43:07 - What’s next in messaging and events
49:57 - Closing

Introduction - 00:00
Oscar
Hey there, welcome to Betatalks the podcast in which we talk to friends from the development community.

Rick
I'm Rick,

Oscar
And I am Oscar.

Rick
Oscar, how have you been? What have you been up to?

Oscar
I have been good. I've been up to multiple things, multiple projects I was playing around with. And we actually made a video on this: I played around with, you know it, I know it, the Relay service in Azure.

Rick
Yeah, because first we were looking at: oh, we have to punch holes in the firewall to make sure that the internet-connected service can talk to our on-premises service.

Oscar
Yeah. And then, you know, we've talked to some network guys, let's avoid that,

Rick
Who are not all too happy.

Oscar
No, and it was just a temporary setup. So we kind of made a good setup there. And now I really see some use in different scenarios there. Because we had a closed on prem. And we actually had some devices there behind the firewall. And now we open it up, and we can continue as if it was already placed in a new situation. So

Rick
Yeah, so especially in these types of migration scenarios, or hybrid scenarios, where there is some hardware locally that you actually need to address but the rest of your application is running in the cloud, Azure Relay is a pretty neat option.

Oscar
Yeah, it was good. And I remember that we used it in the past, of course, but I completely forgot about the service. And that's the thing now with Azure, right? There are so many services that you start forgetting they're even there until you really have a scenario for them again.

Rick
Yeah, true.

Oscar
So yeah, that was good.

Friend of the day - 01:43

Rick
Cool. So Oscar, who's our friend of today?

Oscar
Our friend of the day is Clemens Vasters.

Rick
Clemens Vasters works at Microsoft on the Microsoft Azure product engineering team. He is a principal architect for Service Bus, Event Hubs, Event Grid, and Relay, laying the groundwork for future development work and driving technical strategy. He is also a co-chair of the OASIS AMQP Technical Committee and a member of the MQTT TC. And next to that, he represents Microsoft in the CNCF CloudEvents Working Group. His personal passion is researching military aviation history, especially of the Cold War. And he has an extensive public photo archive from some of the most significant aviation museums in the world.

Oscar
Welcome, Clemens.

Rick
Welcome.

Clemens
Hello. I'm very pleased that you have me and it's an honor to be on this podcast.

Oscar
Oh, it's an honor to have you here. You're kind of a legend in messaging, right?

Clemens
Well, I'm just doing my job.

Rick
Apparently, you're doing a good one.

The relay service - 02:50

Oscar
Yeah, I think you've, you have your signature all over a lot of services I've been using in the past. And also, I think for the relay service we were just talking about, right?

Clemens
Yeah, the Relay is actually the oldest, by far the oldest service in the Azure service portfolio, because it existed before Azure was even named Azure, or before there was even a coherent cloud strategy at Microsoft. The first public preview, if you will, was under the name Live Labs in May 2006. Wow. And yes, the principle of the Relay hasn't really changed since then. I think we've meanwhile killed all the code from back in the day, but all of Service Bus kind of harks back to that original service. And Relay by now has changed protocols; the primary protocol is now purely WebSocket based. But in principle, it's still the same thing and helps with the same problem.

Oscar
Yeah, I remember, Relay was a topic when WCF was really big. And one of the cool demos was always having WCF talking to WCF in the same environment, and it actually detected that you were both behind the same firewall and it just upgraded the connection. It was always a really magical demo to do.

Clemens
So that part actually had an even cooler aspect to it. And that's actually still in there if you use the Windows Azure clients. We have three or four .NET clients; over time they change. The oldest one that is still out there is the one that still supports WCF and has the Relay client in it. On the NetTcpRelayBinding, there's a switch called hybrid. And when you turn that on, and this is something you can use today, we're using a special mechanism where you can have two endpoints that are sitting behind firewalls: you can have a listener that sits behind a NAT and firewall, and the client also sits behind a NAT and firewall. And we're making a prediction about the behavior of the NAT device as the connection is being established. So the connection is being established through the Relay. And then they're effectively talking to each other through the Relay and making a guess about which ports on the NATs will open next, because the way NATs allocate their outbound and inbound ports is basically that they go in the 32,000 range, one by one by one. So the next connection gets the next port. And so we're making a guess about that. And then we're effectively snapping a socket through those firewalls and through those NATs, making effectively two outbound connections which meet each other in the middle. And that makes it possible to snap a socket that is direct between those two parties, which both sit behind NATs and firewalls. Through a guessing game and negotiation that the Relay helps with, and it's pretty magic, you end up with a direct socket.
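
The port-guessing idea Clemens describes can be illustrated with a small sketch. It assumes a NAT that hands out external ports strictly one by one; the function name `predict_next_ports` and the port range bounds are hypothetical choices for illustration, not the Relay's actual implementation.

```python
def predict_next_ports(observed_ports, count=3, low=32768, high=65535):
    """Guess the next external ports of a NAT that allocates sequentially.

    Assumption: the NAT increments by one per new connection and wraps
    around inside [low, high]. Real NAT devices may behave differently.
    """
    last = observed_ports[-1]
    span = high - low + 1
    return [low + (last - low + step) % span for step in range(1, count + 1)]


# Each side would report the ports it observed through the relay, and the
# peer would then try connecting to the predicted candidates.
candidates = predict_next_ports([40000, 40001])
```

Both parties would exchange such candidate lists through the relayed connection and then attempt outbound connections that meet in the middle.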

Rick
And I think this is the cool thing: if we start this type of discussion, we immediately go into this level of tech detail. This is awesome. It's cool.

Clemens
So it's just because he started on it, and I thought I'd just give you a little technical nugget here to start the discussion. It's some amazing tech, and the Relay has helped many customers with what you just opened the story with. And that is: I have some service that sits behind a firewall, and I need to reach that from the cloud, or I need to reach out from clients, and I need to do a quick tactical thing to just get to it. And we have many customers who have started with that, who say: oh, I just need to have a little POC and a trick to avoid talking to my networking people and creating these routes and doing all that work with VPN and whatever. And these tactical POCs tend to be super sticky, and end up being the thing that customers use.

Oscar
Yeah. And in the end, like the proof of concept, also what we did, like, oh, yeah, this works and like, it kind of works already. It's so low code, and it's done. And it's actually more secure than anything you would set up by punching holes in the firewall. And so I can imagine it's sticky.

When to use events or messages - 08:16
Rick
That's how all of these temporary solutions end up in production permanently, as they tend to. Let's take a few steps back, Clemens, because you work on Service Bus, but also Event Hubs and Event Grid. Now, I know that there are different types of events, and that events are for more specific needs than messages are. Could you, on a somewhat higher level, try and explain why you would use either events or messages, or in which types of scenarios each would fit?

Clemens
Yeah. So we have, as you correctly observed, three services, and arguably we even have eight. But let's stick to the three that you just mentioned. The first distinction I would draw is between jobs and commands on one hand, and events on the other. There are different characteristics to those messages, even though they look the same. So there are semantic differences in an event versus a job or command that cause different architectural decisions. An event is a statement of fact about something that just happened. And that statement of fact is something that you can then go and react to if it's what I call a discrete event, which means it is like a fire alarm. Right, a fire alarm is pretty binary.

Rick
Yeah. So that's a status change, right?

Clemens
Yes, that is a pretty harsh state change from tranquility into panic.

Oscar
First question

Clemens
And so if a smoke detector detects smoke, you will go and react. So that's a discrete event that's immediately actionable, and it's a statement about something that happened. Now, you may react to it once, or you may react to it twice. And there may be two or three or four parties which go and react to it at the same time. And they're not being told to do anything; the smoke sensor just says, oh my God, there is smoke here. But it has no notion of what's going to happen now. It's just doing its job. So it's raising that event. And now four or five parties spring into action: one calls the fire brigade, another one goes to start the sprinklers, and so on. So there are multiple parties which are going to react to that signal.
Then there are other kinds of events, which are not immediately actionable, or it would be silly if you were actioning them immediately. Let's say that same device has a temperature sensor, and the temperature sensor measures the temperature the whole time, but every five seconds or every 10 seconds it sends a message that says: I'm observing the following temperature. Now, temperature in a room fluctuates. People walk through it, the wind blows through it, and it fluctuates by, you know, two or three degrees from one reading to the next. And you probably don't want to go and react to every single reading. What you want to do is create an average temperature reading over the last five minutes or the last 10 minutes. And if that hits a threshold, then you have effectively a derivative function that says: we crossed the threshold, that is a state change. And now you go and effectively start reacting to this.
But to form this average, you need to go and observe that event stream, which means there is a number of events coming from one source that you want to keep together, and that you then want to observe and calculate over together. So that's the second kind of event. So there are discrete events, and then there are events that are coming in series.
Jobs and commands are different. A job is: I have the intent that I want something done. And then I'm going to give that to a party that I don't really have to know; it might be behind the magic curtain. But I know that I need to get this job done once. I need to have this product order shipped, but I don't want to have this product order shipped twice. I want to have this product order shipped once. So that's the job that you're placing into an infrastructure.
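
The temperature example, where a series of readings is averaged over a window and only a threshold crossing becomes an actionable event, could be sketched like this. The class name `ThresholdDetector`, the event type string and the default window and threshold values are hypothetical, chosen only for illustration.

```python
from collections import deque


class ThresholdDetector:
    """Turn an event series into a discrete event on a threshold crossing."""

    def __init__(self, window_seconds=300.0, threshold=30.0):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._readings = deque()  # (timestamp, temperature) pairs, oldest first
        self._above = False

    def observe(self, timestamp, temperature):
        """Record one reading; return an event dict only when the average
        crosses the threshold in either direction, otherwise None."""
        self._readings.append((timestamp, temperature))
        # Drop readings that have fallen out of the averaging window.
        while self._readings[0][0] <= timestamp - self.window_seconds:
            self._readings.popleft()
        average = sum(t for _, t in self._readings) / len(self._readings)
        above = average >= self.threshold
        crossed = above != self._above
        self._above = above
        if crossed:
            return {"type": "threshold.crossed", "above": above, "average": average}
        return None


detector = ThresholdDetector(window_seconds=10, threshold=25)
results = [detector.observe(t, temp) for t, temp in [(0, 20), (5, 22), (10, 40), (15, 40)]]
```

Individual readings produce no reaction; only the state change derived from the window does, which is the difference between the series and the discrete event.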

Rick
Yeah. So in that case, you're actually expecting something to happen with what you're sending, right?

Clemens
Correct. The sender's intent is actually different. When you're sending events, either as a series or as discrete events, you have no expectation that anybody's going to act on them. You inform everybody that, hey, this thing happened, and then they can choose whether they want to act on it. And that is the whole notion of event-driven architecture: everybody reports out the state changes that they make, and then they make it the observer's or the subscriber's decision whether they want to act on it. The nice thing about that is that you can now use it as a very flexible extensibility mechanism. You can raise the fact that an object inside of your business application has changed. And now someone else on the outside can attach to the change of that particular object and create functionality based on that change, functionality that wasn't really intended in the original design of that application. But it can very easily be extended because you're just reporting out those state changes.
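
The difference in sender intent can be shown with a toy in-process sketch: an event is fanned out to whoever subscribed, while a job is handed to exactly one receiver. `EventBus` and `JobQueue` are hypothetical names for illustration; real brokers add durability, locking and retries that are left out here.

```python
from collections import deque


class EventBus:
    """Publish/subscribe: every subscriber sees every event."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_type, handler):
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, data):
        # The publisher does not know or care who reacts, if anyone.
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(data)
        return len(handlers)


class JobQueue:
    """Queue: each job is taken by exactly one receiver."""

    def __init__(self):
        self._jobs = deque()

    def send(self, job):
        self._jobs.append(job)

    def receive(self):
        return self._jobs.popleft() if self._jobs else None


bus = EventBus()
calls = []
bus.subscribe("smoke.detected", lambda e: calls.append(("fire brigade", e)))
bus.subscribe("smoke.detected", lambda e: calls.append(("sprinklers", e)))
bus.publish("smoke.detected", {"room": "kitchen"})

queue = JobQueue()
queue.send({"ship_order": 42})
first = queue.receive()   # one worker gets the job
second = queue.receive()  # nobody else gets it again
```

Adding a third subscriber later requires no change to the publisher, which is the extensibility point made above.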

Oscar
I think an architecture like that will actually isolate the parts of your application. It's a model you're pushing your architecture into by saying: this is what happened, let it go. And it is extensible for things you didn't even plan yet. And that's also the good thing, because you can move services, upgrade them, change them over time without having some dependency there. It's really creating loose coupling for your components. And I did see some implementations where the more event-style setups, like pub/sub or even Event Grid, were used as an internal start of a workflow. And the thing is, I'm always really precise on naming those things. If it's event style, I'll push the team to say: all right, let's make the name of the thing also something that really feels like an event. Otherwise, you get them hammering a workflow that has multiple steps into something that should only say: well, I just did this. Because otherwise they should use a queue or anything else.

What is a workflow and how is it different from a queue - 14:44
Clemens
Yeah, a workflow is an example of something that is effectively a sequence of jobs where you have a particular expectation of the flow. So workflows are often executed with queues in the middle. And that's because you have a particular execution plan in mind for the workflow. And as you are proceeding from one step to the next, you want to make sure that you're effectively handing over control of the workflow state to the next step. But you want to make sure that that handover of control happens exactly once, and that the next step happens, you know, exactly once, or as close to once as you can possibly make it.
But let me just bind the concepts that I just talked about quickly to the products. Discrete events have the characteristic that you can handle them singly without having to observe order, and you can also go and push them out to individual handlers. So discrete events is Event Grid in our world. With event series, you need to keep events together, and then you need to pull them towards the place where the information is being aggregated and analyzed. So that's a pull-style notion, and you need to have a notion of order of events relative to each other. And if you scale those things out, you need to keep related events together in one place so they don't get out of order. That one place, when you scale those things out, is a partition. So that is Event Hubs. And then everything that has to do with jobs and with commands and with process-once processing, where you need to have exclusive ownership of a job while you process it, et cetera, all of that is Service Bus. Service Bus is kind of the most versatile service, because it also has pub/sub capabilities, so you can use it for events. Event Hubs is for when you have torrents of event streams that you need to move very fast.
And then Event Grid is kind of the universal dispatch pub/sub engine for discrete events.
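
The point about keeping related events together in one partition so that they stay in order can be sketched with a simple key hash. This is a generic illustration with hypothetical helper names (`partition_for`, `append_event`); it is not the hashing scheme Event Hubs itself uses.

```python
import hashlib


def partition_for(source, partition_count=4):
    """Map an event source to a stable partition index."""
    digest = hashlib.sha256(source.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def append_event(partitions, source, reading):
    """Append a reading to the partition that owns its source."""
    index = partition_for(source, len(partitions))
    partitions[index].append((source, reading))
    return index


partitions = [[] for _ in range(4)]
indexes = [append_event(partitions, "sensor-a", r) for r in (20.5, 21.0, 21.4)]
```

Because the same source always maps to the same partition, a consumer reading that partition sees the readings of one sensor in the order they were appended.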

The idea behind cloud events - 16:30
Rick
I think that's a very clear depiction of the different types of services. So thanks for that one. And now diving into Event Grid a bit more: Event Grid is a relatively new addition to Azure, or at least in the public space. And pretty fast after its introduction, it also started supporting CloudEvents, the open standard. What was the general idea behind that? I mean, I think I understand, but could you explain a bit about how that went along?

Clemens
CloudEvents is an effort that we have joined, that was initiated by Google and a company called Serverless, Inc. They asked us, and then eventually IBM, to join the effort. AWS was there for some time, but now there's a fairly stable group of participants that also includes VMware, PayPal and some others. I should also name Solace, and I should name Synadia. And the observation we all had was that the concept of an event wasn't standardized in a way that you could point at something and say, this is an event. And even if you could, the way an event is expressed in a message was wildly different. Someone came into one of the early discussions with dispatcher code: an AWS Lambda which was trying to effectively catch and dispatch the various types of AWS events that existed in that platform. And it was a terrible piece of spaghetti code, because every service in AWS had a very different opinion of what an event might be, had different fields for the type of the event, different structures for the type of the event, and different ways to declare where an event came from, what it belongs to, and all of that. So complete chaos if you want to bring that together in a single application. That's kind of where we started. And the Google people could bring a similar example, and we could bring similar examples. And it was pretty clear that the simplest thing you could do was, first, to define what an event is, so write down some text on that. And second, find out what the common attributes of events are, so that you can handle an event as a type. And we would accept that the event per se, the event data, would be something that we could probably not standardize, because it's going to be custom for every single application.
And it's going to be custom in terms of encodings, et cetera. But we should be able to describe an event with a type, with a source, with a subject (what that event is about), with a timestamp, with an identifier, and then a declaration of what's inside the data of the event: what's the content type of it, and what might the schema be for that data. Which, by now, was an exhaustive explanation of what the standard does. And that's where we arrived: we want to have a common notion of an event. So what we did on the Event Grid side is take that as the event format for Event Grid going forward. There's still a bit of a mix today: we still have a proprietary format, and we have the CloudEvents format, where you can always subscribe in the CloudEvents format. And CloudEvents has also taken hold in Knative, and it's being used in the Azure Web PubSub service as another place in Azure. It has now kind of moved into the core of the Azure SDKs. And there are several external parties, competitors like Solace and Synadia, et cetera, which have been cooperating with us, and they are offering CloudEvents support in their own products.
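
The common attributes listed above map onto the context attributes of the CloudEvents 1.0 specification (`specversion`, `id`, `source`, `type`, `subject`, `time`, `datacontenttype`). A minimal sketch of building such an envelope by hand follows; the helper name `make_cloud_event` and the example type and source values are hypothetical, and the payload is assumed to be JSON.

```python
import json
import uuid
from datetime import datetime, timezone


def make_cloud_event(event_type, source, data, subject=None):
    """Wrap application data in a CloudEvents 1.0 style envelope."""
    event = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),           # identifier, unique per source
        "source": source,                   # where the event came from
        "type": event_type,                 # what kind of event this is
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,                       # application-specific payload
    }
    if subject is not None:
        event["subject"] = subject          # what the event is about
    return event


event = make_cloud_event(
    "com.example.smoke.detected",           # hypothetical event type
    "/buildings/7/detectors/12",            # hypothetical source
    {"room": "kitchen"},
    subject="kitchen",
)
encoded = json.dumps(event)
```

Only the envelope is standardized; the shape of `data` stays up to each application, as described above.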

Rick
I do think that that's a good addition and a good way of actually making something an open standard that multiple cloud platforms can adhere to, because it enables a lot of strategies that companies have where they might have multi cloud solutions. So I think it's a very good thing that stuff like this is happening.

Clemens
Yeah, I think so too. And I think CloudEvents specifically is also something completely independent of the infrastructures. The way we designed CloudEvents is that it can be used with any existing infrastructure. We didn't invent a new protocol, we didn't invent a new transport, very intentionally. We created it such that it can be used with existing infrastructure and with existing SDKs. So if you're currently a customer of any messaging platform, you can start using CloudEvents. The great value in CloudEvents, first and foremost, is that if you're working in a large enterprise, or if you are an ISV who's building applications for companies, using CloudEvents as your standard event format allows unification of the strategy for how you deal with events inside of your applications. And that's great: all the modules that you have in your app, all the microservices that you have in your app, spew out events in exactly the same format. That's the power of standardization. It's not only a relationship to the infrastructure, it's actually also standardization of an eventing format within your apps, because then the various teams that are collaborating to build that app don't need to feud over what an event looks like or what an event ought to be, because you're just pointing all the teams to the spec: that's how you do events. And that's the power of standardization, also inside of an organization, completely irrespective of whether there's infrastructure that natively supports it.

Using cloud events - 25:44
Oscar
If you're looking at it, Event Grid is an implementation in Azure. And indeed, with CloudEvents, as Rick said, you could do multi-cloud or even move over. But I can imagine the behavior of the actual service, in this case Event Grid, can differ a bit from other implementations, because there's retrying and all kinds of things around that. How compatible is that? Because I don't think that's captured in the format, right?

Clemens
Yeah, so CloudEvents is very intentionally different from, for instance, SOAP. And you can believe me that in the CloudEvents working group, especially because there are some people who are also my age, there are lots of SOAP jokes. So it's effectively an envelope format, right? But it's an envelope format that has more opinion, and at the same time is probably a little bit more humble than SOAP was. With SOAP, you have a very strong set of prescriptions about how transport ought to work. There's a very clear prescription of how you need to use SOAP with HTTP, and there's a very crude prescription with other bindings. And we intentionally didn't do this with CloudEvents. So from a compatibility perspective, if you use the CloudEvents JSON format, or the CloudEvents Avro format, if you use any of those formats, what you end up with is an event that is effectively the payload of a message. And then you just use that payload of the message with any messaging infrastructure that you have today. And if you want to go and post the event using HTTP, then we have a prescription of how we suggest that you bind that event to an HTTP frame. But we are not prescriptive about whether you need to POST it in a certain way, or whether you can PUT it, or whether you can return it as the result of a DELETE operation. HTTP is pretty compatible across implementations. And Event Grid, to get to your question, accepts messages by you giving a token in the authorization header; that's the minimal thing we need to be able to do. And then otherwise, it's just a plain POST, and you just provide the message in the payload. Event Grid is fairly simple in that way.
So the way you do CloudEvents is you provide the CloudEvents media type in Content-Type, and then you just post the JSON CloudEvent in the body, and you're there.
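
The "plain POST" described here could look roughly like the following sketch, which only builds the request and does not send it. The media type `application/cloudevents+json` comes from the CloudEvents JSON format; the function name `build_cloud_event_post`, the endpoint URL and the token value are placeholders, and the exact authorization scheme a given service expects is an assumption to check against its documentation.

```python
import json
import urllib.request


def build_cloud_event_post(endpoint_url, token, event):
    """Build (but do not send) an HTTP POST carrying one JSON CloudEvent."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            # Structured mode: the whole event, envelope included, is the body.
            "Content-Type": "application/cloudevents+json; charset=utf-8",
            # Placeholder: the real header value depends on the service.
            "Authorization": token,
        },
    )


request = build_cloud_event_post(
    "https://example.invalid/api/events",   # hypothetical endpoint
    "PLACEHOLDER-TOKEN",                    # hypothetical credential
    {
        "specversion": "1.0",
        "id": "1234",
        "source": "/sensors/12",
        "type": "com.example.smoke.detected",
        "data": {"room": "kitchen"},
    },
)
# Sending would be: urllib.request.urlopen(request)
```

Nothing about retries or delivery guarantees appears in the request; that behavior belongs to the receiving service, not to the event format.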

Rick
So in the end, it's the specification of how the data comes in, but not how it's handled by the respective messaging or eventing architecture solution.

Clemens
Correct. What we have taken care of with our transport bindings for CloudEvents, and that is something that is more for the platform builders than for users, is that if you are sending a CloudEvent via, let's say, an AMQP broker, you start with a CloudEvent as an in-memory representation. And we have these SDKs that we built in the CloudEvents project as kind of reference implementations for how a CloudEvent looks in memory. And then you take that CloudEvent and you run it through AMQP, and it comes out on the other side of the AMQP broker in an AMQP message. And we have prescribed how the CloudEvent needs to be mapped onto the AMQP message and how the CloudEvent needs to be taken off the AMQP message in a way that the in-memory model stays the same. And we've done the same for HTTP, we've done the same for MQTT, we've done the same for Kafka, such that you can take a CloudEvent and stick it into any of these transports, and it's going to come out on the other side unharmed. Because what we anticipate happening in the future more and more, and we can already see this in many places, is that you will raise an event from a device that sits in a house behind a home hub. And from that home hub, it gets routed into an IoT gateway. And from that IoT gateway, it goes into a stream engine, and from that stream engine, it goes to a data lake, or it goes into a stream analytics engine and gets evaluated there. So there are kind of three or four hops that that same event takes through those various stages. What you often can't do is control which protocols are being used on those various hops. There might be a mix of protocols, and there's a good reason for those protocols to be different.
And they often depend on choices that are not made by a single person, or even a single team, but are the result of collaborations, or sometimes even conflicting decisions made by managers who have no insight into technology but like the sales pitch of a particular vendor. Whatever; we end up, however, in the reality that there's an event that needs to be routed through three hops to a particular party. So what we've tried to do with CloudEvents is to have a clear definition of how a CloudEvent is projected into each transport, so you can put it in and get it out and it stays the same. Then you can take that event and put it onto the next transport, and again the same happens, so that the data gets through, including all the metadata that's required, and at each stop you can act on the metadata that's on that event to get it to the right place. That was the goal behind all of it. And to be frank, last word: standards take a while to settle into the community and to take hold in applications. With CloudEvents, we're now in year three. I have a very long view on these things, and I think CloudEvents is on a good path. We're probably going to see broad adoption; it's going to take another two, three years. With AMQP, we've also seen this take a long time: that was kind of finalized in 2012, and really only in the last three, four years has it become as important as it is today.
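
The idea that an event goes into a transport and comes out unchanged can be sketched with the HTTP binary content mode, where context attributes travel as `ce-` prefixed headers and `datacontenttype` travels as `Content-Type`. This is a simplified illustration with hypothetical function names; it assumes string-valued attributes and skips the header value encoding rules of the actual binding.

```python
def to_http_binary(event):
    """Project an in-memory event onto HTTP headers plus a body."""
    headers = {}
    for name, value in event.items():
        if name == "data":
            continue
        if name == "datacontenttype":
            headers["content-type"] = value
        else:
            headers["ce-" + name] = value
    return headers, event.get("data")


def from_http_binary(headers, body):
    """Rebuild the in-memory event from HTTP headers plus a body."""
    event = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered == "content-type":
            event["datacontenttype"] = value
        elif lowered.startswith("ce-"):
            event[lowered[3:]] = value
    if body is not None:
        event["data"] = body
    return event


original = {
    "specversion": "1.0",
    "id": "1234",
    "source": "/home-hub/7",
    "type": "com.example.temperature.reading",
    "datacontenttype": "application/json",
    "data": b'{"celsius": 21.5}',
}
headers, body = to_http_binary(original)
restored = from_http_binary(headers, body)
```

A hop that receives the event over one transport can rebuild the same in-memory form and project it onto the next transport, with the routing metadata still readable at each stop.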

Rick
Yeah. And then with the increased use of cloud and the increased number of applications being run there. Probably. This will go a bit faster than AMQP did.

Clemens
Yeah. And I hope so. Yes. Yeah.

Totally random question - 33:12

Rick
Oscar, do you know what time it is?

Oscar
Is it time for a totally random question? It is time for a totally random question. Clemens, what is your favorite photograph of all time?

Clemens
Oh my, my favorite photograph of all time? That's a really random question.

Rick
It doesn't necessarily have to be one of your own, but it can be.

Oscar
Yeah, well, I had a few random questions here on the list and like I saw you are into photography. So

Clemens
I am that makes answering the question really hard.

Rick
You could do a top three.

Oscar
A top 50.

Rick
That would be a separate episode. Probably.

Clemens
Wow. That really hits me out of left field, as I would say with a US idiom. I take a lot of photos. I'm going to be evasive with answering the question. I am taking photos in aviation museums, I go to air shows, but I also drive around and just take photos in cities and older towns in Germany or older towns in the Netherlands or in Belgium, kind of everything that is in reach. And I think my favorite photo right now, at this moment, is a photo that I'm just looking at, and that is of a Ukrainian Su-27 Flanker, because I wish those guys in the Ukrainian Air Force a lot of luck. How about that as an answer?

Rick
I think that's a great answer.

Oscar
I think it counts. Yeah, definitely.

The origins of Red Dog and how it came about - 35:07
Rick
There's just one question or at least, okay, so I need to go about this differently. There's this rumor slash story I heard. And the fact that you talked about Azure before it was named Azure actually triggered me about that story.

Oscar
We're not gonna talk about Red Dog, right?

Rick
Well, I did hear the story about the pink poodle, and Red Dog. And I really need to know if it's true, because I've been hearing this story for more than 10 years now. Or is that something you can not comment on?

Clemens
The origin of the term Red Dog? I don't know where Red Dog came from. And that is because we had nothing to do with these people. When I say these people: Azure is a merger of three tribes. The first tribe was everybody who fled from Steve Sinofsky as he was taking over Windows after the disaster of Vista. Which is pretty mean, because Vista was a pretty solid operating system; it was just too slow. But it was thought that Sinofsky would go and clean it up. And he instantly turned Windows into a dictatorship, firing lots of people from middle management, and those people had to go somewhere. And then Ray Ozzie was like, ah, you know, we have a thing here, you could all come here. And also Dave Cutler was very unhappy about how things went in Windows, and then also kind of needed to have a new place. And so they basically all went to building 109 and said: what we're going to do here is secret. You go about your work, but we will not talk to you. So the start of Azure was mostly this: Vista had shipped, and then they were gearing up for Windows 7, and there was shuffling in the executive sphere. And then lots of people said, well, I don't want to have anything to do with it, I need to go do something else. And then everybody kind of went to building 109, and no one knew what they were doing. At the same time, there was kind of the .NET effort. The .NET Framework had shipped WCF and had shipped Avalon, WPF, exactly. So here's me with the old code names.

Rick
And then the thing is, we recognize.

Clemens
Yeah, and the Workflow Foundation.

Oscar
That was 3.5, right?

Clemens
That was three, that was 3.0. That shipped with Vista, and then became 3.5, which had some additions to it. So all that stuff had shipped. And so as the shipping train started winding down for those things, there was an effort being started, .NET Services, that I was part of, where we built kind of an online version of Workflow, where we built the Relay, where we built the Access Control Service. So a number of effectively services for developers, which were still all motivated by trying to get developers who are building on-premises applications to have a neutral rendezvous place in the cloud, more than actually hosting code in the cloud. And then the third one was that the SQL people had ambitions for building effectively SQL databases in a Microsoft-hosted data center, before I say cloud. So these were three efforts, which were kind of started separately and very independent of each other. And then Bob Muglia, who was the boss of the Server and Tools Business at the time, decided to pull them all together and make them into a single platform. And so when we had the first PDC, 2008, just before this, we were all told: hey, these other two efforts exist, and you're all now forming together a single platform. And that's kind of how Azure came into view. And in these meetings before PDC 2008, that was the first time that I heard about the building 109 effort and Cloud Services existing.

Rick
Okay. And then somebody of that division came up with the term Red Dog probably.

Clemens
Yeah, they had come up with the term Red Dog, apparently very early. But I don't know exactly what the history of that is. They were very proud of themselves, and some of those guys were wearing red shoes, but the exact origin of Red Dog is something that I can't tell you. I can tell you why there was a Burgermaster segment in Windows 3.1, though. But Red Dog, I don't know.

Rick
Well, that might be an interesting, an interesting closing note.

Clemens
When Microsoft was in Bellevue, they were up by the highway that goes from Seattle to Redmond. And basically on the other side of the highway from where Microsoft was, there was a burger shop that still exists, called Burgermaster. And apparently, Burgermaster's delivery service was on speed dial for Microsoft. So that's where everybody was getting their food from. And so in honor of that burger shop, they named one of the data segments in Windows after it; they called that Burgermaster.

Rick
Well, it's cool to see. I tend to look at myself, and maybe at Oscar also, as sometimes maybe a bit nerdy. No, it's so awesome to see that stuff like this also happens within Microsoft. Because it's cool.

Clemens
Yeah, these Easter egg things at some point became very illegal. But from old stories, they still exist. Because there's lots of stuff that has security repercussions on one hand, and then has, you know, cultural awareness implications. You know, you just put something in for fun, and then inevitably you're going to run into someone who finds that justifiably problematic. And so most of those things no longer exist. So you can't do them.

Rick
I think I remember a pinball machine being in Excel.

Oscar
But I think we all did it. And we also, yeah, had some consequences at some point. We should grow up.

Clemens
That's correct. And that's why code names are now mostly cities, or countries, or something like this. Most of them refer to existing things, because those are hard to make controversial.

What’s next in messaging and events - 43:07

Oscar
So what's next in messaging and events for you? What's on your radar now?

Clemens
In the CNCF work, we're currently working on a message catalog and discovery system. When I say system, I mean a set of specs. We see a need, in work that we as Microsoft are doing with some of our customers and where we're also engaged in CloudEvents, to have metadata associated with messaging entities like an Event Grid topic, or a Service Bus queue, or an Event Hubs event hub, so that we can go and effectively attach information to those that says: these are the events which flow on this entity, both for restricting the ingress and for effectively informing subscribers of what to expect. And then also, when you have an event in hand, so you are discovering an event from some API catalog, that you can then find out which of the endpoints that exist in the platform or an application raises that event: where can you get that event from? That requires that you have, first, a schema registry for the payloads, so you can go and handle Protobuf and Avro and all the schematized formats. It requires that you have an event catalog or message catalog, which is all the definitions of the event types and what the metadata is that goes with them and which schemas they use. And then you need to have a discovery system which basically binds those event definitions to endpoints, so you can go and find out what endpoints exist, what endpoints you can subscribe on to get events, and using what protocol. And so that's a set of things that we're currently working on in the open, in the CloudEvents working group. If you go to github.com/cloudevents/spec, there's a readme, and in that readme is also how you get at the archives of our discussions. And you can also go and join the working group if you're interested in that. So that's one of the high-order bits in terms of what I'm currently working on.
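To make the three pieces described here a little more concrete (a schema registry, an event catalog, and a discovery layer that binds event definitions to endpoints), here is a minimal in-memory sketch in Python. It is only an illustration and not the working group's specification: the class names, method names, and example identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class EventDefinition:
    """Catalog entry: an event type plus the schema its payload uses."""
    event_type: str
    schema_id: str


@dataclass
class Endpoint:
    """A messaging entity (queue, topic, event hub) and the events flowing on it."""
    name: str
    protocol: str
    event_types: set[str] = field(default_factory=set)


class Discovery:
    """Toy registry + catalog + discovery. All names here are hypothetical."""

    def __init__(self) -> None:
        self.schemas: dict[str, dict] = {}              # schema registry
        self.catalog: dict[str, EventDefinition] = {}   # event catalog
        self.endpoints: dict[str, Endpoint] = {}        # discoverable endpoints

    def register_schema(self, schema_id: str, schema: dict) -> None:
        self.schemas[schema_id] = schema

    def define_event(self, definition: EventDefinition) -> None:
        # An event definition must point at a schema that is already registered.
        if definition.schema_id not in self.schemas:
            raise KeyError(f"unknown schema: {definition.schema_id}")
        self.catalog[definition.event_type] = definition

    def bind(self, endpoint: Endpoint) -> None:
        # An endpoint may only declare event types that exist in the catalog.
        unknown = endpoint.event_types - set(self.catalog)
        if unknown:
            raise KeyError(f"undefined event types: {sorted(unknown)}")
        self.endpoints[endpoint.name] = endpoint

    def allows(self, endpoint_name: str, event_type: str) -> bool:
        """Ingress check: is this event type declared for this endpoint?"""
        return event_type in self.endpoints[endpoint_name].event_types

    def endpoints_for(self, event_type: str) -> list[tuple[str, str]]:
        """Subscriber lookup: where can this event be received, and over what protocol?"""
        return sorted(
            (e.name, e.protocol)
            for e in self.endpoints.values()
            if event_type in e.event_types
        )
```

The same metadata serves both purposes mentioned in the conversation: `allows` restricts what may be sent to an entity, and `endpoints_for` tells a subscriber where an event type can be obtained.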
And then we're working on some very significant improvements of the services that we have. One thing that we just did recently is we went GA with Event Hubs Premium, which now provides under 10 milliseconds end-to-end latency, very consistently, which is fast. Yes, and it does so while replicating the data on disk across availability zones, which means when you give us an event, it gets copied three times, across availability zones between two different data centers, we get the acks for the replication, and only then will we forward the message. And all of that happens in seven to 10 milliseconds.
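The ordering described here (write every copy, collect the acknowledgements, and only then forward the message) can be sketched as a toy model in Python. This is an assumption-laden illustration, not how Event Hubs is implemented: `Replica`, `Broker`, and the `healthy` flag are hypothetical names, and appending to a list stands in for a flush to disk.

```python
from __future__ import annotations


class Replica:
    """Stand-in for one copy of the log in one availability zone."""

    def __init__(self, zone: str, healthy: bool = True) -> None:
        self.zone = zone
        self.healthy = healthy
        self.log: list[str] = []

    def append(self, event: str) -> bool:
        """Persist the event and return True as the ack, or False if the write failed."""
        if not self.healthy:
            return False
        self.log.append(event)  # stands in for a flush to disk
        return True


class Broker:
    """Forwards an event to consumers only after every replica has acknowledged it."""

    def __init__(self, replicas: list[Replica]) -> None:
        self.replicas = replicas
        self.forwarded: list[str] = []

    def publish(self, event: str) -> bool:
        acks = [replica.append(event) for replica in self.replicas]
        if not all(acks):
            # Not durable on every copy, so the event is not forwarded.
            # (A real system would also retry or repair the partial write.)
            return False
        self.forwarded.append(event)
        return True
```

The point of the sketch is only the ordering: durability on all copies comes first, delivery second, which is why the latency figure quoted above includes the replication round trip.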

Rick
That's impressive,

Clemens
Which is pretty, yeah, it's pretty spectacular. And so that engine, Event Hubs Premium, is the first incarnation, the first implementation, kind of, where we use that new engine, and that new engine will go under all of our services over the next, you know, 12 to 18 months, which means that sort of perf gain is something you will also see in Service Bus, and you will also see in, you know, the other services. So things are going to get faster, significantly, over time. And we believe that we still have about 20% potential in that entire game. So, I'm not sure we can go and give an assurance about this, but I have some hope that we can go and drive Event Hubs latency, even with this availability zone distribution and with a flush to disk on every write, to under seven milliseconds.

Oscar
Oh, that sounds amazing. I'm really looking forward because I use these services a lot. Definitely keep an eye on that.

Clemens
And then lastly, our never-ending quest for reliability continues. We are in excess of six nines in terms of reliability for Event Hubs. And we'll keep doing work to kind of make that better. And that means, you know, with six nines, that there's a risk of you encountering an error, when you send messages, once every million messages. And so we're still trying to drive that further down.
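The "six nines" arithmetic can be checked with a few lines of Python. This is just a worked example of the figure quoted above, assuming "N nines" means a success rate of 1 - 10^-N per send; the function names are made up for the illustration.

```python
def one_error_in(nines: int) -> int:
    """With N nines of reliability, roughly one send in 10**N fails."""
    return 10 ** nines


def expected_errors(messages: int, nines: int) -> float:
    """Expected number of failed sends among `messages` sends at N nines."""
    return messages / one_error_in(nines)
```

So `one_error_in(6)` is 1,000,000, matching the "one error every million messages" in the conversation, and a sender pushing a billion messages would expect on the order of a thousand errors to retry.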

Oscar
In the end, that's the thing. I use a lot of cloud, and you can build all your systems idempotent and recoverable. But the messaging you shouldn't lose; that's the thing you depend on.
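As a small illustration of what an idempotent consumer means in practice, here is a Python sketch that deduplicates by message ID, so that a redelivered message does not change the result. The class and field names are hypothetical, and a real system would keep the set of processed IDs in durable storage rather than in memory.

```python
class IdempotentCounter:
    """Applies each message at most once, keyed by its message ID."""

    def __init__(self) -> None:
        self.processed_ids: set[str] = set()
        self.total = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.processed_ids:
            return False  # redelivery after a retry: safe to ignore
        self.processed_ids.add(message_id)
        self.total += amount
        return True
```

A sender can then retry on error without the receiver double counting, which is what makes the occasional failed send recoverable.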

Clemens
Yes, and that's why we put an enormous effort into reliability, into making sure that the error rates are low, because those are very complex systems that consist of hundreds of machines that are all collaborating to kind of give you the illusion of having a very simple broker. And of course, things move, there are things that go wrong, etc. And having a very low error rate for you as a customer, in your experience, even though there are things that are happening in the background, is enormously important.

Rick
It's an essential part of the service.

Clemens
And it's invisible work that you don't see as new features. But that keeps going on and takes a significant part of our time, to improve reliability. Even though we are world-leading, we are still not satisfied with where we are.

Rick
I think the moment you get satisfied is the moment you will be overtaken by somebody else, right?

Clemens
That's correct. And that's why we keep pushing hard, even though, for instance, if you compare to Event Hubs, there is nothing right now in the cloud or on premises that has a reliability rate that's even close.

Closing - 49:57

Rick
Wow, this is so cool. Is there anything that you would like to get back to, or that you would like to add, or maybe a final thought that you would like to share?

Clemens
Yes, and that's based on some recent customer discussions that I've been having. First, I just mentioned CloudEvents. At the customer level, so let's say application solution builders, participation in the standardization efforts is very useful. Because as long as only the plumbers sit there and discuss, we're mostly focusing on the requirements of the plumbers. So if you're a solution builder, and if you're interested in using something like CloudEvents for your own applications, just to standardize your own things, it is useful to also consider whether you want to go and pitch in and voice your opinion in these kinds of standardization efforts. So that's one. And then second, cloud services are not set in stone in terms of capabilities and features. And so feature requests and ideas are always welcome. If there is a request that only gets a single vote, then that's kind of hard to justify the investment for, and there are some investments which are being asked for with loud voices that we have a hard time funding also. But feature requests are something that drives things forward, as are interesting scenarios. So don't accept everything that is there as "this is how good it is going to be"; there's always potential for evolution. So engage with us through the channels that we have on the websites, or you can also go and engage with me directly, for instance, or our support. I'm on Twitter as @ClemensV. And if you have ideas, things that we should do, things that we should have done, and "God, why are you not doing that, you're idiots", then we're open to get feedback. It's a two-way road. Awesome. Thank you so much for being our guest today. It was a great pleasure. Thank you.

Rick
Thank you for listening to Betatalks the podcast. We publish a new episode every two weeks.

Oscar
You can find us on all the major streaming platforms like Spotify and iTunes.

Rick
See you next time.

Oscar
Bye.