On June 10 and 11, 2024, I attended the “Integrate 2024” event organized by Kovai. During the event, Microsoft announced a new feature: Geo-replication for Azure Service Bus.
The public preview was initially scheduled for June 17, 2024. It ultimately launched on June 25, 2024, as announced by Eldert Grootenboer, Senior Product Manager at Microsoft, in an article published on the Microsoft Tech Blog.
In this article, you will learn about the benefits of this new feature.
First, note that this feature is only available on the Premium tier of Service Bus. Additionally, geo-replication currently supports only one secondary region—we will return to this concept later.
Microsoft also specifies that this is a public preview, meaning it should not be used in production environments. Moreover, it is only available in selected regions for now. Availability will expand gradually, and you can track the progress in the official Microsoft documentation.
Before geo-replication, Azure offered Geo-disaster recovery for Service Bus. This feature ensures the integrity of metadata (entities, configurations, properties) by coupling a secondary namespace with the primary namespace currently in use.
In case of a failure or disaster, you fail over from the primary namespace to the secondary namespace, which already holds the replicated metadata. Metadata replication is continuous, and failover is almost instantaneous.
To understand this method in more detail, refer to Microsoft’s documentation.
As mentioned in the previous section, Geo-disaster recovery protects only metadata. To extend its offering, Microsoft introduced geo-replication, which also protects data.
With this feature, the following are safeguarded:
The core principles remain the same:
Here is a diagram illustrating how geo-replication works:
When everything is functioning normally, producers and consumers of Service Bus messages connect to the primary region via a single namespace. This design allows users to configure their workflow using this namespace without needing to change it during failover.
If a failure or disaster occurs, the secondary region is promoted. The namespace is switched to point to the secondary region, which then becomes the new primary region. The former primary region is demoted to secondary. Once this new secondary region has been re-initialized, it can be promoted again whenever needed.
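The role swap described above can be modeled in a few lines. This is only a toy sketch, not the Azure SDK: the class, the hostname, and the region names are made up for illustration. It shows the one property that matters for clients: the regions trade roles, while the namespace they connect to stays the same.

```python
from dataclasses import dataclass


@dataclass
class GeoReplicatedNamespace:
    """Toy model of a namespace with one primary and one secondary region."""

    hostname: str   # what producers and consumers connect to; never changes
    primary: str    # region currently serving traffic
    secondary: str  # region receiving the replica (no reads or writes)

    def promote_secondary(self) -> None:
        """Swap roles: the secondary becomes primary, the old primary secondary."""
        self.primary, self.secondary = self.secondary, self.primary


# Hypothetical namespace and regions, chosen only for the example.
ns = GeoReplicatedNamespace(
    hostname="contoso.servicebus.windows.net",
    primary="westeurope",
    secondary="northeurope",
)

endpoint_before = ns.hostname
ns.promote_secondary()

print(ns.primary, ns.secondary)         # roles have been exchanged
print(ns.hostname == endpoint_before)   # clients keep the same endpoint
```

Calling `promote_secondary()` a second time restores the original roles, which mirrors the idea that the re-initialized region can be promoted again later.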
The customer manages the promotion of the secondary region to primary via a property. This gives you full ownership of, and visibility into, how an incident is resolved. Additionally, you can automate promotion on your side using the associated metrics.
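One way to automate that decision is to react only to a sustained breach of a metric threshold rather than to a single spike. The sketch below is a minimal illustration under assumptions: the choice of metric, the threshold, and the number of consecutive samples are all hypothetical, and the call that would actually trigger the promotion is deliberately left out.

```python
def sustained_breach(samples, threshold, consecutive):
    """Return True when the last `consecutive` samples all exceed `threshold`.

    `samples` is a chronological list of metric values (oldest first).
    """
    if consecutive <= 0:
        raise ValueError("consecutive must be positive")
    if len(samples) < consecutive:
        return False  # not enough data to decide
    recent = samples[-consecutive:]
    return all(value > threshold for value in recent)


# Hypothetical metric values, for illustration only.
steady_breach = [2, 3, 40, 45, 50]
single_spike = [2, 40, 3, 45, 50]

print(sustained_breach(steady_breach, threshold=30, consecutive=3))
print(sustained_breach(single_spike, threshold=30, consecutive=3))
```

Requiring several consecutive breaches is a design choice that trades reaction time for fewer false alarms; the right values depend on your workload and should be tested before any promotion is automated.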
NB: It is not possible to read from or write to the secondary region.
There are two replication modes: synchronous and asynchronous.
Synchronous Mode:
Asynchronous Mode:
Even without any outage, you may choose to migrate your Service Bus namespace to another region. Below is a non-exhaustive list of reasons that might lead to this use of geo-replication:
To do so, configure geo-replication on the existing namespace with the target region as the secondary region. Once the initial synchronization is complete, start a planned promotion of the new region; messages that were already published are replicated as part of this process. After the promotion completes, the old region becomes the new secondary region and can be removed. Your flows then run in the new primary region.
Geo-replication is most relevant when an unexpected event occurs and impacts the usage of your Service Bus. This can include:
Your secondary region must then be able to take over from the primary region. Because geo-replication synchronizes the two regions continuously, promoting the secondary region keeps your Service Bus available and limits service disruption.
Depending on the severity of the affected services, two types of promotion are possible:
Now that we’ve covered the theory, let’s see how it works in practice. Since geo-replication is still in public preview, check the Microsoft documentation for the latest updates.
The promotion process follows these steps:
You can track replication progress using metrics.
AzureMetrics
| where TimeGenerated > ago(1h)
| where MetricName == "ReplicationLagDuration"
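If the rows returned by a query like the one above are exported to Python, the same filter can be reproduced locally and reduced to a single figure. This is a rough sketch: it assumes each row is a dictionary carrying the `TimeGenerated`, `MetricName`, and `Maximum` columns, and the sample rows are invented for the example.

```python
from datetime import datetime, timedelta, timezone


def peak_lag(rows, now, window=timedelta(hours=1)):
    """Return the highest ReplicationLagDuration within the window, or None."""
    values = [
        row["Maximum"]
        for row in rows
        if row["MetricName"] == "ReplicationLagDuration"
        and row["TimeGenerated"] > now - window
    ]
    return max(values) if values else None


# Invented sample rows; real values come from your own workspace.
now = datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    {"TimeGenerated": now - timedelta(minutes=5),
     "MetricName": "ReplicationLagDuration", "Maximum": 4.0},
    {"TimeGenerated": now - timedelta(minutes=30),
     "MetricName": "ReplicationLagDuration", "Maximum": 9.5},
    {"TimeGenerated": now - timedelta(hours=2),   # outside the 1h window
     "MetricName": "ReplicationLagDuration", "Maximum": 60.0},
    {"TimeGenerated": now - timedelta(minutes=10),  # different metric
     "MetricName": "IncomingMessages", "Maximum": 120.0},
]

print(peak_lag(rows, now))  # 9.5
```

The two-hour-old row and the unrelated metric are both ignored, which matches the two `where` clauses of the query.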
Microsoft’s new geo-replication feature significantly improves on Geo-disaster recovery by also replicating data, not just metadata.
To use geo-replication effectively, choose the right replication mode:
Since geo-replication is still in public preview, keep an eye on Microsoft updates regarding supported regions.
Microsoft charges the Premium tier for Service Bus per messaging unit. With geo-replication:
🚀 Ready, set, geo-replicate!