
Replication of listing and other types of data is common, dare we say even expected, and there are many reasons for this. The main reason is that data consumers usually touch the data downstream in some way before serving it out in their apps. For example, they might do some machine learning on the data or images.

The most common forms of replication rely on modification timestamps, long polling and state inference to synchronize data between two servers. The burden for data consumers to coordinate timestamps across vendors is, in a word, unfun.

Lots of different queries are used, usually based on timestamps, and polling is done as often as possible. Many systems have four or more resources, so this adds up on the client side and server side.

Timestamps can be unreliable. Things like “clock drift” can cause events to be out of order and consumers to miss data. Some providers do bulk updates to timestamps to trigger a refresh, making it difficult to pull records in a controlled manner alongside all the new changes coming in.
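To make the clock drift problem concrete, here is a minimal sketch of a timestamp-based poller missing a record. The in-memory `records` list and the `poll_since` helper are hypothetical stand-ins for a producer's API, and the integer timestamps are simplified for illustration.

```python
# Hypothetical in-memory "server": a list of records with modification timestamps.
records = []


def poll_since(last_seen):
    """Return records modified strictly after last_seen, as a timestamp poller would."""
    return [r for r in records if r["ModificationTimestamp"] > last_seen]


# Record A is written by a server whose clock reads 100.
records.append({"ListingKey": "A", "ModificationTimestamp": 100})

# The consumer polls and remembers the newest timestamp it has seen.
first_poll = poll_since(0)
last_seen = max(r["ModificationTimestamp"] for r in first_poll)

# Record B is written afterward by a server whose clock runs slow,
# so it is stamped 95 even though it arrived later.
records.append({"ListingKey": "B", "ModificationTimestamp": 95})

# The next poll asks for anything newer than 100 and never sees B.
second_poll = poll_since(last_seen)
```

In this sketch the second poll comes back empty, so record B is silently skipped until some later reconciliation catches it.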

But now two recently ratified standards, RCP-027 and RCP-028, provide ways to solve that.

Both of these RCPs (RESO Change Proposals) are dependent on the EntityEvent Resource. The relatively new resource, available as of Data Dictionary 2.0, offers an alternative to timestamps with an efficient and streamlined way to replicate data by using the interface of an append-only log.

One of the most important things EntityEvent provides is a way to communicate which records should no longer be in a user’s feed. There is currently no good solution for this in the RESO ecosystem, and consumers usually implement reconciliation processes that run daily to query by all keys so they can figure out which records are no longer available. This requires a lot of complexity and overhead.

 

How It Works
An event identifier provides a logical timestamp that denotes that a business event occurred (e.g., listing price change, listing status change, deletion of a photo). EntityEvent Resource fields include:

  • EntityEventSequence
  • ResourceName
  • ResourceRecordKey
  • ResourceRecordUrl

EntityEventSequence is the part of an event record that places every event, for every resource, in order. The event record itself is a compact representation of an event that occurred. EntityEventSequence is a durable, immutable, monotonic identifier (say that three times fast) that preserves the order in which events occur in a system, and it can only increase in value.

An event record combines the EntityEventSequence, the ResourceName and the ResourceRecordKey to indicate that a business event has occurred on a system.
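As a rough illustration, an event record can be modeled as a small immutable structure. The field names below come from the list above; the Python class, the sample keys and the sequence values are assumptions made for this sketch, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityEvent:
    """One event record, using the EntityEvent Resource field names."""
    EntityEventSequence: int
    ResourceName: str
    ResourceRecordKey: str
    ResourceRecordUrl: str = ""


def in_order(events):
    """Sort events by sequence number, the order in which they occurred."""
    return sorted(events, key=lambda e: e.EntityEventSequence)


# Events may arrive in any order; the sequence restores the original ordering.
received = [
    EntityEvent(1103, "Property", "P-123"),
    EntityEvent(1101, "Office", "O-9"),
    EntityEvent(1102, "Member", "M-42"),
]
ordered = in_order(received)
```

Here the Office and Member events sort ahead of the Property event that refers to them, because their sequence numbers are lower.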

RCP-028 adds the ability to push events from the EntityEvent Resource using webhooks.

Webhooks are HTTP requests that are automatically triggered by an event in a source system and sent to a destination system, often with a payload of data. They give the source system a way to tell the destination system that an event occurred and to share information about it.

For the purposes of replication, it is useful to allow resource data producers to be able to push data about “events” to consumers using an API endpoint and token provided by the consumer, or a “webhook.”

So, to follow the bouncing ball, Data Dictionary-style, when an EntityEventSequenceNumeric, ResourceName, ResourceRecordKey, ResourceRecordKeyNumeric and ResourceRecordUrl have been published to the EntityEvent Resource, producers may “push” that data to consumers as events for the purposes of replication.

This makes the replication process much simpler on the consumer side. The consumer only needs to “listen” for an EntityEvent from the data producer and then pick up the data corresponding to that EntityEvent from the producer’s Web API.
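The listen-then-fetch flow might look something like the sketch below. It assumes the event arrives as a JSON object keyed by the EntityEvent field names, which may differ from the actual payload defined in RCP-028. The `fetch_record` callable stands in for a request to the producer's Web API, and treating a `None` result as "no longer in the feed" is also an assumption of this example.

```python
import json


def handle_event_push(body, fetch_record, store):
    """Process one pushed event: parse it, pull the record, update the local copy."""
    event = json.loads(body)
    resource = event["ResourceName"]
    key = event["ResourceRecordKey"]
    record = fetch_record(resource, key)  # stand-in for a Web API request
    if record is None:
        # Assumption: nothing returned means the record left the consumer's feed.
        store.pop((resource, key), None)
    else:
        store[(resource, key)] = record
    return event["EntityEventSequence"]


# Hypothetical producer data and a fake fetcher, so the sketch runs offline.
producer_data = {("Property", "P-123"): {"ListPrice": 450000}}


def fake_fetch(resource, key):
    return producer_data.get((resource, key))


local_store = {}
push_1 = json.dumps({"EntityEventSequence": 1103, "ResourceName": "Property",
                     "ResourceRecordKey": "P-123"}).encode()
first_seq = handle_event_push(push_1, fake_fetch, local_store)
after_first = dict(local_store)

# The record is later removed from the feed and a new event is pushed.
del producer_data[("Property", "P-123")]
push_2 = json.dumps({"EntityEventSequence": 1104, "ResourceName": "Property",
                     "ResourceRecordKey": "P-123"}).encode()
second_seq = handle_event_push(push_2, fake_fetch, local_store)
```

In a real deployment this function would sit behind the consumer's HTTP endpoint, and the returned sequence number would be saved so the consumer knows how far it has processed.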

 

Benefits
There is no longer any need to continuously poll every resource in the data producer’s API for new records. Rather than polling frequently on several resources to determine if there are updates, consumers can poll on one resource to figure out what they need to pull, which is a much lighter and less expensive proposition.

The ordering of events is preserved by sequence numbers rather than timestamps, which solves the issue of clock drift. For example, Office and Member records need to be created before the listing that refers to them can be, and processing events in sequence order allows that to happen.

Events are produced when records are no longer part of a user’s feed, ending the need for expensive and complicated reconciliations.

Replication becomes extremely simple for consumers: remember the last sequence number processed and ask for everything greater than that. This works across all resources. And consumers can easily batch their requests when bulk updates occur. The same logic can be used for both initial synchronization and staying in sync. There is no special initialization or reconciliation code. 
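The "remember the last sequence number and ask for everything greater" loop can be sketched as follows. The query builder uses standard OData `$filter`, `$orderby` and `$top` options. The base URL, the page size and the `fetch_events` and `apply_event` callables are hypothetical, and the in-memory log stands in for a producer's EntityEvent Resource.

```python
from urllib.parse import quote, urlencode


def events_query(base_url, last_sequence, page_size=1000):
    """Build an OData query for events after the last sequence number processed."""
    params = {
        "$filter": f"EntityEventSequence gt {last_sequence}",
        "$orderby": "EntityEventSequence asc",
        "$top": page_size,
    }
    return f"{base_url}/EntityEvent?{urlencode(params, safe='$', quote_via=quote)}"


def replicate(fetch_events, apply_event, last_sequence=0, page_size=1000):
    """Pull events in pages until caught up; return the new last sequence number."""
    while True:
        batch = fetch_events(last_sequence, page_size)
        if not batch:
            return last_sequence
        for event in batch:
            apply_event(event)
            last_sequence = event["EntityEventSequence"]


# Hypothetical in-memory event log so the sketch runs without a server.
log = [{"EntityEventSequence": n, "ResourceName": "Property",
        "ResourceRecordKey": f"P-{n}"} for n in range(1, 6)]


def fake_fetch(last_sequence, page_size):
    newer = [e for e in log if e["EntityEventSequence"] > last_sequence]
    return newer[:page_size]


applied = []
# Initial synchronization starts from zero.
checkpoint = replicate(fake_fetch, applied.append, last_sequence=0, page_size=2)

# Staying in sync uses the same call, starting from the saved checkpoint.
log.append({"EntityEventSequence": 6, "ResourceName": "Property",
            "ResourceRecordKey": "P-6"})
checkpoint_2 = replicate(fake_fetch, applied.append, last_sequence=checkpoint, page_size=2)

url = events_query("https://api.example.com", 42, page_size=2)
```

The same `replicate` call covers both the initial load and later catch-up runs; only the starting sequence number changes.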

Webhooks, which are good for other things besides replication (like subscriptions for real time notifications on price or data changes or showings being scheduled), add an additional layer of convenience for consumers.

Rather than polling every minute on each resource, consumers can host a basic API and receive events as they occur, allowing them to go on autopilot. It also saves costs and resources on the server and client side, because a request is made only if there is something to process.

Constellation1, for example, has several large consumers who deal in high volumes of data and want their updates to be accurate and timely. And Rental Beast is using webhooks and EntityEvent to push real-time updates to FBS.

 

Requirements
There are some requirements before any of this will work. For example, consumers need to implement an API endpoint that can receive EntityEvents and make that endpoint available to a producer. And they have to give producers a long-lived token that allows them to post EntityEvent items to the consumer’s API.

Optionally, consumers may choose to provide a custom identifier that will allow them to distinguish between feeds from multiple sources from the same producer. This identifier will be passed in the headers as something like “ConsumerLabel,” if present, and is assumed to be a 255-character string.

Producers may, in turn, use webhooks to push events from the EntityEvent Resource specified in RCP-027, making RESTful API calls to the consumer’s API. The expected authentication mechanism for doing so will be long-lived bearer tokens provided to the producer by the consumer. Consumers are expected to maintain APIs that can respond.
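On the consumer side, checking an incoming push could look roughly like this. The sketch assumes the token arrives in a standard `Authorization: Bearer` header and that the optional label arrives in a header named `ConsumerLabel`, as described above. The status codes, the length check and the plain-dict headers are illustrative choices, not requirements from the RCPs.

```python
import hmac


def check_push_request(headers, expected_token):
    """Return (status_code, consumer_label) for an incoming event push."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    # compare_digest avoids leaking token contents through timing differences.
    token_ok = hmac.compare_digest(token.encode(), expected_token.encode())
    if scheme.lower() != "bearer" or not token_ok:
        return 401, None
    label = headers.get("ConsumerLabel")  # optional; header name is an assumption
    if label is not None and len(label) > 255:
        return 400, None
    return 200, label


# Hypothetical token the consumer issued to the producer.
issued_token = "example-long-lived-token"

accepted = check_push_request(
    {"Authorization": f"Bearer {issued_token}", "ConsumerLabel": "feed-east"},
    issued_token,
)
rejected = check_push_request({"Authorization": "Bearer wrong-token"}, issued_token)
unlabeled = check_push_request({"Authorization": f"Bearer {issued_token}"}, issued_token)
```

The label lets a consumer route events from several feeds of the same producer to the right local dataset before any record is fetched.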

 

Onward and Upward
It’s important to note that these RCPs do not introduce any new technologies. Anyone familiar with RESTful API calls and long-lived bearer tokens can handle this.

In the future, we’ll be extending EntityEvent with some simple event type information that will allow people to filter only the events they want. We’re also working on adding subscriptions for better feed and update management.

Imagine automatic notifications of newly available open houses or changes to specific records at specific offices.

Events are how much of the rest of the tech world communicates: social media updates, push notifications, news feeds, etc. EntityEvent provides a way to make replication much faster and easier while bringing timely and accurate updates to the real estate industry.

 

More About the Transport Workgroup and RCPs
RESO’s Transport Workgroup is often an unsung unit of the standards trade, shaping common schemas, chopping it up in the OData (Open Data Protocol) standard and busting through API transport complexities with deliciously clean JSON (JavaScript Object Notation).

Although many of RESO’s members and admirers are technologically savvy, it takes a special person to debate payloads at two in the morning.

Yet the throngs of RESO members that hang out in the Data Dictionary Workgroup or enjoy the business banter of the Broker Advisory Workgroup owe a debt of gratitude to the code jockeys in Transport and the many RCPs that they mull and poke and shape into existence.

All of our RCPs, both ratified and in progress, are available for perusal and ingestion.

Even our primary certifications, like Data Dictionary and Web API, are RCPs (e.g., RCP-037, RCP-039 and RCP-040).

So are recent endorsement darlings, RESO Common Format (RCP-025) and Web API Add/Edit (RCP-010).

It could be said that RC-3PO has plenty of company beyond R2-D2. (Groan.)
