WAV Group published an article about Microsoft Bing launching a portal of broker listings across America by scraping the data from Zillow and Redfin. If you did not read it – you can find it here! What came next was a barrage of phone calls from real estate brokers and MLSs asking about the remedy. I thought I did a good job of explaining it in the first article, but there is more. Frankly, brokers and MLSs should already know this stuff. The good news is that Bing took the site down! A combination of articles by RE Technology, WAV Group, and Inman News, plus a flurry of posts on X, drew attention to the Bing misuse. I am certain that MLSs warned Zillow and Trulia to talk to Microsoft or lose their data feeds. Now the site is reported to be coming down; we are keeping a watchful eye.
MLSs Must Comply with the Broker Data License Agreement
Because the MLS licenses the data from the broker, it must uphold its side of the bargain. Namely, it cannot allow the data to be misused in ways that breach the Terms of Use outlined in the data license agreement. Some proactivity is needed.
How Can MLSs Find Data Misuse?
Consultant Matt Cohen has done an excellent job of explaining data seeding. The MLS can put pixels and/or other information into the data that it sub-licenses and then search for those seeds to make sure that the data is going where it is supposed to go. PlanetRE has a product called Sherlock.ai that takes this strategy even further. Many thanks to Subrao Shenoy for jumping on a call with me so that I could grab some screenshots.
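For readers who want to see the seeding idea in concrete terms, here is a minimal Python sketch. It is not how Matt Cohen, PlanetRE, or any MLS vendor actually implements seeding; the secret key, the function names, and the choice of a text marker appended to the listing remarks are all assumptions made for illustration. The idea is simply that each licensee receives a slightly different copy of the data, so a copy found in the wild can be traced back to the feed it came from.

```python
import hashlib
import hmac

# Hypothetical secret held only by the MLS (an assumption, not from the article).
SECRET_KEY = b"example-mls-secret"


def seed_for(licensee: str, listing_id: str) -> str:
    """Derive a short marker that is unique to one licensee and one listing."""
    msg = f"{licensee}:{listing_id}".encode()
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return f"ref-{digest[:10]}"


def seed_listing(remarks: str, licensee: str, listing_id: str) -> str:
    """Return the listing remarks with the licensee's marker appended."""
    return f"{remarks} [{seed_for(licensee, listing_id)}]"


def trace_leak(page_text: str, licensees: list[str], listing_id: str) -> list[str]:
    """List the licensees whose marker for this listing appears in page_text."""
    return [name for name in licensees if seed_for(name, listing_id) in page_text]


# Example: the copy sent to "VendorA" later turns up on an unknown website.
seeded_copy = seed_listing("Charming bungalow near the park.", "VendorA", "12345")
suspects = trace_leak(seeded_copy, ["VendorA", "VendorB"], "12345")
```

A keyed hash is used so that a scraper cannot forge or predict another licensee's marker without the MLS's key. A real program would also have to survive markers being stripped out, which is one reason photo-based tracing is attractive.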
Sherlock.ai is a part of the PlanetRE ChocolateChips.ai tool. It takes the MLS data feed of all photos and travels all through the ‘internet world’ to find every website displaying every photo. The MLS also provides a list of authorized companies/websites. In the image directly below, a small MLS searched for 137,000 photos. The photos appeared in 321,000 places, and 200 of those places were unauthorized. PlanetRE charges about $0.04 per photo to scan the web and provide the list of unauthorized users. MLSs probably only need to do this annually, unless they want to be aggressive and do it quarterly. So the cost here is USD $5,480 for this number of photos (137,000 × $0.04). If I were an MLS, I would only search for the primary photo on every active listing to get started. Also, partner with a law firm that will pursue copyright abuse, as Getty Images has done to the real estate industry (and to the rest of the internet for decades). I would guess that it will become a profit center for the MLS in quick time. Courts can order damages for copyright infringement that range from USD $750 to $30,000 per item, or up to $150,000 per item for “willful” infringement.
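The arithmetic and the authorized-list check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and example URLs are hypothetical, the $0.04 figure is the approximate price quoted above, and nothing here reflects how Sherlock.ai itself works.

```python
from urllib.parse import urlparse

# Approximate per-photo price quoted in the article.
PRICE_PER_PHOTO_USD = 0.04


def scan_cost(photo_count: int, price: float = PRICE_PER_PHOTO_USD) -> float:
    """Estimate the cost of one scan, rounded to whole cents."""
    return round(photo_count * price, 2)


def unauthorized_hits(found_urls: list[str], authorized_domains: list[str]) -> list[str]:
    """Return the URLs whose host is not on the MLS's authorized list."""
    allowed = {d.lower() for d in authorized_domains}
    flagged = []
    for url in found_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com the same as example.com
        if host not in allowed:
            flagged.append(url)
    return flagged


# Example with made-up sites: one licensed broker site, one photo reseller.
annual_cost = scan_cost(137_000)
flagged = unauthorized_hits(
    [
        "https://www.example-broker.com/listing/1",
        "https://photo-reseller.example/img/9",
    ],
    ["example-broker.com"],
)
```

The host check is an exact match after dropping a leading "www.", so other subdomains of an authorized site would be flagged for manual review. Scanning quarterly instead of annually would simply multiply the estimate by four.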
Sherlock.ai provides the hyperlink to the exact page where the unauthorized use is happening. In this case, a website has scraped the property photos of the home of NBA star Steph Curry and is selling copies of the photos.
In 2019, we wrote about machine learning and AI in an article titled Ten Years from Now – The Future of Real Estate and Everything. We forecasted that any content published on the internet would be consumed by machines.
Many thanks for the 165,000 page views of that article, where I clearly underestimated the rapid advance of AI. It only took 5 years.
The Existential Threat to MLS if Action is Not Taken Now
MLSs must be aggressive at protecting against the misuse of listing data. AI is only two years old and it is already incredibly capable of recompiling all homes for sale. Just look at bing.com/homes. Bing scraped Zillow and Redfin. Zillow claimed in 2017 that they would go after anyone scraping their data. Anyone can do the same thing. Just look at Bright Data (no relation to Bright MLS – but their logo is similar – even the font!)
In August, the offer of compensation will be removed from our nation’s MLS systems. That will eliminate a key value proposition: shared commission. The remaining value will center on cooperation, specifically the compilation and dissemination of uniform property information that brokers use to price and market homes for sale, and that appraisers use for valuation purposes.
The MLS is a neutral third party that allows for brokerage cooperation to work effectively. It provides a system of mutually accepted rules, and regulates those rules to the benefit of all concerned. The property data set is invaluable for buyer and seller CMAs and home search.
Let’s imagine that OpenAI can collect every bit of property information ever published, combine it with public record data, organize it and make it available to any broker, appraiser, or consumer. Actually, we do not need to imagine it.
OpenAI does collect all real estate property information today. The only missing ingredient is that OpenAI does not have the data licensing rights or permissions to the data.
Brokers join the MLS to participate in a bargain whereby each broker contributes their proprietary listing content to gain exclusive reciprocal access to the proprietary listings of other broker participants. If the MLS is not protecting the data, and brokers are the only ones upholding their side of the bargain, why belong to the MLS? Brokers can just go to Bright Data or any of Bright Data’s competitors and scrape the data they need.
MLSs must protect the assets contributed by each broker, or firms will simply contribute their content to OpenAI for free and access the listing content of other brokerage firms using OpenAI. CMA vendors like Inside Real Estate, Delta Media and dozens of others can stop paying $5 million a year in data license fees and access the data from OpenAI. An OpenAI-driven CMA would work like Cloud CMA’s revolutionary 1 Minute CMA, only it would be almost instant. Moreover, consumers could simply run their own CMA without a professional.
This may be a make-or-break moment for MLSs. If they do nothing, the result may be a slow migration toward non-participation as tech firms recognize that OpenAI can provide them the data they need to serve their broker clients without the MLS. Data consumers like mortgage brokers and bankers can just as easily get the information they need for free, too.
Copyright Background
How the Broker “owns” the listing
I am not a lawyer, but here is my layperson’s explanation. The photograph is automatically the copyright of the photographer, and the description is the automatic copyright of the author. Because the photographer or property description writer is an artist under copyright laws, every broker needs to have two policies in place to establish their data sovereignty over the work being produced to market a property. The first policy should be established in the firm’s independent contractor agreement and employee agreement, covering anyone who takes photos, writes property descriptions or enters data. Brokers need to have their agents and staff assign the photos and property descriptions to the firm along with the compilation of facts (full listing input). The broker is the responsible party and the supervisor of these activities. This policy is an effort to keep the relationships clean. The second policy is a license agreement mandate for professional photographers. The National Association of REALTORS® has an excellent group of agreements. Be sure to pick the right one (e.g., a Work for Hire agreement might establish an employment issue in California, so don’t use that one). To find these agreements, contact NAR Legal or visit Realtor.org. Another element that is copyrightable is called the compilation. Think of a collage. There is an art to the arrangement, selection and coordination of data elements that are certainly beautiful... stay with me here!
How the MLS Licenses the Copyright from the Broker
There are hundreds of MLSs and more than a half dozen MLS systems that, more or less, use a two-pronged approach to licensing the data from the broker for MLS purposes. The first is the MLS participation agreement that brokers sign when they join the MLS. The second is a digital acceptance of the terms of use that users of the MLS agree to when they log into the MLS and/or add a listing to the MLS. The broker is essentially assuring the MLS that they have rights to the data being submitted to the MLS. Furthermore, the broker is giving the MLS the license to use the submitted data for the purpose of delivering MLS services in perpetuity. Lastly, the broker authorizes the MLS to offer limited sub-licenses of the data as necessary to deliver the MLS service. Examples of this would be the sub-license of data to the MLS vendor or IDX vendors who are handling data on behalf of the MLS and its participating brokers. The broker is also allowing the MLS to include the broker’s data compilation in a new compilation comprising the data submitted by all brokers in the marketplace, and allowing the MLS to copyright that bulk compilation. Imagine that the MLS compilation is a collage of broker listing collages.
More Reading
The MLS Copyright of Listings Challenge: Understanding the Balance Between Copyright Infringement and Fair Use
The Future Role of the Photographer in MLS listings
Real Estate Must Fix the Problems with Photography
MLS Terms of Use Changes and Copyright FAQ
Ten Years from Now – The Future of Real Estate and Everything
MLSs Demand Broker Data Sovereignty
Using ChatGPT is probably an MLS Violation