You hear a lot of things at real estate conferences. A notion Marilyn Wilson recently heard at the NAR REALTOR Legislative Meetings was that OpenAI owns the content that you provide to it. As someone extremely interested in most things AI, I immediately did some research. Where did I start? Naturally, with OpenAI’s Terms of Use, Section 3 (https://openai.com/policies/terms-of-use).
The relevant passage is quoted below:
“(a) Your Content. You may provide input to the Services (“Input”), and receive output generated and returned by the Services based on the Input (“Output”). Input and Output are collectively “Content.” As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms. OpenAI may use Content to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms.”
The passage states very clearly that the user owns the Input and that OpenAI assigns its rights in the Output to the user. But it also says, “OpenAI may use Content to provide and maintain the Services, comply with applicable law, and enforce our policies.” So what does that mean? Great question. OpenAI’s ChatGPT has become an international sensation, so the answer depends on where you live and what the laws are in your area.
Interestingly, as noted in this article on JDSupra, OpenAI may not have the right to assign all of these rights to you.
“It should be noted that OpenAI only assigns all “its” right, title, and interest in and to the output to the user. However, if OpenAI does not initially own the rights, it cannot assign them. Another related issue is that other ChatGPT users can narrow the assignment scope. OpenAI’s Terms of Use state that due to the nature of machine learning, many users may receive identical or similar outputs from ChatGPT. This leads to another set of issues, including determining original ownership rights (if available), the potential destruction of ownership rights over time, and murky usage rights in the future.”
Unsurprisingly, who owns the rights to the content is as clear as mud. What we do know is that content is probably not considered intellectual property (IP) if it was generated exclusively by AI. A certain amount of the content must be created by a human. Of course, this raises another question: where is the line between human-created and AI-generated content? For example, I took this article and had ChatGPT proofread it (oh, the irony). So most of this article was written by a human, but all of it is considered Output.
The key takeaway is simple: the ownership of the input and/or output of AI-generated content depends on where you live, who you are, what tools you use, etc. To determine the ownership of your content, check with your lawyer.
Great article! Thank you.
I’m actually the person who stated this on stage at Midyear.
After I got back, I looked into their updated terms of service again. It turns out that in March they changed their policy on getting your data from opt-out to opt-in.
“By default, data submitted by customers after March 1, 2023 via API will not be used to train or improve the models. This has changed in March 2023. Data submitted to the API prior to March 1, 2023 may have been used for improvements if the customer had not previously opted out of sharing data. Data submitted for fine-tuning will only be used to fine-tune the customer’s model after March 1, 2023.”
So by default they are no longer using data submitted via the API to train their models, which is new as of 3/1/2023.
But the reason for that is now clear:
When they implemented the new Premium tier for subscribers, they enabled a web search system that finds data by openly scraping the web (similar to a web crawler). Whereas with GPT-3 their system did not crawl the internet upon a query, the new version can have that option turned on. So for market data queries, listing lookups, etc., it previously would say that it couldn’t answer those questions. With the new web scraping system, it goes out to websites to find that data when you make the query, so it no longer needs the data in its LLM to give you an answer.