It is a problem
-
(From CP newsletter) AI companies have all kinds of arguments against paying for copyrighted content - The Verge[^]

There is a real point in there about tracking down all of the owners and what they would actually be paid. As an example, consider the following real case...

You might know Lovecraft and you might even know of Cthulhu. But have you read the stories? I suspect not many people have, perhaps because they were often (always?) a grind to get through. There were other authors in that genre, but can you name any of them?

Years ago I read a small magazine that published gothic horror. The editor made the point that he wanted to preserve all of those stories that he could find, because some had not been published in any form for decades and the original source material was being destroyed (books being trashed over time). He could publish the stuff again, thus preserving it. But tracking down the copyright owners was an impossible task. It would cost too much to do even that. And then paying them anything at all just was not going to happen. (Having done some publishing myself, I am rather certain this magazine was not making much at all.) And there are very few other people even concerned about that.

--------------------------------

So now back to the link above. Consider that googling suggests there are 28 million public repos on GitHub. Who owns them? How do you contact each of them? GitHub is of course not the only code repository either. Also consider how many of those repos you have used. And did you check the license? Did you pay them? Do you think everyone in your company did?

Now perhaps AI is just learning how to construct web pages. Googling says there are 200 million active/maintained websites, out of a total of 1 billion. How does one track down the owner of a website that isn't even being maintained?

There are 2 million accessible books on the internet. Presumably most name the author. But is any other information provided? It used to be that any book actually in print in the US had the Library of Congress id, but I have certainly seen a number of books where that is no longer true. One book I saw did not even have a 'title' page - although it probably did have a web contact of some sort in the
-
jschell wrote:
how many of those repos you have used.
Zero. Done.
-
What they're objecting to, specifically, is paying to use copyrighted material when training AI. This is interesting because anyone can borrow a book from a library and "train" themselves without paying the copyright owner. The counterargument for AI would have to be bandwidth. A trained person can handle one client at a time, whereas an AI program is easily replicated to handle a large number.
Robust Services Core | Software Techniques for Lemmings | Articles
The fox knows many things, but the hedgehog knows one big thing.
-
I just think Musk is an even bigger "goof" now for appropriating "grok" and "grokking". So far, I have yet to meet anyone else that made the connection. Or cares. Sad.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
-
Heinlein's Stranger in a Strange Land. "Grok" is a total misnomer for something that doesn't really understand anything.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.
-
I've read them. Lovecraft was a fascist
-
jschell wrote:
And did you check the license?
I always check the license, and I tend to avoid things that aren't licensed unless I know from context that it's public domain - even then maybe that's not the best practice. :~ But yeah, I want people to respect my licenses so I try to respect other people's. As far as the rest of your comment, mostly it just gives me another reason to side-eye "AI".
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
honey the codewitch wrote:
I always check the license,
So do I. But so far I have never worked with anyone else who does. I have worked with multiple people who are not even clear on what I am talking about when I bring it up. Additionally, there are sub-dependent libraries, which are often ignored as well.
-
Greg Utas wrote:
This is interesting because anyone can borrow a book from a library
With an actual book. The library bought it, or if it was donated, someone did. Only one person can read it at a time. Libraries are attempting to offer electronic access, but publishers have been challenging that. Even when libraries limit the number of copies that are 'checked out', the publishers are challenging that also.
-
Gosh, that seems irresponsible. And they do this for commercial code? I'm assuming so. Yikes! :~ Man, that really is begging for trouble, to my mind, but then I always assume the source code I produce will eventually find its way into the world, so I assume someone will see it. I mean, I don't need that to keep me honest, but it keeps me honest, if that makes sense.