  • Might have to break this into a couple of replies, because this is a LOT to work through.

    Anthropic is the only company to have admitted publicly to doing this. They were sued and settled out of court. Google and OpenAI have had no such accusations as far as I’m aware.

    Meta is being sued by several groups over this, including porn companies that caught them torrenting. Their defense has been to claim that the 2,400 videos downloaded to their corporate IP space were for “personal use”.

    OpenAI is also being accused of pirating books (not scraping), and it has been unable to prove legal procurement of them.

    > There is no such legal distinction [scraping for summary use vs scraping for supplanting the original content]. Scraping content is legal no matter WTF you plan to do with it.

    Interestingly, it’s actually Meta’s most recent partial win that explicitly helps disprove this. Apart from generally ripping into Meta for clearly infringing copyright, the judge wrote (page 3):

    > There is certainly no rule that when your use of a protected work is “transformative,” this automatically inoculates you from a claim of copyright infringement. And here, copying the protected works, however transformative, involves the creation of a product with the ability to severely harm the market for the works being copied, and thus severely undermine the incentive for human beings to create. Under the fair use doctrine, harm to the market for the copyrighted work is more important than the purpose for which the copies are made.

    So yes, Fair Use absolutely does take into account market harms.

    > What an AI model does isn’t copyright infringement (usually).

    I never asserted this, and I am well aware of the distinction between the copyright infringement involved in illegally obtaining copyrighted material and the AI training itself. You seem to be taking a whole host of objections you get from others and applying them to me.

    I think it’s perfectly reasonable to require that AI companies legally acquire a copy of any copyrighted material. Just as it would not be legal for me to torrent a movie even if I wanted to do something transformative with it, AI companies should not be able to do so either.



  • > Because the same rules that allow Google to train their search with everyone’s copyrighted websites are what allow the AI companies to train their models.

    This is false by omission. Many of the AI companies have been downloading content through means other than scraping, such as BitTorrent, to access and compile copyrighted data that is not publicly scrape-able. That includes Meta, OpenAI, and Google.

    > The day we ban ingress of copyrighted works into whatever TF people want is the day the Internet stops working.

    That is also false. Just because you don’t understand the legal distinction between scraping content to summarize it and direct people to a site (a lawsuit against Google already established this, as well as its boundaries) and scraping content to generate a replacement that obviates the original, doesn’t mean the law doesn’t understand it.

    > My comment right here is copyrighted. So is yours! I didn’t ask your permission before my Lemmy client downloaded it. I don’t need to ask your permission to use your comment however TF I want until I distribute it. That’s how the law works. That’s how it’s always worked.

    > The DMCA also protects the sites that host Lemmy instances from copyright lawsuits. Because without that, they’d be guilty of distribution of copyrighted works without the owner’s permission every damned day.

    And none of this matters, because AI companies aren’t just reading content, they’re taking it and using it for commercial purposes.

    Perhaps you are unaware, but (at least in the US) while it is legal to view a video on YouTube, downloading it for offline use constitutes copyright infringement if the owner objects. The video being public does not grant anyone and everyone the right to use it however they wish. Ditto for something like making an mp3 of a song on Spotify using Audacity.

    > People who hate AI are supporting an argument that the movie and music studios made in the 90s: That “downloading is theft.” It is not! In fact, because that is not theft, we’re all able to enjoy the Internet every day.

    First off, I do not hate AI; I use it myself (locally run). My issue is with AI companies using it to generate profit at the expense of the actual creators whose art they are trying to replace (i.e. not directing people to it, like search results do).

    Secondly, no one is arguing that it is theft; they are arguing that it is copyright infringement, which is what all of us are also subject to under the DMCA. So we’re actually arguing that AI companies should be held to the same standard that we are.

    Also, note that AI companies have argued in court (in the case brought by Stephen King et al.) that their use of copyrighted material shouldn’t fall under the DMCA at all (i.e. that it’s not about Fair Use), because AI training is not the ‘intended use’ of the source material and so does not eat into that commercial use. That argument leaves copyright infringement liability intact for the rest of us, while solely exempting them from liability. No thanks.

    Luckily, their arguing that they’re separate and apart from Fair Use also means this can be rejected without affecting Fair Use! Double win!


  • > But gooning is more goal-oriented and more communal. The gooner goons to reach the “goonstate”: a supposed zone of total ego death or bliss that some liken to advanced meditation, the attainment of which compels them to masturbate for hours, or even days, at a time.

    Is this Fox News? I think their GenZ consultant trolled them hard.


  • So far, the only real “regulation” I’ve seen governments push vis-à-vis social media is using it as an excuse to kill anonymity and privacy, rather than doing anything about the harmful content itself. What evidence do we have that governments ‘regulating’ AI will actually solve the issues presented, and not just be another backdoor to control their populaces?

    I don’t think there’s nothing to be done, but perhaps I’m just getting a little tired of this strange cognitive disconnect: people rightfully recognize that governments globally are shifting rightwards, turning into vehicles of oppression, re-stoking the fires of colonial-era racism, and paying vast sums to companies like Palantir and other AI-heavy “defense” companies to help surveil their citizens… but then those same people say, “hey, you’re totally the right people to help us solve this capitalist problem, right?” Like, if they wanted to help you, they already would be.

    > the superintelligence argument has sent governments chasing that red herring as they try to present themselves as being friendly to tech investment to attract a small slice of the trillions of dollars

    Perhaps they are well aware that superintelligence is a distraction? Perhaps they are, in fact, distracting you (royal ‘you’), rather than being the ones distracted?

    > How many people need to be disconnected from reality, siphoned into dependence on chatbots, and put at risk of losing their minds before governments take action against these agents of chaos?

    Bro, who do you think is benefitting from this social control? If you don’t think your government is already using generative AI to conduct influence campaigns, you are hopelessly naive.



  • I agree 100%. I’m not arguing it’s a good idea; these are just arguments other than “in order for it to be useful it needs to be able to counter Russia and/or China, otherwise it would be strategically useless and economically infeasible”.

    > North Korea is the only one that could fall under that category.

    In the status quo, I still don’t think that’s true; India and Pakistan are both nuclear-armed, with moderate-to-low warhead counts, and could potentially reach the US. Western European countries have nukes (France and the UK), though both tend to favor SLBMs over land-based missiles. If you’re planning to make any of them enemies, it could absolutely be useful.

    > An anti-satellite capability is much easier to get than a nuclear ICBM. If they can make a nuke, they can take out a satellite.

    That has not been true so far. There are more countries with nukes than with anti-satellite missile systems. Only the US, Russia, China, and India have demonstrated anti-satellite capabilities.


  • My experience is that OSS security scales up as the number of contributors increases, while commercial software is the inverse.

    A small git* repo with a couple of contributors is likely very insecure compared to one with 5000+. An enterprise tool from a company with 70 devs is probably far less bloated and far more secure than one from a company with 1000 devs.

    My 2 cents.