• 0 Posts
  • 30 Comments
Joined 1 year ago
Cake day: June 19th, 2023




  • Also not a lawyer but maybe more familiar with IP law than you are?

    When an AI scrapes the post you just wrote… how exactly were you, the author of the post, harmed by that action? You weren’t harmed, which is a powerful fair use defence. It’s not enough on its own, but it’s a huge step in that direction, and other factors such as transforming the original add to that, making a compelling case.

    Consider the most recent fair use case: Google negotiated to pay license fees for Java, then refused to pay and instead created its own copy of Java. It dragged on in court for a long time and bounced back and forth on appeal, but in the end the ruling came down to “java is protected by copyright, but Sun was not sufficiently harmed, therefore it was fair use”. Or at least that’s where it was headed when Oracle (who bought Sun years after the infringement happened) decided to stop burning mountains of cash fighting a lawsuit that wasn’t likely to end well for them.

    I was somewhat surprised by that case - I felt the fact that Google had talks about paying, then decided not to pay, was pretty clear harm. But the judge didn’t see that as real harm - Java’s source code is not ‘free as in freedom’ but it is ‘free as in dollars’ to download and therefore not really properly protected by copyright. The fact the license added restrictions to what you can do with the copy you were given for free didn’t hold up in court (which has pretty widespread ramifications for GPL… I wonder who will be brave enough to test that in court… the FSF isn’t going to back down from a lawsuit like Oracle did).

    Anyway, if Java is borderline, I think the fediverse is clear cut. Almost any copy of the fediverse would be fair use. Yes, it’s technically copyrighted content, but there’s a loophole so big it surrounds the entire universe.


  • Assuming they don’t want to end up like Aaron Swartz… I’m guessing they are deleted. Unfortunately it’s just too expensive to try to fight DMCA notices. Kim Dotcom has been trying that for a full decade now - his legal costs have surely stretched into tens of millions of dollars and he’s lost pretty much every step of the way. All the money he’s burned on this has only delayed a lengthy prison sentence; he’s unlikely to win and would have got a lighter sentence with an early guilty plea.

    As to the tooling… AFAIK it’s not possible to delete some things in Lemmy. I expect they’ve fixed that now. At least for things that are likely to be on the receiving end of a DMCA notice.



  • The specific attack they were talking about involved 126.9 million network requests per second, over a sustained period of time, and it was a widespread attack where the source was millions of individual computers, suspected to be regular desktop PCs from (or adjacent to) China. In other words the attack involved malware that was rapidly installed on vast numbers of computers at the same time.

    Due to the massive size of the attack, it was investigated thoroughly and the only sensible conclusion was that it was state sponsored. Specifically, China is likely to have used their widespread censorship tools to install malware that quietly attacked Github, likely without the owner of the PC even knowing it had happened (the attack wasn’t serious enough to disrupt the infected PC)…

    That’s not “hating Chinese”, it’s just pointing out a simple fact. Some DDoS attacks are state sponsored. And only a small number of states get involved in such attacks.


  • Putting a load balancer up in front of a few servers isn’t going to do anything to their database

    Yes it is. Suddenly your database exists in more than one location, which is extremely difficult to do with reasonable performance.

    load balancing doesn’t automatically mean “do something stupid like spin up 100 app servers when we normally use 3”

    Going from 3 to 100 is trivial. Going from one to any number greater than one is the challenge.

    All you’ve described is a need for a db proxy in the off chance that Lemmy code has horrible access patterns for db transactions.

    Define “horrible”?

    When Lemmy, or any server side software, is running on a single server, you generally upgrade the hardware before moving to multiple servers (because upgrading is cheaper). When that stops working, and you need to move to another server, it’s possible everything in the database that matters (possibly the entire database) will be in L4 cache in the CPU - not even in RAM; a lot of it will be in the CPU.

    When you move to multiple servers, suddenly a lot of frequent database operations are on another server, which you can only reach over a network connection. Even the fastest network connection is dog slow compared to L4 cache. It doesn’t really matter how well written your code is: if you haven’t done extensive testing in production with real world users (and actively malicious bots) placing your systems under high load, you will have to make substantial changes to deal with a database that is suddenly hundreds of millions of times slower.

    The database might still be able to handle the same number of queries per second, but each individual query will take a lot longer, which will have unpredictable results.
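
    To put rough numbers on that, here is a back-of-the-envelope sketch. Both latency figures and the query count are assumptions picked for illustration, not measurements of Lemmy or of any real hardware, so the exact ratio depends entirely on what you plug in. The point is only that queries issued one after another multiply whatever the per-query latency is.

```python
# Illustrative figures only (assumptions, not measurements).
CACHE_HIT_NS = 50                # assumed: data already sitting in CPU cache
NETWORK_ROUND_TRIP_NS = 500_000  # assumed: 0.5 ms round trip to a remote database


def db_wait_ms(queries_per_request: int, per_query_ns: int) -> float:
    """Time one request spends waiting on the database when its
    queries run sequentially (each waits for the previous one)."""
    return queries_per_request * per_query_ns / 1_000_000


# Hypothetical page load that issues 200 small queries in sequence.
local_ms = db_wait_ms(200, CACHE_HIT_NS)
remote_ms = db_wait_ms(200, NETWORK_ROUND_TRIP_NS)

print(f"same machine:   {local_ms} ms")
print(f"over a network: {remote_ms} ms")
```

    Throughput can stay the same while each request gets much slower, which is why code that was fine on one box can need restructuring (batching, fewer round trips) once the database moves.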

    The other problem is you need to make sure all of your servers have the same content. Being part of the Fediverse though, Lemmy probably already has a pretty good architecture for that.




  • I’m old-school developer/programmer and it seems that web is peace of sheet. Basic security stuff violated:

    I’m a modern web developer who used to be an old-school one.

    User provided content (post using custom emojis) caused havoc when processing (doesn’t matter if on server or on client). This is lack of sanitization of user-provided-data.

    Yeah - pretty much, though there are some mitigating factors.

    Strictly speaking, it was the alt text for the emoji. Alt text is HTML, and rather than allow arbitrary HTML they allowed another language called Markdown. Markdown is “a plain text” language with human readable syntax specifically designed to be converted into HTML.

    Markdown is the right format to use for emoji alt texts, but you do need to be careful of one thing - the original purpose of Markdown was to allow HTML content to be easier to write/read and it is a superset of the HTML language. So arbitrary HTML is valid markdown.

    Virtually all modern Markdown parsers disable arbitrary HTML by default, but it’s a behaviour which can be changed and that leaves potential for mistakes like this one here. Specifically the way Lemmy injected emojis with alt text into the Markdown content allowed arbitrary HTML.
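
    As a general illustration of that kind of mistake (this is not Lemmy's code, which isn't written in Python, and `render_emoji` is a made-up helper), here is a minimal sketch of escaping user-provided values before they are spliced into markup. It uses `html.escape` from the Python standard library.

```python
from html import escape


def render_emoji(image_url: str, alt_text: str) -> str:
    """Build an <img> tag from user-controlled values (hypothetical helper).

    escape(..., quote=True) turns <, >, & and quote characters into
    entities, so the values cannot break out of their attributes.
    Note: this does not validate the URL scheme; that needs its own check.
    """
    return '<img src="{}" alt="{}">'.format(
        escape(image_url, quote=True),
        escape(alt_text, quote=True),
    )


# An alt text that tries to close the attribute and add an event handler.
malicious_alt = '" onload="alert(1)'
tag = render_emoji("https://example.com/emoji.png", malicious_alt)
print(tag)
```

    Without the escaping, the quote in the alt text would end the attribute early and the rest would be parsed as a new attribute on the tag.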

    This wasn’t an obvious mistake - the issue over on Lemmy’s issue tracker is titled “Possible XSS Attack” because they knew there was an XSS Attack somewhere and they weren’t immediately sure if they had found it in the emoji system. Even now reading the diff to fix the vulnerability, it still isn’t obvious to me what they did wrong.

    It’s fairly complex code and complexity is the enemy of security… but sometimes you have to do complex things. Back in the “old-school” days, nobody would have even attempted to write something as complicated as a federated social network…

    JavaScript (TypeScript) has access to cookies (and thus JWT). This should be handled by web browser, not JS.

    Yeah - the Lemmy developers made a mistake there. There are a few things they aren’t doing right around cookies and JWT tokens.

    Hopefully they fix it. I expect they will… It was already actively being discussed before this incident, and those discussions have been seen by a lot more people now.

    How the attacker got those JWTs? JavaScript sent them to him? Web browser sent them to him when requesting resources from his server? This is lack of site isolation, one web page should not have access to other domains, requesting data from them or sending data to them.

    There are several levels of isolation that could have blocked this:

    1. Users should not be able to inject arbitrary HTML.
    2. A flag on the page should be set telling the browser to ignore JavaScript in the body of the page - this is a relatively new feature in the web and disabled by default for obvious backwards compatibility reasons, but it should be set especially on a high value target like Lemmy, and I expect once it’s been around a little longer browsers will enable it by default.
    3. A flag should have been set to block JavaScript from contacting an unknown third party domain. Again, this isolation is a relatively new web feature and currently disabled by default.
    4. As you say, JavaScript shouldn’t be able to access the JWT token or the cookie. That’s not a new feature in the web, it’s just one Lemmy developers didn’t take advantage of (I don’t know why)
    5. Even if all of those previous levels of isolation failed… there are things Lemmy should be doing to mitigate the attack. In particular instance admins have had to manually reset JWT tokens. Those tokens should have expired somehow on their own - possibly the moment the attacker tried to use them.
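
    A rough sketch of what points 2 to 5 could look like as HTTP response headers. This is my assumption about which browser features apply (a Content-Security-Policy header for points 2 and 3, cookie attributes for points 4 and 5); it is not taken from Lemmy's code. The cookie name `jwt`, the one hour lifetime and the `security_headers` helper are all hypothetical.

```python
from http.cookies import SimpleCookie


def security_headers(session_token: str) -> dict:
    """Return response headers for a hardened login response (illustrative)."""
    cookie = SimpleCookie()
    cookie["jwt"] = session_token
    cookie["jwt"]["httponly"] = True      # point 4: page JavaScript cannot read it
    cookie["jwt"]["secure"] = True        # only sent over HTTPS
    cookie["jwt"]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie["jwt"]["max-age"] = 3600       # point 5: expires on its own (assumed 1 hour)
    cookie["jwt"]["path"] = "/"

    # Points 2 and 3: without 'unsafe-inline', script-src 'self' blocks inline
    # scripts and event handlers; connect-src 'self' stops fetch/XHR calls
    # to third party domains.
    csp = "default-src 'self'; script-src 'self'; connect-src 'self'"

    return {
        "Content-Security-Policy": csp,
        "Set-Cookie": cookie["jwt"].OutputString(),
    }


headers = security_headers("example.token.value")
for name, value in headers.items():
    print(f"{name}: {value}")
```

    A short cookie lifetime only limits the damage window; revoking a stolen token the moment it's misused still needs server-side state.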

    The attacker logged-in as admin and caused havoc. Again, this should not be possible, admins should have normal level of access to the site, exactly the same as normal users do. Then, if they want to administer something, they should log-in using separate username + password into separate log-in form and display completely different web page, not allowing them to do the actions normal users can do. You know, separate UI/applications for users and for admins.

    Yep - the modern best practice is for admins to manage the site via a completely different system. That adds considerable complexity and cost though, so it’s rarely done unfortunately. But you know, Lemmy is open source… so if someone wants to take on that work they can do it.

    I’ll add one more - it should have taken less time to close the exploit… but given this is the first serious exploit I’ll forgive that.

    Ultimately several failures contributed to this attack. I expect many of those failures will be corrected in the coming weeks, and that will make Lemmy far more secure than it is right now - so that next time there’s a bug like the one in the Markdown parser it isn’t able to cause so much disruption.

    The good news is no harm was done, and a lot of people are going to learn some valuable lessons as a result of this incident. Ultimately the outcome is a positive one in my opinion.






  • Threads already has over 30 million daily active users and growing fast - I’m tipping it will be over a billion in a year or two.

    The fediverse has 2 million monthly active users. Sorry, but we’ve already lost the content battle. Like it or not, Threads is King Kong and Lemmy/Mastodon are ants.

    Regarding “two taps and you’re signed up”… that’s just never going to happen. If anything, it probably needs to be a bit harder to sign up. We don’t want people using throwaway accounts.


  • Meta has practically unlimited resources. They will make access to the fediverse fast with their top tier servers.

    They absolutely have limits. For example Threads isn’t in the EU yet, because of strict controls that severely limit what Meta can do.

    As per my understanding this will make small instances less desirable to the common user.

    Small instances are already undesirable to the general public and always will be.

    Meta can and will unethically defederate from instances which are a threat to them.

    No they can’t. The EU will only allow them to “ethically” defederate.

    When majority of the content is on the Meta servers they can and will provide fast access to it and unethically slow down access to the content from outside instances. This will be noticeable but cannot be proved

    If Threads is slow, people will switch to another service that is fast.

    This is just what I could think of, there are many more ways to be evil. Meta has the best engineers in the world who will figure out more discreet and impactful ways to harm the small instances.

    If they ask their best engineers to do something evil, most of them will quit. Why work for an evil corp when you can work almost anywhere you want?

    Also they don’t have the best in the world - those already left (or refused to work there in the first place).

    Privacy: I know they can scrape data from the fediverse right now. That’s not a problem. The problem comes when they launch their own Android / iOS app and collect data about my search and what kind of Camel milk I like.

    At least on iOS, that type of cross app tracking doesn’t work anymore (unless the user opts into it, which nobody ever does). Apple’s change to how tracking works is costing Meta billions of dollars… and protecting the privacy of about a billion people. Yay Apple.

    But more to the point, people who are worried about privacy will only install Threads if it’s the only way to reach their friends/family. Since Threads will be federated, they won’t have that reason.

    I have Facebook and Facebook Messenger on my phone and once Threads is federated I will be encouraging all my friends to sign up for Threads, so I can reach them. If my Mastodon instance defederates Threads, I’ll be leaving that instance (Lemmy, on the other hand, I might not care so much).

    My thoughts: I think building our own userbase is better than federating with an evil corp.

    Better in what way? One of my metrics is being able to contact people who will not sign up for Mastodon.

    I love the fediverse specifically because it allows me to reach people on other instances. Defederating should be limited to harmful content (and I don’t see any evidence of harm in Threads).

    We couldn’t get the people to use Signal. This is our chance to make a change.

    Even I won’t use Signal. Talk to me when I can install it on both my phones, instead of just one of them (using the same account on both phones).


    Finishing on a more positive note - Threads is going to be full of ads. I think a lot of people won’t be OK with that… and if Threads is federated, then people will sign up for small instances like this one. I think we’ll be fine.