• Affidavit@lemmy.world
    English
    arrow-up
    11
    arrow-down
    11
    ·
    4 months ago

    I don’t use WhatsApp, but this immediately made me think of my dad who doesn’t use any punctuation and frequently skips and misspells words. His messages are often very difficult to interpret, through no fault of his own (dyslexia).

    Having an LLM do this for me would help both him and me.

    He won’t feel self-conscious when I send a “What you talkin’ about Willis?” message, and I won’t have to waste a ridiculous amount of time trying to figure out what he was trying to say.

    • ayyy@sh.itjust.works
      English
      arrow-up
      41
      ·
      4 months ago

      If he’s not communicating in an explicit and clear way, the AI can’t magically help you gain context. It will happily make up bullshit that sounds plausible though.

      • Affidavit@lemmy.world
        English
        arrow-up
        3
        arrow-down
        14
        ·
        4 months ago

        A poorly designed tool will do that, yes. An effective tool would do the same thing a person could do, except much quicker, and with greater success.

        An LLM could be trained on the way a specific person communicates over time, and can be designed to complete a forensic breakdown of misspelt words, e.g. checking whether a typed letter sits next to the intended one on the keyboard, or identifying words that are spelt differently but sound similar.
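The two checks mentioned above (keys that sit next to each other, and words that sound alike) can be sketched without any LLM at all. This is only an illustration under stated assumptions: a QWERTY layout, a crude sound grouping loosely in the spirit of Soundex (not the real algorithm), and function names invented for this example.

```python
# Sketch only: QWERTY layout and all names here are illustrative assumptions.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
POS = {ch: (r, c) for r, row in enumerate(QWERTY_ROWS) for c, ch in enumerate(row)}

# Crude sound groups, loosely in the spirit of Soundex (not the real algorithm).
SOUND_GROUPS = [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                ("l", "4"), ("mn", "5"), ("r", "6")]
SOUND_CODE = {ch: digit for letters, digit in SOUND_GROUPS for ch in letters}


def are_neighbours(a, b):
    """True if keys a and b touch on a staggered QWERTY keyboard."""
    if a not in POS or b not in POS:
        return False
    (ra, ca), (rb, cb) = POS[a], POS[b]
    if ra == rb:
        return abs(ca - cb) == 1
    if abs(ra - rb) != 1:
        return False
    # Each row is shifted right, so a key touches the key directly above it
    # and the one to the upper right.
    upper_c, lower_c = (ca, cb) if ra < rb else (cb, ca)
    return upper_c - lower_c in (0, 1)


def is_adjacent_slip(typed, candidate):
    """True if typed differs from candidate by exactly one neighbouring key."""
    if len(typed) != len(candidate):
        return False
    diffs = [(t, c) for t, c in zip(typed, candidate) if t != c]
    return len(diffs) == 1 and are_neighbours(*diffs[0])


def phonetic_key(word):
    """Map consonants to sound groups, skip everything else, collapse repeats."""
    key, prev = [], ""
    for ch in word.lower():
        code = SOUND_CODE.get(ch, "")
        if code and code != prev:
            key.append(code)
        prev = code
    return "".join(key)


def suggest(typed, vocabulary):
    """Return vocabulary words that typed could plausibly have meant."""
    typed = typed.lower()
    return [w for w in vocabulary
            if is_adjacent_slip(typed, w) or phonetic_key(typed) == phonetic_key(w)]
```

Calling `suggest("fone", ["phone", "home"])` keeps only "phone". Note that "bone" would also share a key with "fone" under this grouping, so a check like this only narrows the candidates; picking the right one still needs context.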

        • Disregard3145@lemmy.world
          English
          arrow-up
          13
          ·
          4 months ago

          the same thing a person could do

          Asking for clarification seems like a reasonable thing to do in a conversation.

          A tool is not about to do that because it would feel weird and creepy for it to just take over the conversation.

          • Affidavit@lemmy.world
            English
            arrow-up
            1
            arrow-down
            1
            ·
            edit-2
            4 months ago

            The intent isn’t for the LLM to respond for you; it’s just to interpret a message and either suggest what it means or rewrite it to be clear (while still displaying the original).

        • Die4Ever@retrolemmy.com
          English
          arrow-up
          10
          ·
          edit-2
          4 months ago

          An LLM could be trained on the way a specific person communicates over time

          Are there any companies doing anything similar to this? From what I’ve seen, companies avoid this stuff like the plague; their LLMs are always frozen with no custom training. Training takes a lot of compute, and it also carries a huge risk of the LLM going off the rails and saying bad things that could get the company into trouble or bad publicity. There is also the disk space per customer and the loading time of individual models to consider.

          The only hope for your use case is that the LLM has a large enough context window to look at previous examples from your chat and use those for each request, but that isn’t the same thing as training.
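The context-window approach described above amounts to few-shot prompting: the model stays frozen and earlier examples are pasted into each request. A minimal sketch follows, assuming a hypothetical store of (original, confirmed meaning) pairs from earlier chats and using character count as a rough stand-in for a token budget. `build_prompt` and the prompt wording are invented for illustration, and no particular LLM API is implied.

```python
# Sketch only: names, prompt wording, and the character budget are assumptions.
HEADER = "Rewrite the sender's message clearly. Keep the meaning; do not add facts.\n\n"


def build_prompt(examples, new_message, max_chars=2000):
    """Build a few-shot prompt from the newest examples that fit the budget.

    examples: list of (original, clear) pairs, oldest first.
    """
    footer = f"Original: {new_message}\nClear:"
    budget = max_chars - len(HEADER) - len(footer)
    chosen = []
    for original, clear in reversed(examples):  # walk from newest to oldest
        block = f"Original: {original}\nClear: {clear}\n\n"
        if len(block) > budget:
            break  # stop at the first example that no longer fits
        chosen.append(block)
        budget -= len(block)
    chosen.reverse()  # put the kept examples back in chronological order
    return HEADER + "".join(chosen) + footer
```

Nothing is learned between requests here. Every call resends the examples, which is why this is a different thing from training and why the size of the context window sets the limit.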

          • Yaky@slrpnk.net
            English
            arrow-up
            1
            ·
            4 months ago

            My friend works for a startup that does exactly that - trains AIs on conversations and responses from a specific person (some business higher-ups) for purposes of “coaching” and “mentoring”. I don’t know how well it works.

            • Die4Ever@retrolemmy.com
              English
              arrow-up
              1
              ·
              4 months ago

              it probably works pretty well when it’s tested and verified instead of unsupervised

              and for a small pool of people instead of hundreds of millions of users

          • Affidavit@lemmy.world
            English
            arrow-up
            1
            arrow-down
            1
            ·
            4 months ago

            There are plenty of people and organisations doing stuff like this, and there are plenty of examples on Hugging Face, though typically it’s to get an LLM to communicate in a specific manner (e.g. this one trained on Lovecraft’s works). People drastically overestimate the amount of compute time and resources that training and running an LLM takes; do you think Microsoft could force their AI on every single Windows computer if it was as challenging as you imply? Also, you do not need to start from scratch. Take a model that’s already robust and well developed and fine-tune it with additional training data, or for a hack job, just merge a LoRA into the base model.
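For anyone wondering what merging a LoRA into a base model means arithmetically: the adapter stores two small matrices A and B per layer, and merging adds their scaled product to the frozen weight, W_merged = W + (alpha / r) * B A, where r is the adapter rank. The sketch below does this for one toy layer with plain Python lists. The function names are made up for this example, and in practice a library would handle this rather than hand-written code.

```python
# Sketch only: toy sizes and hand-rolled matrix code, purely to show the arithmetic.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A).

    w: out x in base weight, a: r x in, b: out x r, r: adapter rank.
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # out x in, same shape as w
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def adapter_params(n_out, n_in, r):
    """Number of values stored by the adapter for one layer."""
    return r * (n_in + n_out)
```

The adapter for an `n_out` x `n_in` layer holds `r * (n_in + n_out)` values instead of `n_out * n_in`, which is why a per-person adapter is far smaller than a per-person copy of the whole model. Whether that makes it practical at the scale of hundreds of millions of users is a separate question.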

            The intent, by the way, isn’t for the LLM to respond for you; it’s just to interpret a message and either suggest what it means or rewrite it to be clear (while still displaying the original).

            • Die4Ever@retrolemmy.com
              English
              arrow-up
              3
              ·
              edit-2
              4 months ago

              Hugging Face isn’t customer-facing, it’s developer-facing. Letting customers retrain your LLM sounds like a bad idea for a company like Meta or Microsoft; it’s too risky and could make them look bad. Retraining an LLM on Lovecraft is a totally different scale from retraining an LLM for hundreds of millions of individual customers.

              do you think Microsoft could force their AI on every single Windows computer if it was as challenging as you imply?

              It’s a cloned image, not unique per computer

              • Affidavit@lemmy.world
                English
                arrow-up
                1
                arrow-down
                1
                ·
                edit-2
                4 months ago

                Hugging Face being developer-facing is completely irrelevant considering the question you asked was whether I was aware of any companies doing anything like this.

                Your concern that companies like Meta and Microsoft are too scared to let users retrain their models is also irrelevant considering both of these companies have already released models so that anyone can retrain or checkpoint merge them i.e. Llama by Meta and Phi by Microsoft.

                It’s a cloned image, not unique per computer

                Microsoft’s Copilot works off a base model, yes, but it is still an example showing that LLMs aren’t as CPU-intensive as they are made out to be. Further automated fine-tuning isn’t out of the realm of possibility either, and I fully expect Microsoft to do this in the future.

                • Die4Ever@retrolemmy.com
                  English
                  arrow-up
                  3
                  ·
                  edit-2
                  4 months ago

                  Your concern that companies like Meta and Microsoft are too scared to let users retrain their models is also irrelevant considering both of these companies have already released models so that anyone can retrain or checkpoint merge them i.e. Llama by Meta and Phi by Microsoft.

                  they release them to developers, not automatically retrain them unsupervised in their actual products and put them in the faces of customers to share screenshots of the AI’s failures on social media and give it a bad name

    • Feyd@programming.dev
      English
      arrow-up
      20
      ·
      4 months ago

      What makes you think the LLM will be able to decipher something that already doesn’t make sense?