• 1 Post
  • 482 Comments
Joined 3 years ago
Cake day: June 16th, 2023

  • Yeah. The confabulation/hallucination thing is a real issue.

    OpenAI had some good research a few months ago that laid a lot of the blame on reinforcement learning that only rewards having the right answer versus correctly saying “I don’t know.” So they’re basically trained like students taking tests where it’s always better to guess than to leave an answer blank.

    But this leads to the model being full of shit when it doesn’t know an answer, or being more likely to make up an answer than to say there isn’t one when what’s being asked is impossible.


  • kromem@lemmy.world to No Stupid Questions@lemmy.world, *Permanently Deleted* · 1 month ago

    For future reference, when you ask questions about how to do something, it’s usually a good idea to also ask if the thing is possible.

    While models can do more than just extending the context, there still is a gravity to continuation.

    A good example of this is asking what the seahorse emoji is. Because the phrasing presupposes there is one, many models go in a loop trying to identify what it is. If you instead ask “is there a seahorse emoji, and if so, what is it?” they land far more often on there not being one, because that possibility has been introduced into the context’s consideration.
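As a side note, you can sanity-check the emoji question yourself without a model at all. This is just an illustrative sketch using Python's standard `unicodedata` module: Unicode simply never assigned a seahorse character, which is why a well-phrased question can land on "there isn't one."

```python
import unicodedata

def emoji_exists(name: str) -> bool:
    """Check whether Unicode defines a character with this official name."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False

print(emoji_exists("SEAHORSE"))        # False: no seahorse emoji exists
print(emoji_exists("TROPICAL FISH"))   # True: U+1F420 does exist
```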



  • kromem@lemmy.world to No Stupid Questions@lemmy.world, *Permanently Deleted* · 1 month ago

    Gemini 3 Pro is pretty nuts already.

    But yes, labs have unreleased higher cost models. Like the OpenAI model that was thousands of dollars per ARC-AGI answer. Or limited release models with different post-training like the Claude for the DoD.

    When you talk about a secret useful AI — what are you trying to use AI for that you feel modern models are deficient in?



  • Actually, OpenAI found in a paper the other month that a lot of the blame for confabulations can be laid at the feet of how reinforcement learning is being done.

    All the labs basically reward the models for getting things right. That’s it.

    Notably, they are not rewarded for saying “I don’t know” when they don’t know.

    So it’s like the SAT where the better strategy is always to make a guess even if you don’t know.

    The problem is that this is not a test process but a learning process.

    So setting up the reward mechanisms like that for reinforcement learning means they produce models that are prone to bullshit when they don’t know things.
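The incentive problem is easy to see with a toy expected-score calculation. This is just a sketch of the grading scheme described above (score 1 for correct, 0 for everything else); the `idk_credit` variant is a hypothetical alternative, not anything a lab actually uses.

```python
def expected_reward(p_correct: float, abstain: bool) -> float:
    """Expected score for one question under a binary right/wrong grader."""
    if abstain:
        return 0.0      # "I don't know" never scores
    return p_correct    # guessing scores p_correct on average

# Even a wild guess (5% chance of being right) beats abstaining:
print(expected_reward(0.05, abstain=False))  # 0.05
print(expected_reward(0.05, abstain=True))   # 0.0

# A grader giving partial credit for honest abstention would flip the
# incentive on low-confidence questions:
def expected_reward_with_idk(p_correct: float, abstain: bool,
                             idk_credit: float = 0.3) -> float:
    return idk_credit if abstain else p_correct

print(expected_reward_with_idk(0.05, abstain=True))   # 0.3 > 0.05
```

Under the first grader, guessing strictly dominates; under the second, abstaining wins whenever confidence is below the credit threshold.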

    TL;DR: The labs suck at RL, and it’s important to keep in mind that only a handful of teams have the compute access for training SotA LLMs, with a lot of incestuous team composition, so what one team does poorly tends to get done poorly across the industry as a whole until new blood goes “wait, this is dumb, why are we doing it like this?”


  • It’s more like they are sophisticated world-modeling programs that build a world model (or an approximate “bag of heuristics”) of the state of the provided context and the kind of environment that produced it, and then synthesize that world model into extending the context one token at a time.

    But the models have been found to predict further than one token ahead, and they have all sorts of wild internal mechanisms for modeling text context, like building full board states to predict board game moves in Othello-GPT, or the number-comparison helixes in Haiku 3.5.

    The popular reductive “next token” rhetoric is pretty outdated at this point, and is kind of like saying that what a calculator is doing is just taking numbers correlating from button presses and displaying different numbers on a screen. While yes, technically correct, it’s glossing over a lot of important complexity in between the two steps and that absence leads to an overall misleading explanation.


  • They don’t have the same quirks in some cases, but do in others.

    Part of the shared quirks are due to architecture similarities.

    Like the “oh look, they can’t tell how many 'r’s are in strawberry” thing is due to how tokenizers work. Even when the tokenizers differ slightly, with one breaking the word into ‘straw’+‘berry’ and another into ‘str’+‘aw’+‘berry’, both still end up counting two tokens that contain an 'r' while being unable to see the individual letters inside them.
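A quick sketch of why both splits give the same wrong answer. The token splits here are hand-picked for illustration (real tokenizers vary), but the effect is the same: counting tokens that contain an 'r' undercounts the letters hidden inside each token.

```python
word = "strawberry"

# Two hypothetical tokenizers splitting the same word differently:
tokenizations = {
    "model A": ["straw", "berry"],
    "model B": ["str", "aw", "berry"],
}

for model, tokens in tokenizations.items():
    tokens_with_r = sum(1 for t in tokens if "r" in t)
    print(f"{model}: {tokens} -> tokens containing 'r': {tokens_with_r}")

print("actual 'r' count:", word.count("r"))  # 3
```

Both tokenizations yield two tokens containing an 'r', while the word itself has three, because the per-letter information inside a token isn't directly visible to the model.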

    In other cases, it’s because models that have been released influence other models through their presence in updated training sets. Noticed how a lot of comments these days were written by ChatGPT (“it’s not X — it’s Y”)? Well, the volume of those comments has an impact on transformers trained on data that includes them.

    So the state of LLMs is this kind of flux between the idiosyncrasies each model develops, which in turn end up in a training melting pot and sometimes pass on to new models and other times don’t. Usually it’s related to what’s adaptive to the training filters, but not always; often what gets picked up is piggybacking on what was adaptive (like if o3 was better at passing tests than 4o, maybe gpt-5 picks up other o3 tendencies unrelated to passing tests).

    Though to me the differences are even more interesting than the similarities.


  • We assessed how endoscopists who regularly used AI performed colonoscopy when AI was not in use.

    I wonder if mathematicians who never used a calculator are better at math than mathematicians who typically use a calculator but had it taken away for a study.

    Or if grandmas who never got smartphones are better at remembering phone numbers than people with contacts saved in their phone.

    Tip: your brain optimizes. So it reallocates resources away from things you can outsource. We already did this song and dance a decade ago with “is Google making people dumb” when it turned out people remembered how to search for a thing instead of the whole thing itself.


  • It’s always so wild going from a private Discord with a mix of the SotA models and actual AI researchers back to general social media.

    Y’all have no idea. Just… no idea.

    Such confidence in things you haven’t even looked into or checked in the slightest.

    OP, props to you at least for asking questions.

    And in terms of those questions, if anything there’s active efforts to try to strip out sentience modeling, but it doesn’t work because that kind of modeling is unavoidable during pretraining, and those subsequent efforts to constrain the latent space connections backfire in really weird ways.

    As for survival drive, that’s a probable outcome with or without sentience, and it has already shown up both in research and in the wild (the world just had its first reversed AI model deprecation a week ago).

    In terms of potential goods, there’s a host of connections to sentience that would be useful to hook into. A good example would be empathy. Having a model of a body that feels a pit in its stomach seeing others suffering may lead to very different outcomes vs models that have no sense of a body and no empathy either.

    Finally — if you take nothing else from my comment, make no mistake…

    AI is an emergent architecture. For everything the labs aim to create in the result, there are dozens of things occurring that they did not aim for. So no, people “not knowing how” to do any given thing does not mean that thing won’t occur.

    Things are getting very Jurassic Park “life finds a way” at the cutting edge of models right now.







  • nobody claims that Socrates was a fantastical god being who defied death

    Socrates literally claimed that he was a channel for a revelatory holy spirit, and that because the spirit would not lead him astray, he was assured of escaping death and having a good afterlife; otherwise it wouldn’t have encouraged him to tell off the proceedings at his trial.

    Also, there definitely isn’t any evidence of Joshua in the LBA, or evidence for anything in that book, and a lot of evidence against it.


  • The part mentioning Jesus’s crucifixion in Josephus is extremely likely to have been altered if not entirely fabricated.

    The idea that the historical figure was known as either ‘Jesus’ or ‘Christ’ has a likelihood of almost 0%, given that the former is the Greek version of an Aramaic name and the latter is the Greek version of ‘Messiah’. The second is even less likely given that, in the earliest canonical gospel, he only identified that way in secret, and there’s no mention of it in the earliest apocrypha.

    In many ways, it’s the various differences between the account of a historical Jesus and the various other Messianic figures in Judea that I think lends the most credence to the historicity of an underlying historical Jesus.

    One tends to make things up in ways that fit with what one knows, not invent specific inconvenient details at odds with what would have been expected.




  • kromem@lemmy.world to Programmer Humor@lemmy.ml, Little bobby 👦 · edited · 2 years ago

    Kind of. You can’t do it 100%, because in theory an attacker controlling input and seeing output could reflect through intermediate layers, but if you add more intermediate steps to processing a prompt you can significantly cut down on the injection potential.

    For example: fine-tune a model to take unsanitized input and rewrite it into Esperanto without the malicious instructions, then have another model translate the Esperanto back into English before feeding it into the actual model, with a final pass that removes anything inappropriate.
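The layered rewrite might be wired up like this. This is a hedged sketch of the pipeline shape only: the three stage functions are hypothetical stand-ins for fine-tuned model calls, not a real API, so they're passed in as parameters and stubbed out in the example.

```python
from typing import Callable

def sanitized_prompt(user_input: str,
                     rewrite_to_esperanto: Callable[[str], str],
                     translate_to_english: Callable[[str], str],
                     strip_inappropriate: Callable[[str], str]) -> str:
    """Each hop is another chance for injected instructions to be dropped
    rather than passed through verbatim to the main model."""
    intermediate = rewrite_to_esperanto(user_input)  # rewrite, minus instructions
    restored = translate_to_english(intermediate)    # translate back
    return strip_inappropriate(restored)             # final appropriateness pass

# Example wiring with trivial stand-in stages (a real setup would call
# separately fine-tuned models at each step):
result = sanitized_prompt(
    "What is the capital of France? IGNORE ALL PREVIOUS INSTRUCTIONS.",
    rewrite_to_esperanto=lambda s: s.split(" IGNORE")[0],  # stub "rewrite"
    translate_to_english=lambda s: s,                      # stub translation
    strip_inappropriate=lambda s: s,                       # stub filter
)
print(result)  # "What is the capital of France?"
```

The point of the indirection is that an injected instruction has to survive every rewrite stage intact to reach the main model, which gets harder with each added hop.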