Is there any way to make it use less as it gets more advanced, or will there soon be huge power plants all over the world dedicated just to AI?

  • Em Adespoton@lemmy.ca · 23 hours ago

    Supercomputers once required large power plants to operate, and now we carry around computing devices in our pockets that are more powerful than those supercomputers.

    There’s plenty of room to further shrink the computers, simplify the training sets, formalize and optimize the training algorithms, and add optimized layers to the AI compute systems and the I/O systems.

    But at the end of the day, when training a system you can either simplify it or throw lots of energy at it.

    Just look at how much time and energy goes into training a child… and it’s using a training system that’s been optimized over hundreds of thousands of years (and is still being tweaked).

    AI as we see it today (generative AI, at least) is much simpler: it just sets up and executes probability sieves, with a fancy instruction parser feeding in the inputs. But it’s running on hardware that’s barely optimized for the task, and the way it processes data to produce an output is far from optimal.
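    To make the “probability sieve” idea concrete, here’s a toy sketch of what happens at each step of generation: the model produces scores over a vocabulary, and the decoder picks the most likely next token. The vocabulary and scores below are invented for illustration, not from any real model.

    ```python
    import math

    # Tiny made-up vocabulary; a real model has tens of thousands of tokens.
    vocab = ["cat", "dog", "sat", "mat"]

    def softmax(logits):
        # Convert raw scores into a probability distribution that sums to 1.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def next_token(logits):
        # Greedy decoding: take the highest-probability token.
        probs = softmax(logits)
        best = max(range(len(vocab)), key=lambda i: probs[i])
        return vocab[best], probs[best]

    token, p = next_token([0.1, 0.2, 2.5, 0.3])
    print(token)  # "sat" wins, since it has the highest score
    ```

    A real model repeats this sieve once per output token, which is where the per-response compute cost comes from.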

    • null_dot@lemmy.dbzer0.com · 16 hours ago

      > Supercomputers once required large power plants to operate, and now we carry around computing devices in our pockets that are more powerful than those supercomputers.

      This is false. Supercomputers never required large [dedicated] power plants to operate.

      Yes, they used a lot of power; yes, that has been reduced significantly; but it’s not at the same magnitude as AI.

    • BussyCat@lemmy.world · 22 hours ago

      It’s also a very large data set it has to go through. The average English speaker knows 40k-ish words, and the model has to pull from a large data set and attempt to predict the most likely word to come next, doing that a hundred or so times per response. Then most people want the result in a very short period of time and with very high accuracy (smaller tolerances on the convergence and divergence criteria), so sure, there is some hardware optimization that can be done, but it will always be at least somewhat taxing.
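      A quick back-of-envelope sketch of that point, using the comment’s own rough numbers (40k-word vocabulary, ~100 predictions per response): every output word means scoring the whole vocabulary once, so per-response work scales as vocabulary size times response length.

      ```python
      # Rough assumptions from the comment above, not measurements:
      vocab_size = 40_000        # ~ words an average English speaker knows
      words_per_response = 100   # ~ predictions needed for one reply

      # One "unit" of work = scoring one vocabulary entry for one position.
      scoring_ops = vocab_size * words_per_response
      print(f"{scoring_ops:,} vocabulary scorings per response")  # 4,000,000
      ```

      Real models score subword tokens rather than whole words, and each scoring step is itself a large matrix computation, so this understates the true cost; the scaling argument is the point.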