Today, AI is rapidly changing the way we build software, and the pace of that change is only accelerating. If our goal is to make programming more productive, then building at the frontier of AI and software feels like the highest-leverage thing we can do.

It is increasingly clear to me that Codex is that frontier. And by bringing Astral’s tooling and expertise to OpenAI, we’re putting ourselves in a position to push it forward. After joining the Codex team, we’ll continue building our open source tools, explore ways they can work more seamlessly with Codex, and expand our reach to think more broadly about the future of software development.

  • Eager Eagle@lemmy.world · 20 hours ago

    It’s manageable if you pass these rules along to the LLM. I’ve actually had more success asking the LLM than giving code reviews to some interns, and even to someone who had coded professionally before.

    • badgermurphy@lemmy.world · 2 hours ago

      Right, but aren’t the interns in training specifically to get better at that than they are today, and eventually surpass the abilities of the AI?

      These LLMs are at best OK at this stuff, and they are not improving at any convincing rate. If you don’t train anyone to be better than the LLM, the retirement of your generation will leave the whole industry you’re in, at best, OK at its job.

      • Eager Eagle@lemmy.world · 2 hours ago

        OK, I’m not suggesting we replace humans with AI, and I despise companies that try that unsustainable practice.

        With that out of the way, I’ll restate that LLMs follow some rules more reliably than humans today. It’s also easier to give feedback when you don’t have to worry about coming across as a pedantic prick for pointing out the smaller things.

        On your point that LLMs are not improving: agents and tooling definitely are. Six months ago I would have needed to babysit an agent to implement a moderately complex feature touching a handful of files. Nowadays, not so much. It might still get some things wrong, but usually because it lacks context rather than ability. Agents can write tests, run them, and iterate until they pass; then I just review the diff to make sure the tests and the solution make sense. Again, that would have failed to yield decent results even a year ago.

        • badgermurphy@lemmy.world · 18 minutes ago

          When I refer to improvements, I mean fundamental improvements to the underlying technology, which appear to be at a stubborn plateau.

          I believe the improvements you’re referring to are better guardrails. The interface is still improving with regard to context and scope, but those capabilities are separate from the underlying technology: they are bolted on top of it to keep it on task and continually aware of, and operating within, the defined context.

          Underneath, though, each new model appears to be a refactoring of the previous one that gets different, sometimes better, results; the methodology is the same, and its strengths and weaknesses remain largely unchanged.

          So, essentially, my objection to this practice is this:

          This technology has led companies to lean harder on their current people to get more done in the same amount of time with AI tools. That doesn’t seem to be succeeding at any sort of scale so far, but it’s the plan nonetheless. As a result, new talent is entering the industry at a much slower rate than before–hiring is on hold while everyone waits to see whether these tools really can replace bodies in the workforce in a serious way (again, super inconclusive at this point).

          So, looking forward even a single generation, we will have dramatically fewer experts in the field, because so many fewer people were able to start in it during this one. Since the need for programmers grows every year, either these tools will be a wild success and meet all that business demand, or there will be a crisis of demand with no easy way out.

          Since both foreseeable outcomes are detrimental to the workers themselves, what and who exactly are we rooting for? I think most people, given the choice, would pick the existing cycle with its proven track record rather than gamble on something so uncertain, with no clear economic benefit to the workers themselves.