So. "AI fun", in a different way. Within I am part of the FTPMaster team. We are multiple (as you can see here: https://www.debian.org/intro/organization#ftpmaster ). One part of what we do is writing dak, the software that keeps the Debian archive running. And that recently got a commit authored with the help of AI.

I *dislike* AI. I slept on it for a night and thought about it, and today spoke up in our IRC channel, asking that we not use AI for work on dak.

Well, obviously, we now have one of us in favour of AI usage and one against. I *currently* see no useful way out. There may be one, but what would a middle ground between AI usage and no AI usage even look like?

And no, I don't want people running off and yelling at other team members now. That isn't my intention and won't help anything. So please don't.

If you do have a suggestion about what I might be missing, then please speak up. But a plain "throw them out" is as unrealistic as "be happy and kiss AI".

@Ganneff if it's enough code to qualify as a copyrightable change, then authorship becomes an issue that moves the discussion from opinions to project policy.

@Ganneff It seems to me that because of the known dangers of AI code generation, the only way to completely trust such code is to review it at such a detailed level that one may as well have just written it manually. So there isn't any advantage unless you don't care whether it actually works in all corner cases (the usual vibe coding scenario). If the code is so short and obvious that reviewing it manually is trivial, then again why bother with AI in the first place?

@rfr The project? You mean Debian? Have you monitored the list whenever someone talks about AI work? There is no project policy.

@Ganneff Having not read the IRC chats... coordinated reviews? What is it about a submission that deviates from a purist (non-AI) submission? Obviously, if it's nonsensical garbage or a blatant abuse of reviewers' time, hard no.
If it's "plausible" and passes review and tests, label it as an AI code submission if merged?

@Ganneff no, I mean dak.

@kbm0 Well, the one commit we currently have is (halfway) easy to review, but it took quite some brainpower (and, for humans, time) to arrive at. So yes, I can somewhat see an advantage there.

@rfr Ah. Well. And there we are back to square one: we are currently discussing it internally, with one in favour, one against. (A git summary may tell you why there isn't really much of a third voice of input.)

@sternecker You can't read the chats; they are in a private channel.
The deviation is plain AI usage. It's not garbage (quite the opposite). It does not steal reviewers' time (that's not how we work). The one using it reviews it on their own and ensures the quality is fine.
(And if they hadn't said they used AI, no one would know AI was behind it.)

@Ganneff I've not yet come across such a situation myself. Usually I find code is much easier to write than it is to review completely, to the point of thorough static analysis. But when reviewing code written by another trusted developer, one usually stops short of checking *everything*. I'd suggest you cannot give that benefit of the doubt to an AI contribution. Assume that the contributor is a black hat and is deliberately trying to get a subtle bug or vulnerability past you.

@kbm0 If they wanted to play havoc, there are way easier routes for that. Both they and I have full commit access without any need for reviews. We also have full access to the prod machines (test? What's test?).

So, I'm not worried that they want to somehow commit bad things. The trust is entirely there, and has been for more than a decade. If either of us wanted to do something bad, there are much easier ways than trying to hide it in a publicly visible commit.

@Ganneff thanks for the info!

@Ganneff I am not suggesting you should distrust the human developer. But the AI-generated code itself cannot be relied upon in the same way, and should be presumed malicious. My argument is that it is always easier to write the code yourself than it is to verify it properly.

@Ganneff Personally, I watched it from the sidelines, and while I am politically very much against AI usage in the project, here it could be rebutted on copyright grounds:

There's no way that LLM-generated code can be copyrightable when there is no free LLM with its training data available.
I read which model they used, and it certainly doesn't fulfill the DFSG. If we do not care about that, then we might as well ship proprietary firmware in main.

As the problem itself is currently unsolvable (these are two fundamentally different positions, after all), we are currently down to: "Any commit that includes AI-generated content is marked with a tag in the commit message".

That doesn't make it go away, for sure, but it at least leaves a trace of where to look for this stuff when we need to.
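
For illustration only, a minimal sketch of how such a marker could be applied and later searched for. "AI-Assisted" below is a made-up placeholder; the thread does not say which tag the team actually settled on:

    # hypothetical trailer name, not the team's actual tag
    git commit --trailer "AI-Assisted: yes"
    # later, list every commit carrying that trace
    git log --grep="AI-Assisted:"

(The --trailer option needs git 2.32 or newer; with older versions, the line can simply be added to the commit message by hand.)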