Hacker News | gspr's comments

> Training on copyleft licensed code is not a license violation. Any more than a person reading it is.

Some might hold that we've granted persons certain exemptions, on account of them being persons. We do not have to grant machines the same.

> In copyright terms, it's such an extreme transformative use that copyright no longer applies.

Has the model really performed an extreme transformation if it is able to produce the training data near-verbatim? Sure, it can also produce extremely transformed versions, but is that really relevant if it holds within it enough information for a (near-)verbatim reproduction?


> Has the model really performed an extreme transformation if it is able to produce the training data near-verbatim? Sure, it can also produce extremely transformed versions, but is that really relevant if it holds within it enough information for a (near-)verbatim reproduction?

I feel as though, from an information-theoretic standpoint, it can't be possible that an LLM (which is almost certainly <1 TB big) can contain any substantial verbatim portion of its training corpus, which includes audio, images, and videos.
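As a rough sanity check of that intuition, here is the back-of-envelope arithmetic. Both figures below are illustrative assumptions, not measurements of any real model or corpus:

```python
# Back-of-envelope arithmetic for the claim above. Both numbers are
# assumptions for illustration, not measurements of any real system.

model_size_bytes = 1e12    # assumed: a large LLM's weights fit in ~1 TB
corpus_size_bytes = 1e16   # assumed: ~10 PB of mixed text/image/video data

# Even in the impossible best case where every byte of the weights were
# spent on lossless verbatim storage (it isn't -- the weights also have to
# encode everything else the model does), the model could hold only:
max_verbatim_fraction = model_size_bytes / corpus_size_bytes
print(f"at most {max_verbatim_fraction:.2%} of the corpus verbatim")  # → at most 0.01%
```

This doesn't rule out memorization of individual items, of course: heavily repeated snippets can still come back near-verbatim, which is consistent with the repetition point made elsewhere in the thread.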


> We do not have to grant machines the same.

No, we don't have to, but so far we do, because that's the most legally consistent approach. If you want to change that, you're going to need to pass new laws that may wind up radically redefining intellectual property.

> Has the model really performed an extreme transformation if it is able to produce the training data near-verbatim?

Of course it has, if the transformation is extreme, as it appears to be here. If I memorize the lyrics to a bunch of love songs, and then write my own love song where every line is new, nobody's going to successfully sue me just because I can sing a bunch of other songs from memory.

Also, it's not even remotely clear that the LLM can produce the training data near-verbatim. Generally it can't, unless it's something that it's been trained on with high levels of repetition.


I want to briefly pick at this:

> you're going to need to pass new laws that may wind up radically redefining intellectual property

You're correct that this is one route to resolving the situation, but I think it's reasonable to lean more strongly into the original intent of intellectual property laws (to defend creative works as a manner to sustain yourself), which would draw a pretty clear distinction between human creativity and reuse on the one hand, and LLMs on the other.


> into the original intent of intellectual property laws to defend creative works as a manner to sustain yourself

But you're missing the other half of copyright law, which is the original intent to promote the public good.

That's why fair use exists, for the public good. And that's why the main legal argument behind LLM training is fair use -- that the resulting product doesn't compete directly with the originals, and is in the public good.

In other words, if you write an autobiography, you're not losing significant sales because people are asking an LLM about your life.


> We'll need to figure out the techniques and strategies that let us merge AI code sight unseen.

Why do you assume that's doable? I'm not saying it's not, but it seems strange to just take for granted that it is.


Why do you assume I assume it's doable? :P

For real, I'm not certain we will ever be able to merge AI code without human review. But:

1. Every time I've confidently thought "AI will never be able to do X" in the last year, I've later been proven wrong, so I'm a bit wary to assume that again without strong reasons.

2. I see blog posts by some of the most AI-forward people that seem to imply some people are already managing large codebases without human review of raw code. Maybe they're full of crap - there are certainly plenty of over-credulous bs artists in the AI space - but maybe they're not.

3. The returns on figuring this out are so incredibly high that, if it's possible, people will figure it out.

All that to say: it's far from certain, but my bias is that it is possible.


1. Every time I've confidently stated "this AI architecture will never be able to do X" in the past 6 years, I've not been proven wrong (with one possible exception earlier today: https://news.ycombinator.com/item?id=47291893 – the jury's still out on that one). … No, my version doesn't really work, does it? It just sounds like bragging, or maybe hubris.

> some people are already managing large codebases without human review of raw code.

2. I have never believed this to be impossible. I do, however, maintain that these codebases are necessarily some combination of useless, plagiarized, and bloated. I have yet to see a case where there isn't a smaller, cheaper way to accomplish the same task faster and better.

> The returns on figuring this out are so incredibly high

3. And yet, they still haven't figured it out. My bias is that it isn't possible, because nothing has fundamentally changed about the model architectures since I first skimmed a PDF about GPT, and imagined an informal limiting proof that I still haven't found any holes in.


> Why do you assume I assume it's doable? :P

Because you say we need to figure out techniques to do it. If it's not possible, then there are no techniques to do it. Since you want the techniques, I assume you assume that they exist.

> 1. Every time I've confidently thought "AI will never be able to do X" in the last year, I've later been proven wrong, so I'm a bit wary to assume that again without strong reasons.

That's evidence that you shouldn't assume something is impossible. I'm not suggesting that, either.

> 2. I see blog posts by some of the most AI-forward people that seem to imply some people are already managing large codebases without human review of raw code. Maybe they're full of crap - there are certainly plenty of over-credulous bs artists in the AI space - but maybe they're not.

Do you have any idea whether this works well though?

> 3. The returns on figuring this out are so incredibly high that, if it's possible, people will figure it out.

Ok. But again, that's a big if there.

The returns on breaking a popular cryptographic algorithm are also huge, but that's not an indication that it's possible, or that it's impossible for that matter.

I'm baffled why people think that "it would be great if..." has any bearing on the chances that the thing that follows is true.


Your attempts at derailing the discourse are not only frustrating – in the case of climate change they might just kill us all. You're a danger.

I don't know what bothers me more, the guy trolling or you calling people "a danger" for posting literally a single question.

It's a bad faith question or one so deeply uninformed that parent is correct. It only takes a couple clicks to see the ideas of the people who are "just asking questions".

Chill. People need to cope, and humor, sarcasm, etc. are OK.

It is neither of those things. Their post history is pretty clear.

I swear “check their post history” has got to be the weakest form of ad hominem going. “I can deflect from answering difficult questions if I attack the messenger” is just so weak.

You are mistaken, probably not for the first time today.


This is just another insufferable comment in your post history.

The dark future possibility here is that the big guy is allowed to launder the intellectual property of the little guy, but not vice versa.

That dark future is now, look at case law as applied to the AI operators vs the 'little guys'.

Even the big copyright firms do this. Disney especially is known for rehashing existing material and then not allowing anyone else to do the same with their stuff. Disney does not have a lot of original stories.

> If “AI-rewriting” is accepted as a valid way to change licenses, it represents the end of Copyleft. Any developer could take a GPL-licensed project, feed it into an LLM with the prompt “Rewrite this in a different style,” and release it under MIT. The legal and ethical lines are still being drawn, and the chardet v7.0.0 case is one of the first real-world tests.

This isn't even limited to "the end of copyleft"; it's the end of all copyright! At least copyright protecting the little guy. If you have deep enough pockets to create LLMs, you can in this potential future use them to wash away anyone's copyright for any work. Why would the GPL be the only target? If it works for the GPL, it surely also works for your photographs, poetry – or hell even proprietary software?


> It is fairly ok for meetings

… when it works. And if you never have to change camera or microphone settings.

> and calendar integration.

The little notification that pops up telling you your meeting is about to start based on your calendar? The one you better not click in the first 5 or so seconds it's there, because then you'll end up with an error message that tells you absolutely nothing, have to go back to the chat, and try again?

No, it's not usable. For anything.


Providing it for convenience is fine. Having to accept the terms and conditions of Google or Apple as the only way is insane.

Don’t accept the terms then? What is even the point of this comment when you are provided with an alternative as you make your way through the form?

I feel like everyone I’m responding to just hates Apple/Google and is running GrapheneOS without Play services, so they hate public apps.


No, installing apps is just genuinely bad UX, at least for me, because my password manager doesn't work in apps, and I have to manually generate and copy and paste passwords. If the creators of the app are extra stupid, they make it so that you cannot paste into the password input, so I have to enter the generated password character by character.

You use a broken password manager, and you think this is a problem with _apps_?

And why is it broken? Is there a way for a password manager app to somehow inspect other apps and identify forms within them and interact with the forms?

...yes? I can't tell if you're trolling at this point or genuinely unaware.

Both iOS and Android have APIs for this, you (as the app developer) just mark the relevant fields in the app as login/password/etc, and the OS will interact with your chosen password manager to autofill and/or save them.


Well then I don't know whom to blame, 1Password or app developers not marking the fields correctly.

If you've never seen this work on your device, then you might have something configured incorrectly — many app developers are incompetent and bad at this; but not _all_ of them.

It works for filling in an existing password, but not for creating a new one; iOS still prompts me to fill in an existing one even though I'm on the sign-up page. On the web 1Password can also automatically generate a Fastmail masked email address, but I doubt there's any hope for that to work in a native app.

I've flown with many European budget carriers and have never once seen this requirement. Sure, they might charge for or not provide printed boarding passes, but they've always sent me a PDF or PNG boarding pass by e-mail or provided one through their website. That, in my book, is a non-issue. Forcing an app is a huge issue, and shouldn't be legal if the only reasonable way to get the app is agreeing to the draconian conditions of one of two gatekeeping companies subject to foreign jurisdictions.

This is an insane take. Apple and Google both reserve the right to deny accounts to people without any legal appeal at worst. At best, a legal dispute would have to be resolved in US courts. How other countries (including my own) accept that as a condition to use public services is beyond belief, and pointing this out is not an overreaction.

Why are we willingly placing private companies – private companies subject to foreign jurisdictions, even! – in the role of gatekeepers of public services? We have surely completely lost our minds!


You can literally just use the online form. Have you actually tried searching for how to apply, and seen just how easy it is to optionally apply online?

Over half of the world’s population is using an Android or iOS device. Most people visiting a country like the UK, or who have the means to afford such a trip, most likely have a functioning mobile phone.

I find it somewhat amusing you think I’m “insane” for suggesting most of the modern world has a relatively accessible Android or iOS device to apply for a visa.


> You can literally just use the online form. Have you actually tried searching for how to apply, and seen just how easy it is to optionally apply online?

This whole story is about how they're trying to pressure you into using the app.

> Over half of the world’s population is using an Android or iOS device. Most people visiting a country like the UK, or who have the means to afford such a trip, most likely have a functioning mobile phone.

That does not in any way affect any of what I wrote. I'll try to write it differently: Do you think it's OK that Google and Apple decide (at worst on their very own without oversight, at best with the oversight of a foreign country that isn't the one you're travelling to or from) who gets to do these things and under what conditions?

> I find it somewhat amusing you think I’m “insane” for suggesting most of the modern world has a relatively accessible Android or iOS device to apply for a visa.

I find it insane that you think that because Google and Apple happen to grace most people with access to Android and iOS, then it's fine that we all live by their mercy.


I’m sorry but I can’t respond further; this conversation is going nowhere. You clearly have it out for Apple and Google, and you are not being “pressured” into doing anything.

Critical reading and thinking would lead you through the flow to click on the “continue application online” form.

Here’s my workflow:

- Visit the main ETA site: https://www.gov.uk/eta/apply

- You scroll down just a teeny tiny bit until you see “Apply online”

- You select “Start now”

- Submit


I thought so too, but if you follow the 'Start now' link you get a page full of attempts to push you toward the app; then, if you say you can't (all the way at the bottom), you get another page trying to help you install the app, and only then do you actually get the form. I'm quite disappointed; usually the UK government's digital services are not quite so user-hostile.

There is also a new UK government requirement to verify your identity if you are a director or significant shareholder in a UK company.

The online route for that goes through a couple of pages then says "now switch to the app on your smartphone". In theory you can also go to a Post Office to get your documents checked but it didn't work for me.


It's insane that a multi-trillion-pound government would rely on a foreign private entity for something so simple yet critical. The only sane answer here is corruption.

I actually don't think it's corruption. I think it's incompetence. But that might be even harder to overcome.

Good thing for them they're in it to have fun, not to please you. And good thing for you they're not charging you for it if you do want to try.

