Honestly it sounds like a nightmare even in the short term, companies are well known for literally poisoning everyone around them when no one else has a say in it...
The low regulation and trend of news articles from Texas... it doesn't inspire confidence.
> Do you think that Chinese labs will continue to release open models forever
Yes.
I think the Chinese government either already has, or will soon, grasp that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.
And simultaneously that the only way they can actually get everyone to use their models is if it's possible for us to run them on our own hardware.
This is going to age very poorly when the best Chinese labs ALREADY just started not open sourcing their models.
Qwen 3.7 is not open source; previous Qwen versions had open source releases, but Qwen 3.7 plus does not. The second best Chinese model, Minimax M3, is testing the waters by taking longer and longer between “model release” and open sourcing it. This time, they waited 2 weeks after release before open sourcing it. There are also a lot of rumors of GLM and Deepseek not open sourcing future models.
It’s pretty obvious that you cannot take Chinese models as open source for granted, they’ll be closed source soon.
If we're measuring progress in hours and days then yes. But if we're measuring progress in months then OSS models are doing fine. You can get state-of-the-art performance in an open model if you pretend it is January 2026 instead of June.
There is no evidence here that the cutting edge labs have any durable advantage. Extrapolating current trends it seems likely that even the Europeans will be capable of meeting any given performance measure with enough time. In fact the evidence suggests that the capital required to run the models is where a moat will develop. Knowing the weights won't help much.
Minimax M3 too, and Huawei claims to be releasing non-Nvidia-dependent training software too. openPangu 2.0 could be a shake-up if it holds up as a good model.
China may not care about open source, but they know they will personally fund AI through government investment while the US relies on private investment. The best way to scare off private investment is a free, capable alternative for everyone.
Add on the fact that they actually invested in energy infrastructure and can offer AI very cheap to their citizens and you can get a population well versed in AI to reduce menial tasks and focus on more productive things (if we're to believe the claims of the technology)
The best Chinese models are DeepSeek (general purpose) and GLM (coding), and they are both open weight and share lots of their tooling.
There are lots of AI companies and it doesn’t seem that they all have the same funding fountain or share monetization goals. I wouldn’t read much into what each one of them is doing.
Even if the models from the Chinese labs are open source or open weights even after they get to, let's say, Mythos-level intelligence, inference and the optimization needed to serve those models at speeds of 1000 tokens/sec are still not in the hands of the general public. These models have more than a trillion parameters and can't be run on publicly available hardware. So even being open source doesn't fix the problem, as the general public will still pay the company for inference.
I'm pretty sure these large models are run on Nvidia GPUs, not some unobtainable piece of secret kit. You could go down the street and buy from AMD or a number of other vendors to push out FLOPs if you wanted or needed, but you'll need a thick wallet to shell out for a cluster of GPUs to run these models. The reason people don't run the big Chinese models at home is that they can't afford the hardware, not that it isn't publicly available. This tech is essentially a large amount of matrix multiplications after all.
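For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would need. The trillion-parameter count comes from the comment above; the 80 GB per device and the precisions are my own assumptions for illustration, and the estimate ignores KV cache, activations and any speed target.

```python
import math

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

def accelerators_needed(num_params: float, bytes_per_param: float,
                        device_memory_gb: float) -> int:
    """Minimum device count to fit the weights alone (no KV cache, no activations)."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / device_memory_gb)

PARAMS = 1e12    # "more than a trillion" parameters, per the comment above
DEVICE_GB = 80   # assumed per-device memory; not a figure from this thread

for label, bytes_per_param in [("16-bit", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    gb = weight_memory_gb(PARAMS, bytes_per_param)
    n = accelerators_needed(PARAMS, bytes_per_param, DEVICE_GB)
    print(f"{label}: {gb:.0f} GB of weights, at least {n} devices")
```

Under those assumptions the hardware is something you can buy, just not something most households can pay for, which is the distinction being argued here.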
I think the larger problem is that restricting US AI companies gives the Chinese a leg up because they now have a window open where they can become the source of the most powerful models available due to government restrictions rather than on technical merits. All Anthropic customers just got a downgrade last evening, for example. While the Chinese are able to serve the world or whoever, the US corporations will be limited to the US market, or whatever the powers that be will allow. This restrictiveness could turn out to be disadvantageous to American companies since people will migrate to wherever they can get the most powerful models.
> The best Chinese models are DeepSeek (general purpose)
DeepSeek is developed by the largest Chinese hedge fund. The models they use to make money on the share market are very profitable, and they've never released anything about those models.
Somehow you are claiming that that same group of people is going to totally change their very consistent long term behaviour and start promoting openness when they are in the global leading position in AI?
The main reason the Chinese labs are releasing models as open weights is because they don't have the compute necessary to provide all of the inference. For the US frontier models something like 80-90% of the lifetime compute required for the model is inference rather than training. China wants to shepherd as much of their limited compute as possible towards training to keep up in the race.
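To make the arithmetic behind that claim explicit, here is a small sketch. It takes the 80-90% inference share from the comment above at face value and assumes an idealized lab with a fixed compute budget that hands off all inference to others; both are simplifications, not measured figures.

```python
def training_multiplier(inference_share: float) -> float:
    """How many times more compute goes to training if a lab with a fixed
    budget stops serving inference itself.

    inference_share: fraction of lifetime compute spent on inference (0 <= x < 1).
    """
    if not 0 <= inference_share < 1:
        raise ValueError("inference_share must be in [0, 1)")
    training_share = 1 - inference_share
    return 1 / training_share

for share in (0.8, 0.9):
    mult = training_multiplier(share)
    print(f"inference share {share:.0%}: training budget x{mult:.1f}")
```

So if the 80-90% figure holds, not serving inference would free up roughly 5 to 10 times more compute for training under this idealization.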
I think the main reason is to minimize the market for closed-source models from US companies.
China knows that doing what Anthropic/OpenAI/Google/... are doing is impossible for them. No one outside of China in any sane condition will send their data to compute farms IN CHINA like people currently do with US-based frontier models. Even if they could muster the inference power.
Hence they do the second-best thing possible to attack the dominance of the US-based corporations: reduce their moat by open-sourcing models that are not fully equal, but practically useful and good enough for easily 90% of typical tasks people use agents for in their daily lives. But way cheaper to run.
As long as this arms race in AI continues, China as "number two" will have some incentive to continue open-sourcing models. But of course the US government might force a change if they continue to enforce limited public access to new frontier models - there is no market to minimize if a model is not allowed to be publicly available.
But at work the calculus is entirely different. There is already lots of exposure to US companies (guess where our emails and tickets live), so the increase in espionage risk from adding another American company is small. Not zero, and trust towards AI companies is limited. But adding the first Chinese company to send data to would be a major risk. One nobody would sign off on, given the general reputation of the Chinese economy for widespread espionage, disregard for copyright and producing copies of successful products using insider information.
Not sure why anyone in the EU thinks the US is not a significant espionage risk. Adding any major US supplier would have been a significant espionage risk until really recently.
Before the EU cleaned up Europe's act pretty considerably on corruption, US companies used corporate but also state-level espionage actors to level the playing field against a culture of bribes and they were fairly open about it. They absolutely needed to do it, because of the potential penalties back home if they engaged in bribery abroad.
The tables have turned, now. The EU runs much more cleanly than decisionmaking in DC, which is clearly corrupted and lubricated with cash and opportunities for failsons and faildaughters; it has accelerated radically quite recently but it was heading that way from the first Bush era.
But I'd bet the corporate-state merger of industrial espionage is in full flow.
This would require active participation by people inside Anthropic and OpenAI. Given how generally ideological the people working in these companies are, I'd be willing to bet that we would already be reading Snowden-style leaks if it were true.
I have zero expectation that a similar culture exists inside Chinese companies. If you think these corporate and national cultures are the same, you need to adjust your priors.
> This would require active participation by people inside Anthropic and OpenAI.
Not necessarily of the companies themselves, though; just embedded people at the right hiring level.
> Given how generally ideological the people working in these companies are
History has many examples of truly surprising spies, over the long term. Including in highly ideological environments such as animal rights and eco-campaigning groups. The embedded police spying scandals in the UK make this clear.
It is naïve to think that there are no CIA or NSA employees in some functional role at these two businesses, just as it is naïve to think that they don't have intelligence industry contacts playing them because they are naïve. You only have to look at how the NSA weakened open cryptography to see that two companies staffed by young, absurdly rich people barely out of college with wobbly moral e/acc compasses might be getting played by homegrown spooks.
> I have zero expectation that a similar culture exists inside Chinese companies. If you think these corporate and national cultures are the same, you need to adjust your priors.
I suggested absolutely nothing of the sort — I flatly was not talking about China at all.
FWIW it cuts both ways: in the dim and distant past of the early dot-com era, I remember encountering someone who wafted inexplicably between US and UK multinational companies who I thought was possibly British intelligence. An odd duck for sure.
> given the general reputation of the Chinese economy for widespread espionage, disregard for copyright and producing copies of successful products using insider information
Quite funny because if you use that phrase verbatim except swapping China with the US it could actually fit.
Good governments try to do things that are in the interest of their population, and yes, that can mean interests opposed to those of your or someone else's government.
No reason to blame US, Israel, China, Russia, etc. They just defend their piece of cake.
Anthropic and OpenAI are not just "another American company", their entire business (and industry) was created based on stealing data and using it for profit. You make this point about "another company" so casually that you'd think you added a SaaS bill for generating thumbnails or whatever. The exact same point you make about China can be made much more confidently and with stronger evidence for the entire modern LLM lab industry.
Again I have to echo the previous poster's point: Most people outside of the US really do not see the US as some much better alternative than China. If anything, in the specific area of LLMs, China are the ones doing work benefitting the everyman whereas almost everything the US labs do does not.
That's why I added "Not zero, and trust towards AI companies is limited". Reaching the decision that adding one single US-based LLM provider had more benefits than risks took months. And we were selective about who that would be (hint: not OpenAI). And I know companies who are not willing to go that step, using open-weight models on their own infra instead. But outsourcing inference to China was never even a serious suggestion. The notion is absurd to us
That said, I imagine e.g. South Americans thinking very differently on this front
China indeed has a general reputation for widespread espionage, so any Chinese company wanting to expand into the European market has to prove it isn't spying on its potential customers. US companies have traditionally been seen as friendly, so their platforms are essentially built around "trust me bro" guarantees.
In a world where both China and the US are now seen as hostile-by-default, this might actually leave some Chinese companies with an advantage in their ability to demonstrate trustworthiness.
The blurring of US state and corporate espionage in the EU is the stuff of legend. They have always spied, and you can easily make the case that in late 1980s/early 1990s Europe they had good reason to, because European businesses were corrupt.
Totally agree, though it is an unpopular opinion here.
It’s the same paradox as people claiming “we are European, our data is safer in Europe”, when actually your privacy is better when your data is stored in China (or Russia): you are safer because it is out of reach of your local government.
The only thing I dislike, and that’s no matter the service, is that my data or usage information is shared with third parties.
For example, Anthropic conveniently forgets to mention Datadog has tons and tons of information about Claude users, or that your data transits through machines they don’t operate.
Safety has more than one definition. Being able to sue the company in small claims court when it threatens to delete your account is also part of that, and so is being able to pay for the service when Russian companies are once again put on a sanctions list.
China wants everyday people's data because some of those people will get power one day, and China wants to be able to leverage knowledge of you, perhaps even "deep dark secret" data, if they need to.
Israel already does this through Epstein information from all the cameras and microphones that were listening and filming all the powerful people who visited the Island and the houses. They probably have a new Epstein already.
Was going to say this. Open sourcing Chinese models will reinforce Chinese dominance instead of reducing it. When an open Chinese model becomes the best alternative to inaccessible closed US models, guess what everybody will start to use? And that same open model may embed certain narratives and values that please the Chinese government.
Ya. You know enough about China to know: would they be willing to sell users outside of China models that aren't fully pro-China (and won't deflect on tough questions)? Or would that be dirty money that they wouldn't want anyone to make?
Like if they could release Ch-ythos 6 tomorrow BUT it had Western ideals, would they take the fame, clout, attention, & profit, or stick to the party line?
(hope the monolithic brush is appropriate, considering, I mean it's an impressive system/country even if I have my own strong preferences - also I've taken as true reporting about their models deflecting etc. on sensitive topics)
The US administration restricting the use of US-trained models is one of the best gifts it could make to the Chinese LLM producers, and to the PRC government.
I won’t forgive Biden for not reversing more of Trump's policies, especially immigration.
Between RBG refusing to step down, Biden not reversing immigration policy, and Biden refusing to step down in the primary until too late, he’s going to go down as a poor president in the history books - even if he wasn’t a bad dude or even bad in terms of policy.
Trump was also getting senile before they attempted to assassinate him. Hatred of his enemies gave him another 5 years of energy. Very frustrating, because he absolutely was doing word salad nonsense like this regularly before someone tried to shoot him:
"Look, having nuclear — my uncle was a great professor and scientist and engineer, Dr. John Trump at MIT; good genes, very good genes, OK, very smart, the Wharton School of Finance, very good, very smart — you know, if you’re a conservative Republican, if I were a liberal, if, like, OK, if I ran as a liberal Democrat, they would say I'm one of the smartest people anywhere in the world — it’s true! — but when you're a conservative Republican they try — oh, do they do a number — that’s why I always start off: Went to Wharton, was a good student, went there, went there, did this, built a fortune — you know I have to give my like credentials all the time, because we’re a little disadvantaged — but you look at the nuclear deal, the thing that really bothers me — it would have been so easy, and it’s not as important as these lives are — nuclear is so powerful; my uncle explained that to me many, many years ago, the power and that was 35 years ago; he would explain the power of what's going to happen and he was right, who would have thought? — but when you look at what's going on with the four prisoners — now it used to be three, now it’s four — but when it was three and even now, I would have said it's all in the messenger; fellas, and it is fellas because, you know, they don't, they haven’t figured that the women are smarter right now than the men, so, you know, it’s gonna take them about another 150 years — but the Persians are great negotiators, the Iranians are great negotiators, so, and they, they just killed, they just killed us, this is horrible." - Donald Trump, 2016
Technically his material support to a genocide makes him complicit. It would not have been nearly at this scale without US support; tens of thousands of women and children were murdered as a direct result of his decisions[1]. If international law meant anything we would hang him for that. So no, he was a "bad dude".
It's funny how the acceleration of the downfall of the US (due to Trump) is a gift to everyone else. It's almost as if the US didn't have as positive an impact on the world as they thought.
A gift to [every dictatorial regime]. It's not a gift to the common people. The hundreds of thousands of people who got AIDS, and wouldn't have if not for Trump's withdrawal, didn't benefit. The women of Afghanistan didn't benefit. The countries of the EU... Canada... Korea... Taiwan... Ukraine... really just about any democracy didn't benefit.
The downfall of the US benefiting bad people is not evidence that the US didn't have a positive impact.
There's also the Meta motivation, that even if you don't get the control you would like from releasing a model, it may still be worth it to at least deny others that control. I'm sure that matters even more to China vs. the US than it mattered to Facebook vs. Google.
> I think the Chinese government either already has, or will soon, grasp that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.
That doesn't require the model to be SOTA; it can be just a compact model capable of running on some inexpensive hardware. That is vastly different from SOTA models like Mythos, which can potentially disrupt lots of things.
Of course it requires SOTA, people will always choose better models over some compact thing that is obviously more limited. You can't control the truth with models nobody wants to use.
People choose SOTA right now because of the heavily subsidised model subscriptions. People aren't going to pay 20x the price for a model that's maybe 10% better.
Because you communicate with it using natural language and real-world references and descriptions of what you want, you use emotion and emphasis (especially when re-prompting), you use examples and illustrative stories and common expressions. Understanding and interpreting all of that and replying in kind, to some degree, requires a large body of non-computation, cultural knowledge, or else the prompts are just meaningless words, and the replies will look like compiler output.
That sounds intuitively true, but I’m not convinced that it is actually the case. I don’t think we know enough about neural network training to say what training and how many parameters are necessary for what kind of performance on which tasks. To me it looks like we currently guess that more is better and try to throw as much compute and data at the problem as is economically feasible. There is little incentive for companies to invest into small model research since their moat is huge models that require special hardware to run.
> The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees
There's no way they have the authority to actually order this and not just request this right? If crypto is speech... LLMs definitely are...
They do have the authority to do this; Anthropic has the ability to appeal it in court, up to the SCOTUS. Lord only knows what our crazy ass judges in that court will do though.
The US tried to ban it. djb challenged it on first amendment grounds and the result was that the US government gave up trying to enforce any ban.
AI is different though because these models are private, so they cannot really be considered to be "speech". Although if it were an open model it would likely be protected speech to release it.
You have to squint to see the output of an LLM as speech. The input is clearly speech, but the government is not preventing anyone from writing or publishing prompts, only from running those prompts through the model.
In the case of the crypto export ban, the government was attempting to suppress the release of cryptographic research. For example, if a cryptographic researcher wrote a paper on a cipher and they included a definition of that cipher in the paper, that was an "export" of cryptography. This is very clearly a restraint on speech that violates the first amendment and after much legal wrangling the government agreed and the issue evaporated.
Doesn’t really matter - the government is given wide latitude by the judiciary in matters of national security. I also expect Anthropic will fight this in court if it lasts very long.
They're promising to start selling a Qiyuan A06 variant with Sodium batteries sometime this year... so if you went looking you could probably see one... or will be able to soon.
> Rubio/newsome/shapiro etc will all keep the full pressure of all allies in them, potentially kicking them out of places they already sell.
I sincerely doubt the US is capable of this. Trump has lit your soft power on fire. Trying to get people to give up a superior and cheaper product is an extremely large ask.
The leaders there know that China isn't exactly a friend of liberal western democracy. They have won the current round of propaganda, but that doesn't mean they are anyone's friend either.
No, they have won the current round of foreign relations. Threats to invade numerous allies. Blatant war crimes like murdering random people on boats. Violating established and signed trade deals left right and centre. Openly soliciting and accepting bribes. Kidnapping foreign countries' citizens and holding them in inhumane camps. None of this is a matter of "propaganda" - it's a matter of actual actions the US is taking.
> The leaders there know that China isn't exactly a friend of liberal western democracy.
Indeed, but this has never been a prerequisite for trade with liberal western democracies. See for example the gulf monarchies we trade with.
It is pretty much a prerequisite for extraordinary actions like successfully asking liberal western democracies to restrict trade though, and the US no longer meets it...
> Which is to say Trump failed at the current round of propaganda.
No, it really isn't. "Propaganda" merely refers to communication intended to influence. Trump failed when it came down to actual actions, not just communication. And when he failed in communication it was actual diplomacy meant to come to agreements, not merely the words meant to influence minds.
Propaganda is the least of the US's problems right now.
It is less about the US being capable of it than the US getting out of the way. Japan, India, and SK all have vested interests in preventing further concentration of Chinese mercantilist power. Saying an establishment US president would focus the fury and might of allies is a bit outdated, I agree. SK survived a coup, Poland is working on it. Hungary might even pull it off. Maybe the US will right the ship as well vs overcorrect into a different sort of populist autocrat. But even then, as you say, that soft power went up in flames.
The soft power that people talk about yielded instantly when used. Trump’s foreign policy has been fairly scattershot and foolish but it has only revealed that soft power is only soft. When you attempt to exercise it you find nothing there.
The other world powers are exercising their will directly through power as power: no amount of Hollywood or America Is The Good Guy belief ever bought America a trade deal or sanction power.
The only power that America has is her Navy and the nuclear weapons under the seas. Power that cannot be summoned is not power. The illusion that it is suited American allies and her wider array of beneficiaries because it allowed them to call upon the world hegemon for aid. But America is not that sole superpower anymore so it is useful to her to know the illusion for what it is: an illusion.
Yeah... that's just not the case. The US routinely successfully exercised its soft power prior to Trump 2.0. For example it's why this news article even exists - the US (under Biden) exercised its soft power to get Canada to effectively ban Chinese EVs - otherwise they would already be here.
The news article is the news article, but the reality is Canada operated under the threat of tariffs and now they have unconditional tariffs. Threatening someone with something lets you extract concessions. Using the threat removes the ability and makes it just math. China’s tariffs are more damaging than the US’s and can be lifted, so Canada makes deals with them.
It’s more a story of hard power than soft power since economic damage ultimately led the way.
> It's not actually passing every single test, though that is on purpose. I did mark some parts of the testing suite as "skipped" because I don't think it's worth recreating them in a library like this - email related stuff, i18n, perforce/svn importers, some of the midx/bitmap stuff - things of that nature. However, for everything that I'm sure is relevant to nearly anyone reading this, the Grit library/CLI can now fully pass the Git test suite.
It'd seem weird to plan to use this until the readme stops saying
> it has been nearly entirely written by agents and has not been used for realsies. It's probably currently unusably slow or completely broken in ways that are not exercised in the test suite.
Right now it's someone else's experiment that is still in the "might or might not pan out" stage.
There are a bunch of projects using the similar (not vibe coded, less fully featured) gitoxide project - there is demand for git-as-a-library.
I would not use this except to help us test it if interested. I'm announcing it because it's interesting and a milestone in the breadth of test coverage it can pass. It almost certainly cheated on a bunch of those tests and is not feature complete yet.
The author of gitoxide is also working on GitButler (who worked on this project) and we're pushing both projects forward and actively using and developing Gitoxide as well. This is simply a different and hopefully complementary approach to the same problem.
> because it's interesting and a milestone in the breadth of test coverage it can pass.
Sorry, no. Let me be candid and point out that this has achieved exactly nothing except lighting $8k on fire.
Put it this way: if I suggest to my boss, "I want to spend $8k of company money to port git to Rust to just see how many tests can pass in that project, even though I don't plan to develop new features with the project, and I don't care about adoption", he is going to shoot down the idea in half a second and seriously question my competence.
I don't think it's that clear cut. The functional parts probably aren't copyrightable, only the stylistic ones. If it goes to court, it's going to be a mix of courts applying laws in ways that haven't been tried before and fact-specific questions about what actually persisted through the LLM.
I'd be fascinated to see what happens if it does. Both in the analyses that we'd get of what the LLM did to the codebase and on the legal decisions on what the copyrightable creative elements in code actually are.
If I was the author though... there would be no way that I would be volunteering to be a test case like this. Also seems just rude for no reason.
It probably would have been less bad if he had chosen MPL-2.0 or LGPL-2.1-or-later. But he chose MIT, which cuts at the core of the intent of licensing the project with a share-alike license.
Tell me, can I create a copyrighted video that's not GPL licensed using ffmpeg?
Now tell me how creating a rust library using the git test suite is different?
But for the sake of argument: The test suite itself is copyrighted. To the extent the resulting work is a derivative of the test suite it is possibly infringing. For example you might expect that the agent would derive variable names, function names, and the structure, sequence and organization of the code from the test suite. It might even copy comments wholesale. Those are copyrightable things. (Which is of course just the first step in analyzing if it is infringement; there would be interesting fair use, de minimis copying, etc. arguments following a conclusion that any of those were copyrighted. A product produced this way definitely could be infringing given the right facts though.)
yeah fair - the "The canonical Git source code we're targeting to replicate the functionality of is in the git/ subdirectory." part makes this hard to argue against.
> To the extent the resulting work is a derivative of the test suite it is possibly infringing
It's this bit that I have a problem with. If I run the test, it fails and reports a failure. Now I write code and run the test again. What is the theory there that the code I wrote infringes?
Doesn't infringe upon copyright period, because there's no creative element in that work.
Imagine a more substantial example though. Perhaps you have a test that checks that some file written in a binary format is correct, and gives names (creative elements) to each field of the format that it prints when you mess up the field, and has comments describing why the bytes are laid out like they are (the comments being copyrightable even if the facts they describe aren't), and the LLM copies those field names and comments verbatim... Now it's quite likely that the LLM's work is a derivative of the test suite.
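A minimal sketch of the kind of test being described, to make the distinction concrete. The 8-byte header layout, the `DEMO` magic value and the field names are all hypothetical and invented for illustration; nothing here is taken from the Git test suite.

```python
import struct

# Hypothetical 8-byte header: 4-byte magic, 2-byte version,
# 2-byte entry count, all big-endian. The layout is the "fact".
HEADER = struct.Struct(">4sHH")

# The names are the test author's own choice; a different author could
# describe the same bytes with entirely different names and comments.
FIELD_NAMES = ("magic", "version", "entry_count")

def check_header(data: bytes, expected: tuple) -> list:
    """Return the names of the header fields that do not match `expected`."""
    actual = HEADER.unpack(data[:HEADER.size])
    return [name for name, got, want in zip(FIELD_NAMES, actual, expected)
            if got != want]

good = HEADER.pack(b"DEMO", 2, 5)
bad = HEADER.pack(b"DEMO", 1, 5)

print(check_header(good, (b"DEMO", 2, 5)))  # no mismatches reported
print(check_header(bad, (b"DEMO", 2, 5)))   # the mismatched field is reported by name
```

An implementation needs the byte layout to pass; it does not need `FIELD_NAMES` or the comments. Whether copying those would amount to infringement is the open legal question, and this sketch does not settle it.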
For that assertion in particular I believe I'm practically parroting a ruling by the district court in Oracle vs Google about some extremely simple Java functions that Oracle claimed Google copied. Though I can't say I checked to make sure I'm remembering right.
You're recalling it right, but there's a nice quote from Judge Alsup in that case that talks about this exact situation:
> “So long as the specific code used to implement a method is different, anyone is free under the Copyright Act to write his or her own code to carry out exactly the same function or specification...”
Here, given that this is Rust and the original expression is C, the implementations cannot be the same by definition.
I'd challenge you here to think about this in terms of the legal aspects rather than reaching specifically for similarities, as similarity is often meaningless in law or contracts when specific acts are codified rather than generalized ones.
I'd say what we're talking about here is probably a fair bit different to modding a game in most aspects.
I haven't followed any relevant cases but I would be surprised if there's any serious dispute that the common methods of modding games generally create derivative works. I think the dispute would be downstream of that as to whether or not the mods are covered by fair use.
If you did it in a loop until the test passed, maybe?
Your result is essentially impossible without the original. With ffmpeg, your result does not depend on ffmpeg specifically - you can use any video creation tool.
Repetition isn't really a factor in deciding whether something is infringing or not - check the copyright law in your jurisdiction. Here, if you look specifically at what an LLM's sampling stage is doing, it's choosing non-infringing tokens (i.e. Rust source code) over infringing ones (i.e. C source code). So it's making an intentional choice to do something similar rather than creating something that has the same expression. That doesn't seem like copyright infringement to me.
A GPL tool that processes data doesn't virally transfer the license to its output. Copyrighted ffmpeg code isn't incorporated into the video output. The LLM didn't just conjure up equivalent behavior to git without ingesting the code and transforming it as new output. There is no other behavioral description that would reproduce all needed functionality.
It would be reasonable to point an LLM at these and use them with a basic knowledge of git to produce a Rust version of git in a non-infringing manner.
If you did this manually it would take a long time.
Substitutability probably doesn't apply here in the way you're implying, and if it did it would likely be hampered by the 9th Circuit's findings about transformation in Sony v. Connectix. Arguments here would likely look at Rust not having a stable ABI, and hence not being inherently substitutable as a library (grit-lib), though that is less clear for the executable (grit-cli).
Basics of copyright law: the fundamental thing being protected is the expression. Is a Rust program's expression the same expression as a C program's? I'd say generally not.
The test suite could test aspects of the architecture/design of the codebase that are not necessary for interoperability and constitute novel expression of a piece of software in a way that is not at all language specific.
By definition a test suite is about testing interoperability with the test suite. An HTTP test suite should likely test whether response code 418 is implemented a particular way, and while humorous, it would still be an interop test, no?
Because compilers and LLMs do different things, and what is done matters, so you can't reason by stepping from one to the other.
Compilers don't axiomatically yield derivative works; in practice they simply do, because for non-trivial programs they preserve copyrightable elements of the work in the output.
Well, compilers are a mechanical transformation, and if that were sufficient to free you of IP law then IP law wouldn't work.
An LLM is also a computer program which takes input and produces output related in some way to that input. However I don't think most people would view it as a "mere" mechanical transformation. One could tautologically argue that an LLM blends the user input with the training inputs which is a sort of transformation and further that the LLM itself is a computer program thus it is mechanical in nature. However it should be immediately obvious that such an overly literal interpretation is in danger of subsuming human work as well. Where the boundary lies is an unanswered question.
Related, compilers can pose a problem depending on what the output includes. For example common lisp compilers that aren't under a permissive license are a minefield because regardless of what anyone might say the image that gets output includes (approximately) the full language implementation verbatim in addition to the user's program.
Functional parts not being copyrightable means that you can't claim a program is a copyright violation merely because it does the exact same thing for compatibility reasons (you can copy what the program does). E.g. git stores refs in .git/refs and so does grit; that's not a violation. You still can't copy the program.
Yes... and now we get to the fact-specific question of "did they copy the program". Or actually, the answer to that is plainly "no": they made something similar from it and didn't run ctrl-c ctrl-v in an unlicensed manner. The real question is "did they copy the relevant facets of the program into the new similar thing".
No. You're allowed to make a similar tool, the functional elements are not copyrightable. There's a long history, predating LLMs by many decades, of doing this in the software industry.
My use of the word "similar" does not imply here that I think it's obvious that they are "similar" in any copyrightable elements - whether they are or not is one of the interesting questions I think this case would have to resolve.
Incidentally, you're also allowed to make similar creative elements so long as they aren't copies and you did so independently... which could actually come up in a case like this (imagine the LLM produced a similar function to some function in the original... but the original wasn't in the context window at the time. Not at all unlikely with code, where there are often only one or two natural ways to write something).
The low regulation and trend of news articles from Texas... it doesn't inspire confidence.
https://news.ycombinator.com/item?id=48249747 / https://reclaimthenet.org/texas-woman-arrested-for-facebook-...
https://news.ycombinator.com/item?id=48198551 / https://www.autonocion.com/us/tesla-lithium-refinery-texas/
https://news.ycombinator.com/item?id=44121178 / https://www.texastribune.org/2025/05/28/texas-fracking-water...