Wow! And it also implements a very interesting variant of SUBLEQ that is Turing-complete.
>This VM implements an OISC - a One Instruction Set Computer. That instruction takes three signed 32-bit operands, a, b and c, and runs a program from memory m[] as follows:
1 PC (program counter) starts at 0
2 Fetch the next instruction (32-bit signed operands a, b and c)
3 If the low bit on any operand is set, remove it, and replace that operand with m[operand] i.e., a dereference of that address
4 Set m[b] = m[b] - m[a]
5 If m[b] is 0 or negative, set the PC to c, otherwise increment PC by 3 words
6 Go to step 2
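For anyone who wants to poke at it, here is a rough Python sketch of the steps quoted above, repeated in a loop. It is not the original VM. The quote leaves several things open, so these are my assumptions: memory is a word-indexed list, "remove" the low bit means clearing it (`operand & ~1`) rather than shifting, the subtraction wraps to signed 32 bits, and the machine halts when the PC leaves memory (for example a jump to -2). The names `run`, `resolve`, `to_int32` and `max_steps` are made up for illustration.

```python
def to_int32(x):
    """Wrap an integer to the signed 32-bit range (assumed overflow behaviour)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def resolve(m, operand):
    """Step 3: if the low bit is set, clear it and dereference once.

    Assumption: "remove it" means clearing the bit, not shifting right.
    """
    if operand & 1:
        return m[operand & ~1]
    return operand


def run(m, max_steps=10_000):
    """Execute the program held in m (a list of ints, modified in place).

    Returns the number of instructions executed. Halting when the PC
    leaves memory is an assumption; the quoted text gives no halt rule.
    Negative addresses are not rejected here; Python would index from
    the end of the list.
    """
    pc = 0                                    # step 1
    steps = 0
    while 0 <= pc <= len(m) - 3 and steps < max_steps:
        # step 2 (fetch) and step 3 (optional dereference)
        a, b, c = (resolve(m, op) for op in m[pc:pc + 3])
        m[b] = to_int32(m[b] - m[a])          # step 4
        pc = c if m[b] <= 0 else pc + 3       # step 5
        steps += 1
    return steps


# One instruction: a=7 has its low bit set, so it becomes m[6] = 10.
# Then m[8] = m[8] - m[10] = 3 - 5 = -2, which is <= 0, so PC jumps to -2 (halt).
program = [7, 8, -2, 0, 0, 0, 10, 0, 3, 0, 5]
run(program)
print(program[8])  # -2
```

One thing that falls out of the clear-the-bit reading: a direct operand can only name an even address, so instructions at odd PCs (3, 9, ...) can only be reached by falling through or through an indirect jump. If the real VM shifts instead of masking, only `resolve` would need to change.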
There's always someone somewhere who, with hindsight, did something that could be retconned into being similar to something important we've got today, von Däniken being an extreme example. Not putting down Losev's work, but accidentally stumbling on an interesting physical effect that you treat as a curiosity is a very different thing from engaging in targeted research to turn it into a product. For example, the FET was envisaged multiple times in the same time frame as Losev's work, but wasn't rigorously pursued until Bardeen et al. came along.
> He used these junctions to build solid-state versions of amplifiers, oscillators, and TRF and regenerative radio receivers, at frequencies up to 5 MHz, 25 years before the transistor. He even built a superheterodyne receiver.
That one calls them "negative resistance diodes" but I don't see how you can make a functional solid state amplifier and the like without it being a transistor.
The USSR famously invented everything the West did but years or even decades earlier, only for some reason never commercialised any of it, to the point where it became a bit of a running joke, like the Su-24 "validating" the design of the F-111 which preceded it by some years. So I'd take any claims like this with a grain of salt.
Yeah, that pattern can be seen everywhere in semiconductors. E.g. the transistor invention vs. Lilienfeld, Heil, Mataré etc. So the scope is narrower than "Invented Semiconductors".
Generally, there seems to be a tendency to disregard discoveries from outside the US. I think this pattern can still be observed today...
Other examples: the invention of the light bulb and the telephone.
What do you mean by "open-source"? Of course, the inference code for all the open-weight models is publicly available - see llama.cpp or HF transformers.
There are, however, very few models for which the full training pipeline is also available. Olmo by AI2 comes to mind.
Ahah I was just thinking about that tiny web server the other day and even submitted it here, but it didn't get any traction. Back then (and even now) I thought it was very impressive!
In theory I would expect them to be able to ingest the corpus of The New Yorker and turn it into a template with sub-templates, and then be able to rehydrate those templates.
The harder part seems to be synthesizing a new connection from two adjacent ideas. They like to take x and y and create x+y instead of x+y+z.
Most of the good major models are already very capable of changing their writing style.
Just give them the right writing prompt. "You are a writer for the Economist, you need to write in the house style, following the house style rules, writing for print, with no emoji .." etc etc.
The large models have already ingested plenty of New Yorker, NYT, The Times, FT, The Economist etc articles, you just need to get them away from their system prompt quirks.
I think that should be true, but doesn't hold up in practice.
I work with a good editor from a respected political outlet. I've tried hard to get current models to match his style: filling the context with previous stories, classic style guides and endless references to Strunk & White. The LLM always ends up writing something filtered through tropes, so I inevitably have to edit quite heavily, before my editor takes another pass.
It feels like LLMs have a layperson's view of writing and editing. They believe it's about tweaking sentence structure or switching in a synonym, rather than thinking hard about what you want to say, and what is worth saying.
I also don't think LLMs' writing capabilities have improved much over the last year or so, whereas coding has come on leaps and bounds. Given that good writing is a matter of taste which is beyond the direct expertise of most AI researchers (unlike coding), I doubt they'll improve much in the near future.
You're ignoring what I said. They work better when you turn it into a two-step process. Step 1: create a template. Step 2: execute the template.
>The large models have already ingested plenty of New Yorker, NYT, The Times, FT, The Economist etc articles
And that ends up diluting them. Going back and doing another pass on only a subset would give them a stronger voice. Past some threshold, ingesting more text pulls the output toward the average, a regression to the mean, instead of adding information. It's a giant table of word associations; it can regress.
Someone here mentioned a while ago that the labs deliberately haven't tried to train these characteristics out of their models, because leaving them in makes it easier to identify, and therefore exclude, LLM-generated text from their training corpus.
But it's odd that these characteristics are the same across models from different labs. I find it hard to believe that researchers across competing companies are coordinating on something like that.