Hacker News | brucehoult's comments

Trivial with a 10c microcontroller ...

"Trivial". But then you realize you forgot to account for a 32 bit counter wrapping around, or potential failures in the power supply or other capacitors.

That 10c microcontroller has 15 32 bit registers, allowing you to make up to a 480 bit counter. That ought to be enough until long after the heat death of the universe.

It also has 2k (16384 bits) of SRAM, allowing even larger counters.

It runs off 2.8V - 5.5V DC, so supplying power is pretty trivial. Doesn't need a crystal, though of course adding one will improve the timing accuracy.


somewhere else they were discussing how to use a 555 to time 55 years, and how for such a long period you'd need impractical resistance and capacitance values. easy workaround would be to set a more reasonable period, say, 1 sec, and use a counter to know when you hit 55 years. coincidentally, 55 years is 2 ** 30.7 seconds, so it'd just fit in a 32 bit register.

though i take it you were thinking about counting clock cycles or something, in which case surely your register would overflow
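As a quick check of that arithmetic, here is a minimal Python sketch. It assumes a 365.25-day year and a 1 Hz tick; neither is stated above, so treat the exact tick count as illustrative.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: Julian year
ticks = int(55 * SECONDS_PER_YEAR)          # number of 1 Hz ticks in 55 years

bits_needed = math.log2(ticks)
fits_in_u32 = ticks < 2**32
print(f"{ticks} ticks = 2 ** {bits_needed:.1f}, fits in 32 bits: {fits_in_u32}")
```

Counting clock cycles at MHz rates instead of seconds would overflow 32 bits quickly, which is where multi-register counters come in.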


The size of a register is not the largest value you can conveniently count on a computer. You can use multiple registers.

Old computers often had a "carry flag" specifically to make this easier, e.g. on Arm (note `adds`, since plain `add` does not update the flags):

    adds r0,r0,#1
    adc r1,r1,#0

But even on RISC-V, often criticised for not having a carry flag, it's not hard:

    addi  a0,a0,1
    sltiu t1,a0,1 # set to 1 if a0 wrapped back to 0
    add   a1,a1,t1
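
The same carry-ripple idea extends to any number of words. A minimal Python sketch, modelling registers as a little-endian list of 32-bit values (the `increment` helper and the list layout are my own illustration, not anything from a real toolchain):

```python
MASK32 = 0xFFFFFFFF

def increment(words):
    """Increment a little-endian list of 32-bit words in place, rippling the carry."""
    carry = 1
    for i in range(len(words)):
        words[i] = (words[i] + carry) & MASK32
        # Same test as `sltiu t1,a0,1`: after adding 1, the word wrapped
        # if and only if it is now 0.
        carry = 1 if words[i] == 0 else 0
        if not carry:
            break
    return words

counter = [0xFFFFFFFF, 0x00000000]   # low word about to wrap
increment(counter)
print([hex(w) for w in counter])     # low word wraps to 0, high word becomes 1
```

With 15 words this models the 480 bit counter mentioned earlier; the loop almost always exits after the first word, so the extra width costs next to nothing per tick.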

Easier with a calendar reminder

I find it much easier to write a ten line program for an 8 pin CH32V003 (or ATtiny85 in past times) to do exactly the timing or ADC comparisons I want than to figure out the circuit and component values for a 555 or op-amp.

For that matter, a 16 pin CH32V003 can emulate a vast array of 7400 series devices as long as you don't need ns timing — no problem for µs. It's also cheaper.


Using a cpu running software to emulate a handful of gates is just the furthest thing from interesting. It's the inverse of elegant.

Until you go to lay out your circuit board. There's a reason microcontrollers are used for tasks like debouncing switches.

I said uninteresting and inelegant. No one disputes that brute force is functional.

"There's a reason microcontrollers are used for tasks like debouncing switches."

Because people are too cheap (or fail that hard at basic analog electronic control) to get a proper single-pole single-throw switch with a pair of MOSFETs in a monostable mode, or use an S-R flip-flop latch to debounce, or even a very simple R-C filter circuit.

"Throw a microcontroller on it and call it a day" is the surest sign of someone not properly educated in electronic engineering.


I think it's like living under a waterfall.

If you live under a waterfall you'll use 1000 gallons of fresh water pumped at blasting high speed to wash a cup.

We live under a waterfall of cpus and gates in general, and organisms don't care if their environment is perverse. A thoughtless organism will happily consume 1000 units of a free resource just to get 1 unit of some other non-free resource.

And a lot of humans are the worst. Thinking beings who elect not to care about anything like that. Like spammers that operate simply because sending email is free for the sender. They get almost nothing from it, and it costs everyone else a lot, but it costs them even less than the tiny bit they gain, and the external costs don't matter to them the tiniest bit.

But the environment is perverse, created by economies of scale and Asian slave labor and the push for advancement for its own sake which makes existing useful things artificially low value by being "obsolete".

A software version of that might be making apps with Electron. It doesn't matter how much cpu and ram and disk and general mass of tech stack it takes to make some trivial app. The developer's precious time outweighs all other considerations. If they can make the app in a few minutes with no effort instead of a few hours, it doesn't matter how much of everyone else's resources they consume since their time is valuable and 1M other people's cpus are free.


All of that stuff is more expensive and uses more board space.

Which is why my bluetooth keyboard/mouse controller pad has that (specifically resistor/cap circuit under each key) riiiiiiiiiiiiiiight?

Not THAT much of a watershed. I didn't even see a lot of stuff about the equivalent first ARMv9 SBCs (and first with SVE almost TEN YEARS after the spec was published), the Radxa Orion O6 a year ago and Orange Pi 6 Plus half a year ago (same chip).

Also, none of us actually HAVE them yet. Sure, I've been using a pre-production board at SpacemiT via ssh to China since mid January, but it's still probably two weeks until I'll have one in front of me and I can browse the web and watch YouTube on it etc.

All the things we could do via ssh were published three months ago. LivingLinux for example has a whole series of videos on YouTube.

https://www.youtube.com/playlist?list=PLYxFtt1xWrthuSGclxIsw...

There's plenty of coverage over on r/riscv and r/spacemit_riscv


> rlwimi / rlwinm

Definitely a nice and pretty much pioneering feature on PowerPC in 1994 (and I guess RS/6000 before that, but I never used one).

Today's Arm64 BFM does both those jobs in one, minus the ability to create a split mask via rotating, but plus adding a choice of sign or zero extension to extracted fields (including extracted to the same place they already were, for pure sign/zero extension). As a result it's got about 100 aliases.

It would be nice to have these in RISC-V but they seriously violate the quite strict "Stanford Standard RISC" 2R1W principle that keeps the RISC-V integer pipeline simple (smaller, faster, cheaper).

When working in the "B" extension working group I suggested adopting the M88000 bitfield instructions which follow the 2R1W principle. Someone had an objection to encoding both field width and offset into a single constant (or `Rs2`), though I think it's well worth it. M88k as a 32 bit ISA used 5 bits for each, but 6 bits for each for RV64 fits RISC-V's 12 bit immediates perfectly.

- ext / extu: Extract signed or unsigned bit field from a register. You specify offset (starting bit position) and width. The extracted field is right-justified (shifted to the low bits) in the destination, with sign-extension or zero-extension.

- mak: Make (insert) a bit field. Takes a value, shifts it left by the offset, and inserts it into the destination while clearing the target field first (or combining in specific ways).

- set: Set (force to 1) a contiguous bit field in a register.

- clr: Clear (force to 0) a contiguous bit field in a register.

All take `Rd`, `Rs1` and a field size:offset as either a literal or as `Rs2`.
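
To make those descriptions concrete, here is a rough Python sketch of the five operations on 64-bit values. It is my reading of the summaries above rather than the M88000 manual's exact semantics: `mak` is modelled as "mask the low `width` bits and shift them into place, other bits zero", `set` is renamed `set_field` to avoid the Python builtin, and `offset + width <= 64` with `width >= 1` is assumed.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def field_mask(offset, width):
    """Ones in bits [offset+width-1 : offset], zeros elsewhere."""
    return (((1 << width) - 1) << offset) & MASK

def extu(src, offset, width):
    """Extract a field, right-justified and zero-extended."""
    return (src >> offset) & ((1 << width) - 1)

def ext(src, offset, width):
    """Extract a field, right-justified and sign-extended to XLEN bits."""
    value = extu(src, offset, width)
    sign = 1 << (width - 1)
    return ((value ^ sign) - sign) & MASK

def mak(src, offset, width):
    """Take the low `width` bits of src and move them up to `offset`."""
    return (src << offset) & field_mask(offset, width)

def set_field(src, offset, width):
    """Force a contiguous field to all ones."""
    return src | field_mask(offset, width)

def clr(src, offset, width):
    """Force a contiguous field to all zeros."""
    return src & ~field_mask(offset, width) & MASK

print(hex(extu(0xABCD, 4, 8)))   # 0xbc
print(hex(ext(0xABCD, 4, 8)))    # 0xffffffffffffffbc (0xbc has its top bit set)
print(hex(mak(0x1FF, 4, 8)))     # 0xff0
```

Each function reads at most two inputs (a source value and a field spec) and produces one result, which is the 2R1W shape being argued for.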

Unfortunately, the R-type `mak` violates 2R1W because the `Rd` is also a source, which complicates OoO implementations, making them 3R1W. RISC-V could use an alternative formulation in which `mak` (or some other name) masks off the source field and shifts it into place, and then the insert is completed using `clr` and `or`.

On the other hand the forms with 12 bit literals are expensive in encoding space, but even including just the `Rs2` versions would be great, especially as often several instructions in a row can use the same field specification, which fits `addi Rd,zero,imm12` (aka `li`) perfectly.

On the gripping hand, while the immediate version of `mak` violates RISC-V convention by making the `Rd` also a source, any real pipeline is going to have fields for all of `Rd`, `Rs1`, `Rs2`, and `imm32` so only the decoder is affected.

Also, `ext` / `extu` are not needed as a pair of C-extension shifts do the same job with the same code size, and can be decoded into a single µop on a higher end CPU if desired.

As an example: take a 10 bit field at offset 21 and insert into a destination at offset 1 (this is part of decoding RISC-V J/JAL instructions).

PowerPC:

    rlwimi  r4, r3, 12, 21, 30   # rotate left 12 (bit 21 → bit 1), mask = IBM bits 21..30 = bits [10:1]

Arm64:

    ubfx   x2, x0, #21, #10      // extract bits[30:21] → low 10 bits of x2 (unsigned)
    bfi    x1, x2, #1, #10       // insert those 10 bits into x1 starting at bit 1

Alternatively, using `ubfm` / `bfm` directly without aliases (exactly the same instructions, just trickier to get right):

    ubfm   x2, x0, #21, #30      // immr = lsb, imms = lsb + width - 1
    bfm    x1, x2, #63, #9       // immr = (64 - lsb) mod 64, imms = width - 1

M88k:

    extu   r3, r1, 21, 10        # extract 10-bit field starting at bit 21 → low bits of r3
    mak    r2, r3, 1, 10         # make/insert the field at bit 1 in destination

RISC-V:

    srli   x12, x10, 21          # shift field down to low bits
    andi   x12, x12, 0x3FF       # mask to 10 bits
    slli   x12, x12, 1           # position at bit 1 (for imm[10:1])
    li     x13, ~0x7FE           # mask to clear bits [10:1] only
    and    x11, x11, x13
    or     x11, x11, x12         # insert the field

RISC-V with some M88k inspiration:

    extui  r3, r1, 21, 10        # extract 10-bit field starting at bit 21 → low bits of r3
    maki   r4, r3, 1, 10         # modified mak: masks + shifts field to bits [10:1] (others 0)
    clri   r2, 1, 10             # clear the target field in destination
    or     r2, r2, r4            # insert the prepared field

Alternatively:

    li     t0, (1<<6) | 10       # specification for insertion bit field
    srli   a3, a1, 21            # shift 10-bit field starting at bit 21 → low bits of a3
    mak    a4, a3, t0            # modified mak: masks + shifts field to bits [10:1] (others 0)
    clr    a2, t0                # clear the target field in destination
    or     a2, a2, a4            # insert the prepared field

Alternatively:

    srli   a3, a1, 21
    maki   a2, a3, (1<<6) | 10   # decoder expands to `maki a2, a2, a3, (1<<6) | 10`

Again, this last formulation of `maki` violates RISC-V instruction format convention in making `a2` both src and dst, BUT if the decoder handles that then the expanded form does NOT cause any issues with the pipeline implementation.
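
For anyone wanting to sanity-check the sequences, here is a small Python sketch that models the base RISC-V sequence and the `mak`/`clr`/`or` variant one statement per instruction and compares both against a direct definition of the field move. The `mak` and `clr` semantics and the `(offset << 6) | width` packing are the hypothetical ones proposed above, not ratified instructions.

```python
MASK64 = (1 << 64) - 1

def reference(dst, src):
    """Copy bits [30:21] of src into bits [10:1] of dst, leaving other dst bits alone."""
    field = (src >> 21) & 0x3FF
    return (dst & ~(0x3FF << 1) & MASK64) | (field << 1)

def base_riscv(dst, src):
    x12 = src >> 21                    # srli x12, x10, 21
    x12 &= 0x3FF                       # andi x12, x12, 0x3FF
    x12 = (x12 << 1) & MASK64          # slli x12, x12, 1
    x13 = ~0x7FE & MASK64              # li   x13, ~0x7FE
    x11 = dst & x13                    # and  x11, x11, x13
    return x11 | x12                   # or   x11, x11, x12

def proposed(dst, src):
    t0 = (1 << 6) | 10                 # li   t0, (1<<6) | 10
    offset, width = t0 >> 6, t0 & 0x3F
    mask = ((1 << width) - 1) << offset
    a3 = src >> 21                     # srli a3, a1, 21
    a4 = (a3 << offset) & mask         # mak  a4, a3, t0   (hypothetical)
    a2 = dst & ~mask & MASK64          # clr  a2, t0       (hypothetical)
    return a2 | a4                     # or   a2, a2, a4

samples = [(0, MASK64), (MASK64, 0), (0x123456789ABCDEF0, 0x0FEDCBA987654321)]
for dst, src in samples:
    assert base_riscv(dst, src) == reference(dst, src)
    assert proposed(dst, src) == reference(dst, src)
print(hex(reference(0, 0x7FE00000)))   # 0x7fe: the 10-bit field now sits at bits [10:1]
```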


bitfield insert/extract was also looked at by the scalar efficiency SIG: https://lists.riscv.org/g/sig-scalar-efficiency/topic/115060...

IIRC it didn't go anywhere, because it wasn't worth the encoding space.

But a rlwimi sounds like a good candidate for >32b encoding.


Both the PowerPC and Arm64 instructions do grab a lot of encoding space.

rlwimi uses 26 bits of opcode space (i.e. 2^26 = 64M code points). In a RISC-V context you can drop the Rc (set status flags) bit, but for RV64 you need to expand the shift/start/end fields from 5 to 6 bits, so you end up needing 28 bits of encoding space, 18 for the field spec and 5 each for Rs1 and Rd/Rs2.

A RISC-V major opcode, such as OP-IMM (which this effectively is, but with a R/W Rd/Rs2) only has 25 bits of encoding space (2^25 code points) for all instructions in total!

PPC64's rldimi expands shift and size to 6 bits each but drops the ability to take the source field from an arbitrary position, taking it only from the LSBs, and so uses 23 encoding bits, i.e. exactly my proposed RISC-V instruction (except for the set flags bit, so 22 bits).

Arm64's BFM/SBFM effectively uses 24 bits to provide both 32 bit and 64 bit operations — there are 25 bits but `sf` and `N` must be the same, potentially allowing the other half of the code points (plus the ones for 32 bit with the MSBs of `immr` and `imms` set) to be used for something else in future. Note that BFM leaves all other bits in the dst unchanged, while SBFM both sign-extends into the higher bits of dst AND zeros the lower bits of DST.

So BFM/SBFM *could* be fit into RISC-V, taking up half of a major opcode, of which there aren't many left. That is a pretty huge amount — the enormous V extension takes 1 1/2 major opcodes, for far more functionality. It would free up various immediate shifts and sign/zero extension instructions, but those don't take much encoding space, no more than 16 bits each.

As nice as they are, it's hard to avoid a conclusion that both (32 bit) PowerPC and Arm64 spend too much opcode space on these.

I think PPC64's `rldimi`, M88K's `mak` (extended to 64 bits), and my last RISC-V suggestion (which are all effectively the same thing) hit the right tradeoff, not using excessive encoding space but allowing a 2-instruction sequence for that bit field move:

    srli   a3, a1, 21
    maki   a2, a3, (1<<6) | 10   # decoder expands to `maki a2, a2, a3, (1<<6) | 10`

That's 22 bits of opcode space, the same as any one of `addi`, `andi`, `ori`, `xori`, `slti`, `sltiu` (OP-IMM) or `addiw` (OP-IMM-32).
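
The bit counts in this subthread can be tallied mechanically. A minimal Python sketch follows; the PowerPC and `addi` field names come from the respective instruction formats, while the `rlwimi_rv64` and `maki` layouts are the hypothetical RISC-V encodings discussed here, not anything ratified.

```python
# Field widths in bits, excluding each ISA's fixed opcode bits.
encodings = {
    "rlwimi":      {"rS": 5, "rA": 5, "SH": 5, "MB": 5, "ME": 5, "Rc": 1},
    "rlwimi_rv64": {"Rs1": 5, "Rd/Rs2": 5, "shift": 6, "start": 6, "end": 6},  # hypothetical
    "rldimi":      {"rS": 5, "rA": 5, "sh": 6, "mb": 6, "Rc": 1},
    "maki":        {"Rs1": 5, "Rd/Rs2": 5, "offset": 6, "width": 6},           # hypothetical
    "addi":        {"rd": 5, "rs1": 5, "imm12": 12},
}

# A 32-bit RISC-V instruction minus its 7-bit major opcode field.
RISCV_MAJOR_OPCODE_BITS = 32 - 7

bits = {name: sum(fields.values()) for name, fields in encodings.items()}
for name, n in bits.items():
    share = 2**n / 2**RISCV_MAJOR_OPCODE_BITS
    print(f"{name:12} {n} bits = {share:g} of a RISC-V major opcode")
```

On these numbers a full RV64 `rlwimi` would need eight major opcodes' worth of space, while `maki` needs one eighth of one, the same as `addi`.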

The original RV64GC has 5/8 funct3 encodings in OP-IMM-32 unused, which `maki` (or call it `bfi` or whatever) could have used one of. It has a combined `Rd`/`Rs2` field which is unusual in full size 4-byte RISC-V instructions, but not unprecedented: the V extension does that for multiply-add instructions.

I don't immediately see any ratified or currently-proposed extension using this space.


What would justify using this significant space for them these days? Video encoding/decoding in software seems like the most likely candidate, since there's a lot of bitfield packing and high data volume.

(Thanks for your elaboration on various architectures. It's an interesting glimpse into what goes on in allocating opcode space on fixed-length instruction machines.)


My example is applicable to compiler / assembler / JIT / emulator.

The performance of conventional compilers and assemblers is not important to anyone but developers, but everyone uses JavaScript / WebAsm all the time. And QEMU can be important too (e.g. in docker for non-native ISAs, using binfmt_misc).

I guess I should point out in the proposed RISC-V example, it's 6 bytes of code as the initial shift can be a 2-byte "C" extension instruction. So that's slightly smaller code than everything except 32 bit PowerPC, which is another important aspect. Arm64 and M88k use 8 bytes of code.

Oh! I just realised standard RISC-V can be improved in this case (but not by so much in the general case).

    srli   x12, x10, 20          # shift field down to correct position
    andi   x12, x12, 0x7FE       # mask to 10 bits
    andi   x11, x11, ~0x7FE      # clear space in the destination
    or     x11, x11, x12         # insert the field

That's just 12 bytes of code.

In the more general case you need a `lui` or `lui;andi` pair to load the mask into a register, and then register to register ops, for 14 bytes total.

Note that x86_64 needs four instructions and 14 bytes of code, so no better than RISC-V.


Compared to Delhi? Ok. But I've had a soaking uncomfortable shirt every time I've been to Vegas, while in Phoenix it evaporates quickly.


I was also on BIX, then NLZ, same name as here. Even made it into the "Best of BIX" in the back of BYTE a couple of times.

Living in New Zealand, it wasn't easy to meet people — or for that matter to access BIX! I was fortunate that from mid 1986 my employer paid for access via X.25 [1] for several years until telnet was possible from Actrix BBS.

jdow took me to LASFS once in 1989 and I think I saw JP from a distance. But in 2004 I spontaneously caught a flight to LA for the historic SpaceShipOne 100km high flight. jpistritto picked me up at the airport and we drove to Mojave. Parking at the XCOR hangar, david42 and his wife Rita pulled up next to us in an RX7. There was a party in the hangar that evening, I got to talk with JP and LN and many others, and at one point helped Doug Jones (can't remember if he was on BIX) make LN2 ice cream. A lot of us slept in the hangar. In the morning I helped shadow cook bacon&eggs for everyone, before we all went out to watch the flight.

Also at other times got to meetups in Phoenix, New York (a lot of C++ crowd there), New Haven (people came down from Boston), Seattle.

Good times.

[1] NZ$13.20 per kilosegment (ISTR even more at first!) .. up to 64k bytes if you filled the packets, but possibly as little as 1000 bytes if there was only 1 byte per packet e.g. sitting there and hitting return: so I always filed all new messages to scratchpad and then did either SHOW or else download via X/Y/Z modem.


Interesting (but understandable pre-silicon) to see a couple of errors about the 6502 in that, e.g. SBC needs SEC before it, not CLC. The code examples could be improved too, e.g. the 6502 memory copy has no need to use both index registers and increment them in lockstep with the same values. And better still, since you're copying fewer than 256 bytes, initialize one index register to COUNT-1 and copy from last to first.

On the other hand the 6800 code is buggy too. It's incrementing only one byte of the FROM and TO pointers — and the MSB at that on a bigendian machine — with no provision for crossing a page boundary, when the normal thing is to

    LDX FROM
    LDAA 0,X
    INX
    STX FROM
    LDX TO
    STAA 0,X
    INX
    STX TO

Still, as they say, much messier than 6502's...

    LDA FROM,X
    STA TO,X
    INX

... even if the 6502 needs an outer loop to copy more than 256 bytes, at least the inner loop is fast.

Also no mention is made of `(ZP),Y` addressing mode which takes 6502 to another level entirely.
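
The "copy from last to first with one index register" idea mentioned above can be modelled in a few lines of Python. This is only a sketch of the control flow: memory is a `bytearray`, X is kept to 8 bits by masking, and stopping when X wraps from 0 to $FF is my assumption about how the loop would terminate (the `copy_block` name is made up for illustration).

```python
def copy_block(mem, src, dst, count):
    """Copy `count` bytes using a single 8-bit index that counts down to 0."""
    assert 1 <= count <= 255, "one 8-bit index only covers a short block"
    x = count - 1                      # LDX #COUNT-1
    while x != 0xFF:                   # loop until X wraps below zero
        mem[dst + x] = mem[src + x]    # LDA FROM,X / STA TO,X
        x = (x - 1) & 0xFF             # DEX, wrapping like an 8-bit register

mem = bytearray(0x200)
mem[0x10:0x15] = b"hello"
copy_block(mem, 0x10, 0x100, 5)
print(mem[0x100:0x105])                # bytearray(b'hello')
```

Counting down means the loop needs no separate compare against COUNT, which is the point of starting at COUNT-1.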


It became just another MS-DOS rag. In the early days it covered EVERYTHING, all ISAs, all programming languages (very famous Lisp and Forth and Smalltalk issues, for example).

See: https://news.ycombinator.com/item?id=47829410


Yes, it's fascinating that it went where the market was driven - by the market's hooks in its own advertising pages - and in that capacity, BYTE became a driving force for the early computing revolution not just (but also because of) the readership, but also their advertisers - inasmuch as that revolution could be defined as "wide adoption of new and emerging technologies to form a standard" - BYTE started as a user manual and ended its existence as a catalog of things with user manuals.

Probably, if one thinks about it, one of the more eloquent data structures in human existence, BYTE.


> CP/M did have a sort of revival in that it became common in low-end machines like the C-128

Amstrad were good late 80s CP/M machines. We got both those and C128 in New Zealand.

> the one RISC/CISC CPU thing that really mattered!

Not only indirect addressing, but also multiple memory operands in the same instruction — more than one VM page, really, though a single unaligned operand crossing a page boundary is also bad. Many machines trap on that case to this day and let software emulate it.

Not being able to easily tell how long an instruction is (and thus where the next one starts) is also bad, but can be overcome at some cost in the front end, and the back end is unaffected. Unlike x86 and VAX the 68k does actually tell you everything you need in the first 16 bits, but yes the complex addressing modes of the 020/030 were what killed it.


Yeah, I've got a complete collection from October 1978 to December 1991 (by which time they had become just another x86 PC rag). I bought a fair few individual copies myself from 78 or 79 until the late 80s, but the bulk of my collection I got for free from an elderly engineer in I think the late 2000s.

Here's a tweet I made packing them up when I was moving overseas in April 2015:

https://x.com/BruceHoult/status/586675607087419394/photo/1

I also have a 1984 Encyclopædia Britannica, all 30 volumes.

Will anyone want them when I can't house them?


Nobody wants those things - but some may need them.

What do I mean? If you posted them for free you might get a taker, but probably not.

But it's possible you might identify the right child at the right time who could appreciate them.

1991 Byte might be too old, but a 1984 Britannica has something even an offline copy of Wikipedia doesn't.

