Node Streams: How do they work? (maxogden.com)
72 points by bpierre on April 17, 2012 | 15 comments


Streams are nice, but in my experience there aren't many modules that let you use them. Most networking protocol modules will only let you specify a hostname and port to connect to, not pass in a stream to use. That's fine for the common case, but maybe I want to do something fun like send data encrypted with SSL, encoded as base64, stuffed into a DNS query, then tunnelled over SOCKS via a Unix pipe (and receive the reply via a completely different stack). It wouldn't be hard to chain streams together to achieve that, but then I'd have to rewrite modules to actually use it.

Even Node's own http module only does standard TCP. If it supported arbitrary streams, it'd be much more flexible, and we wouldn't need a separate module to support https.

It's the same with the server part of these modules. They normally only let you specify a TCP port to listen on. Ideally there should be some sort of "stream server" interface that automatically sets up the stream chain and emits a connection event. The module could then listen for that event and use the corresponding stream for transferring the data. That way the module would be completely independent of the underlying protocol.
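
To make the idea concrete, here is a minimal sketch of a protocol module that accepts streams instead of a hostname and port. The `createLineClient` name and its line-based protocol are hypothetical, invented for illustration; the only assumptions about the transport are that `input` emits "data" events and `output` has a `write` method. Text is assumed to be ASCII for brevity.

```javascript
const { PassThrough } = require('stream');

// Hypothetical line-based protocol client. It takes a stream to read from
// and a stream to write to, so the caller decides what the transport is
// (and the reply may arrive over a completely different stack).
function createLineClient(input, output) {
  let pending = '';
  const handlers = [];

  input.on('data', (chunk) => {
    pending += chunk.toString();
    const lines = pending.split('\n');
    pending = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      handlers.forEach((fn) => fn(line));
    }
  });

  return {
    send(line) { output.write(line + '\n'); },
    onLine(fn) { handlers.push(fn); },
  };
}

// Usage: a PassThrough stands in for a real transport and loops data back.
const loopback = new PassThrough();
const client = createLineClient(loopback, loopback);
client.onLine((line) => console.log('received:', line));
client.send('hello');
```

Because the client never creates its own socket, swapping the loopback for a TLS socket, a Unix pipe, or a chain of piped streams should not require touching the protocol code.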


LuaSocket's ltn12 module might interest you:

http://w3.impa.br/~diego/software/luasocket/ltn12.html


Reminds me much of channels in Go: http://golang.org/doc/effective_go.html#channels

Are Node streams pure, generic JavaScript? Do they rely much on Node internals or outside libs?

I ask because I experimented with "porting" Go channels to generic JavaScript in http://www.johntantalo.com/blog/go-flavored-javascript/


From what I remember when I looked at Node's Stream source, I believe they're pure JS. They depend on the EventEmitter and they basically emit a "data" event with a chunk of data as many times as needed.
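
As a rough illustration of that description, here is a bare-bones readable "stream" built on nothing but EventEmitter. The `arrayStream` helper is hypothetical and is not how Node's own Stream classes are implemented; it only shows the "data"/"end" event pattern.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: turns an array of chunks into an old-style readable
// stream, i.e. an EventEmitter that emits "data" per chunk and then "end".
function arrayStream(chunks) {
  const stream = new EventEmitter();
  stream.readable = true;
  // Defer emission so the caller has a chance to attach listeners first.
  process.nextTick(() => {
    for (const chunk of chunks) {
      stream.emit('data', chunk);
    }
    stream.emit('end');
  });
  return stream;
}

const letters = arrayStream(['a', 'b', 'c']);
letters.on('data', (chunk) => console.log('chunk:', chunk));
letters.on('end', () => console.log('done'));
```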

My main question: since a stream is an instance of EventEmitter, and EventEmitter uses a synchronous loop to "emit" events, wouldn't stream-intensive code be slowed down under lots of requests by the synchronous nature of EventEmitter?
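
A small sketch of the behaviour being asked about: `emit` calls each listener synchronously, in registration order, before it returns. This only demonstrates the ordering; whether it hurts throughput depends on how much work each listener does per chunk.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const order = [];

emitter.on('data', () => order.push('listener'));

order.push('before emit');
emitter.emit('data'); // the listener runs here, inside emit()
order.push('after emit');

console.log(order); // [ 'before emit', 'listener', 'after emit' ]
```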


> Reminds me much of channels in Go: http://golang.org/doc/effective_go.html#channels

Or producers and consumers in Twisted: http://twistedmatrix.com/documents/current/core/howto/produc...


This API appears to match the Stream API nearly perfectly.

Max, you should add a reference to Twisted Producers and Consumers API in the article as an example of a similar interface.


They are pure JS, though streams in Node emit Buffer objects, which can be mostly emulated using HTML5 Int32Arrays.


Had to look that up:

> A Buffer is similar to an array of integers but corresponds to a raw memory allocation outside the V8 heap. http://nodejs.org/docs/v0.3.1/api/buffers.html

Are streams limited to binary buffers, or may they also emit JavaScript objects or primitives? Binary may be great for low-level I/O streams, but I'd expect more nuanced events for high-level streams.


You can emit anything, but it just so happens that 99% of the streams used in Node core emit Buffers for performance reasons (Buffers are allocated outside of V8's heap). An example of emitting objects is in here https://github.com/maxogden/dominode/blob/master/example.htm...
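
For a sketch of a stream that emits plain objects, the example below uses the `objectMode` option of `stream.Readable`, which exists in current Node.js releases but postdates this thread. The `makeEventStream` helper and the event shapes are made up for illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: a readable stream whose chunks are JavaScript
// objects rather than Buffers.
function makeEventStream(items) {
  const stream = new Readable({ objectMode: true, read() {} });
  for (const item of items) {
    stream.push(item);
  }
  stream.push(null); // signal end of stream
  return stream;
}

const uiEvents = makeEventStream([
  { type: 'click', x: 10, y: 20 },
  { type: 'keypress', key: 'a' },
]);

uiEvents.on('data', (event) => console.log(event.type, event));
```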


Node.js Buffers are actually built on top of V8's Uint8Array. Using 32-bit integers to represent 8-bit values seems quite wasteful.


Source please? The docs say a Buffer is "outside the V8 heap".


Yes, the chunk of memory is managed by node, but each Buffer instance is linked with a slice of the heap through `v8::Object->SetIndexedPropertiesToExternalArrayData()`, the same way a JS `Uint8Array` would.

Here's the relevant link: https://github.com/joyent/node/blob/master/src/node_buffer.c....
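
In current Node.js releases (which postdate this thread) the relationship is directly visible from JavaScript, since `Buffer` is a subclass of `Uint8Array`. A quick check, as a sketch rather than a statement about the 2012 internals linked above:

```javascript
const buf = Buffer.from('hi');

console.log(buf instanceof Uint8Array); // true
console.log(buf.BYTES_PER_ELEMENT);     // 1 (one byte per element)
console.log(buf[0]);                    // 104, the byte for "h"

// A plain Uint8Array view over the same memory. byteOffset matters because
// small Buffers may be carved out of a larger shared allocation.
const view = new Uint8Array(buf.buffer, buf.byteOffset, buf.length);
view[0] = 72; // the byte for "H"

console.log(buf.toString()); // Hi
```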


Using Node has helped me think more in terms of streams because the APIs afford such thinking. PHP has streams but I never thought to use them, and I haven't really seen it done.

Node Streams really help performance in my application, where I am proxying API calls from the browser to the Dropbox API. It's much faster to stream the response from Dropbox through to the client than to request, wait for the entire response, then send.
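
The difference between the two approaches can be sketched as follows. The `relayBuffered` and `relayStreamed` helpers are hypothetical, and PassThrough streams stand in for the real upstream response and client connection; this is not the commenter's actual proxy code.

```javascript
const { PassThrough } = require('stream');

// Buffered: nothing reaches the client until the upstream has ended.
function relayBuffered(upstream, client) {
  const chunks = [];
  upstream.on('data', (chunk) => chunks.push(chunk));
  upstream.on('end', () => client.end(Buffer.concat(chunks)));
}

// Streamed: each chunk is forwarded to the client as it arrives.
function relayStreamed(upstream, client) {
  upstream.pipe(client);
}

// Usage with stand-in streams.
const upstream = new PassThrough();
const client = new PassThrough();
relayStreamed(upstream, client);
client.on('data', (chunk) => console.log('client got', chunk.length, 'bytes'));
upstream.write('first part');
upstream.end('second part');
```

With `pipe`, the client can start receiving data while the upstream is still sending, and memory use stays bounded by the stream buffers rather than the full response size.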


I've had trouble wrapping my head around how to use one stream to filter data. I'd like to be able to write to a stream and read from it to get the filtered data, but the API only seems to lend itself to using two streams (think stdin and stdout).
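
One way this can look in later Node.js releases is `stream.Transform`, a single object that is written to on one side and read from on the other. It was added after this thread, so treat the following as a sketch of the idea rather than something available at the time; the `createUppercaseFilter` name is made up.

```javascript
const { Transform } = require('stream');

// Hypothetical filter: whatever is written in comes out upper-cased.
function createUppercaseFilter() {
  return new Transform({
    transform(chunk, encoding, callback) {
      // Pass the filtered chunk to the readable side of the same stream.
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

const filter = createUppercaseFilter();
filter.on('data', (chunk) => console.log(chunk.toString()));
filter.write('hello ');
filter.end('streams');
```

The same object also slots into a pipeline, for example `process.stdin.pipe(createUppercaseFilter()).pipe(process.stdout)`.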


magnets.



