Colt McNealy’s Post

Founder of LittleHorse

<rant subject = #streaming and #queues>

Streams have stronger ordering guarantees than message queues. Generally this means that every consumer reads the messages within a partition in the same order, which is highly useful when you need an ordered record of events that happened. As a consequence, streaming systems don't have individual message acknowledgement, as it would violate ordering guarantees; a consumer typically tracks a single offset per partition instead. Message queues have per-message acknowledgement, which means they have less strict ordering guarantees.

One hybrid approach which is quite cool is the Key_Shared subscription type in Apache Pulsar, and also Confluent's Parallel Consumer for #apachekafka. Both guarantee order for all messages with the same key while allowing you to acknowledge messages individually.

The tradeoff with individual message acks is persistence (disk or memory): you need to keep track of which messages have been acknowledged and which ones are still in flight. Pulsar uses BookKeeper to persist subscription state. Parallel Consumer encodes it in the user-defined metadata field of the offsets it commits to the consumer offsets topic.

There's an in-flight KIP for "Queues for Kafka" (KIP-932), but if I were the Grumpy Maintainer of Kafka, I would say that it's unnecessary, given that it's already possible to implement queue semantics on top of Kafka on the client side (see: Parallel Consumer). Why add code bloat on the broker side when the features are already accessible?

</rant>
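
To make the bookkeeping cost concrete, here is a toy in-memory sketch of per-message acks with per-key ordering over a single partition. It is not how Pulsar or Parallel Consumer are actually implemented; the class `KeyOrderedAckTracker` and all of its method names are hypothetical, made up for illustration. The point is the state you are forced to carry: a committed offset, plus the set of offsets acked "ahead" of it, plus whatever is still in flight.

```python
from collections import defaultdict, deque


class KeyOrderedAckTracker:
    """Toy model (hypothetical, illustration only) of individual acks
    with per-key ordering over one ordered partition."""

    def __init__(self):
        self.next_offset = 0               # offset given to the next appended record
        self.committed = 0                 # every offset below this has been acked
        self.acked_ahead = set()           # acked offsets at or above `committed`
        self.in_flight = {}                # offset -> key, handed out but not acked
        self.pending = defaultdict(deque)  # key -> waiting offsets, in log order

    def append(self, key):
        """Append a record with the given key and return its offset."""
        offset = self.next_offset
        self.next_offset += 1
        self.pending[key].append(offset)
        return offset

    def poll(self):
        """Hand out offsets that may be processed now: at most one per key,
        and none for a key that still has a message in flight."""
        busy_keys = set(self.in_flight.values())
        ready = []
        for key, offsets in self.pending.items():
            if offsets and key not in busy_keys:
                offset = offsets.popleft()
                self.in_flight[offset] = key
                ready.append(offset)
        return sorted(ready)

    def ack(self, offset):
        """Acknowledge one message and advance the committed offset
        across any contiguous run of acked offsets."""
        del self.in_flight[offset]
        self.acked_ahead.add(offset)
        while self.committed in self.acked_ahead:
            self.acked_ahead.remove(self.committed)
            self.committed += 1


tracker = KeyOrderedAckTracker()
for key in ["a", "b", "a", "c"]:       # offsets 0, 1, 2, 3
    tracker.append(key)

first_batch = tracker.poll()           # offset 2 is held back behind offset 0 (same key)
tracker.ack(3)                         # out-of-order ack: committed cannot move yet
tracker.ack(1)
committed_before = tracker.committed
tracker.ack(0)                         # now 0 and 1 are contiguous, so committed advances
second_batch = tracker.poll()          # key "a" is free again
```

A plain stream consumer only needs `committed`. Everything else in that class (`acked_ahead`, `in_flight`) is the extra state that has to be persisted somewhere if acks are to survive a restart.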
