Patrick, Brian, sorry ‘bout that! I messed around with a bunch of different tuning parameters while testing, but accidentally committed with these params. Neither prevents this kind of data loss (the queue is still routable on those nodes, so mandatory doesn’t help AFAICT, and more mirrors just means more copies to get out of sync), but I’ve updated the source to bring back these parameter choices. Good catch, thank you! :)
Brian, on
If you’re going for maximum reliability wouldn’t you want to set the :mandatory flag on publishes?
Marc: Your version is just fine for this stage of the book. More idiomatically, you can use (mapcat …) in place of (apply concat (map …)). Probably the cleanest solution given the tools we’ve explored so far in the book would be
which currently reads:
"{\"ha-mode\": \"exactly\", \"ha-params\": 2
This mirrors a queue between the node that it’s created on and one other.
Lee Spector, on
I really like this series of posts and I’m considering recommending it to my students in a Clojure-based AI course in the fall. Is there a table of contents page that I’ve missed? Easy to create externally, of course, but it’d be nice to have one on the site if there isn’t one already. FWIW I created something that uses an approach that’s similar in some ways (clojinc, at github/lspector), but whereas I provided almost no text at all between code snippets, you’ve provided lots and lots of helpful text. Thanks!
computer repair, on
Dude! this is terrific read im interested highly on the development of yourkit
brad quan, on
That’s fantastic….. thanks for your sharing.
Samuel Trille, on
Dude,
You can have this result with Erlang from the very beginning in 2 hours ;__;
Good luck with your quest to find truth.
Bill Smith, on
After you finish beating up on RabbitMQ, I would enjoy hearing more about what you discovered with YourKit.
Daniel, on
I love this series. Thank you. A few suggestions for making it better:
There are two typos in the narrative that get corrected in the review, but which made following along a bit more difficult:
apoapsis uses (map altitude trajectory) in your narrative and (map altitude (flight trajectory)) in review, and my tests were running forever
ascent and circularization have typos in the narrative which prevent the circularization stage from ever being triggered - 3000 instead of 300 and 4000 instead of 400
I would also recommend promising the orientation function later while you’re talking about unit-vector and engine-force under Orbital insertion. While I was working on this I thought you had just forgotten to include it, and tried to work it out myself.
Thanks again!
Samridh, on
Well, now I feel stupid :) Just read the whole thing. Ignore my comment.
Samridh, on
Hi! Really love the series of posts.
Just a tiny nit:
(repeat n x): repeats x, n times
so pow should be defined as:
(defn pow
“Raises base to the given power. For instance, (pow 3 2) returns three squared, or nine.”
base power))
Right?
Tim B, on
Hi, great post.
Do you know what happens if the net connection between two nodes goes down in a three node ensemble? Say n1 and n2 can communicate, and n2 and n3 can communicate, but n1 cannot reach n3 - that’s not a partition or node failure. Can n1 sync with n3 via n2 forwarding messages?
Aphyr, on
How do the “ifs” in your serializability example force an order?
They force an order because there is only one sequence in which those operations could take place. If the precondition is not met, the operation can’t happen. I chose those specific operations because each operation requires exactly one unique state as a precondition, and each generates one unique state, and those states align in only one way. In general, serializability admits multiple orders, but I chose those to illustrate that serializability may be used to enforce very specific orderings as well.
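A minimal sketch of that argument, assuming three hypothetical guarded operations on a single integer register (the names and values are illustrative stand-ins, not necessarily the ones from the article). It tries all six permutations and keeps the orders in which every operation's precondition holds, on the assumption that every operation was observed to take effect:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical guarded operations on a single integer register x.
class GuardedOrders {
    static final class Op {
        final String name;
        final IntPredicate precondition;
        final IntUnaryOperator effect;

        Op(String name, IntPredicate precondition, IntUnaryOperator effect) {
            this.name = name;
            this.precondition = precondition;
            this.effect = effect;
        }
    }

    // Illustrative operations: each needs one unique state and produces one unique state.
    static List<Op> example() {
        return Arrays.asList(
            new Op("if x=1 then x=2", x -> x == 1, x -> 2),
            new Op("if x=2 then x=3", x -> x == 2, x -> 3),
            new Op("if x=0 then x=1", x -> x == 0, x -> 1));
    }

    // True only if every precondition holds when run in this order, starting from x = 0.
    static boolean allTakeEffect(List<Op> order) {
        int x = 0;
        for (Op op : order) {
            if (!op.precondition.test(x)) {
                return false;
            }
            x = op.effect.applyAsInt(x);
        }
        return true;
    }

    // Tries every permutation and keeps the orders in which all operations take effect.
    static List<List<String>> validOrders(List<Op> ops) {
        List<List<String>> found = new ArrayList<>();
        permute(new ArrayList<>(ops), 0, found);
        return found;
    }

    private static void permute(List<Op> ops, int start, List<List<String>> found) {
        if (start == ops.size()) {
            if (allTakeEffect(ops)) {
                List<String> names = new ArrayList<>();
                for (Op op : ops) {
                    names.add(op.name);
                }
                found.add(names);
            }
            return;
        }
        for (int i = start; i < ops.size(); i++) {
            Collections.swap(ops, start, i);
            permute(ops, start + 1, found);
            Collections.swap(ops, start, i);
        }
    }

    public static void main(String[] args) {
        // Prints the orders in which all three operations can take effect.
        System.out.println(validOrders(example()));
    }
}
```

Operations without such tight preconditions would leave several permutations valid, which is the general case where serializability admits multiple orders.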
Aphyr, on
If you can delay without “timeout”, it seems one hiccup would stall the whole thing.
Typically, one would only perform causally subsequent values once their dependencies were acknowledged as durable by the database. If the dependencies are durable, they’ll eventually (once partitions have resolved) become visible. If the dependencies aren’t durable, you’ve only got two options: produce inconsistent histories or consider the dependent operations uncommitted. The former preserves availability at the cost of consistency; the second preserves consistency but then you don’t see values. Same tradeoff you make in any system, really.
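A rough sketch of the delaying behaviour under discussion, assuming a hypothetical in-memory buffer (the class and method names are made up for illustration and do not come from any real database). An operation becomes visible only once everything it depends on is visible; an operation whose dependency never arrives stays pending, but it does not hold up unrelated operations:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical buffer that hides operations until their causal dependencies are visible.
class CausalBuffer {
    private final Set<String> visible = new LinkedHashSet<>();
    private final Map<String, Set<String>> pending = new HashMap<>();

    // Receives an operation id together with the ids it causally depends on.
    void receive(String id, Set<String> dependencies) {
        pending.put(id, dependencies);
        flush();
    }

    // Keeps promoting pending operations until no more of them have all dependencies visible.
    private void flush() {
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (String id : new ArrayList<>(pending.keySet())) {
                if (visible.containsAll(pending.get(id))) {
                    visible.add(id);
                    pending.remove(id);
                    progressed = true;
                }
            }
        }
    }

    // Operations in the order they became visible.
    List<String> visible() {
        return new ArrayList<>(visible);
    }

    // Ids of operations still waiting on a dependency.
    Set<String> pendingIds() {
        return Collections.unmodifiableSet(pending.keySet());
    }

    public static void main(String[] args) {
        CausalBuffer buffer = new CausalBuffer();
        buffer.receive("reply", Collections.singleton("post")); // arrives before its dependency
        buffer.receive("other", Collections.<String>emptySet()); // unrelated, not held up
        buffer.receive("post", Collections.<String>emptySet());  // unblocks "reply"
        System.out.println(buffer.visible());
    }
}
```

If "post" were lost for good, "reply" would remain in the pending set indefinitely, which is the uncommitted-versus-inconsistent tradeoff described above.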
Aphyr, on
PRAM consistency requires that all nodes observe all writes from a given node in the same order (Lipton, Sandberg 1988). While what you describe as causal consistency is some consistency mode (though classically this is not causal consistency), it cannot in fact be stronger than PRAM consistency as your figure suggests.
Great post! Could you please link to the literature for people who are interested in learning more ?
Michael Yeaney, on
Question regarding causal relationships:
“…the database can delay making operations visible until it has all the operation’s dependencies”. This seems to assume all operations eventually show up; which, if a client has no retry logic, may not happen (network timeout, crash, etc). In this case, is the database doomed forever? If you can delay without “timeout”, it seems one hiccup would stall the whole thing.
I posted the same comment to the recent ACM Queue article on causal enforcement. Something seems fishy to me when using delays, etc. Where am I going wrong?
mp, on
How do the “ifs” in your serializability example force an order? Are they supposed to mean something other than ‘if condition then do something, else do nothing’? It is not really explicit from the text.
theja, on
Would love to go live there someday :D
Aurojit Panda, on
Essentially, what you describe as implicit causal relationships is required by PRAM consistency. If causal consistency is stronger than PRAM consistency, PRAM consistency should be implementable using any causally consistent system, requiring that causally consistent ordering correctly order writes from a single node as observed by any other node.
Aurojit Panda, on
Actually your answer to Colin’s question disagrees with the consistency hierarchy figure in your article: PRAM consistency requires that all nodes observe all writes from a given node in the same order (Lipton, Sandberg 1988). While what you describe as causal consistency is some consistency mode (though classically this is not causal consistency), it cannot in fact be stronger than PRAM consistency as your figure suggests.
Aphyr, on
While the operations from a single process do happen in order on that node, they are not required to happen in that order everywhere. Sequential consistency enforces that constraint; causal consistency does not. Only explicit causal relationships are used for ordering invariants in a causally consistent system, as opposed to the implicit ordering by process identifier guaranteed by a sequentially consistent system.
Colin Scott, on
Quick clarification: are you assuming something less conservative than potential causality (happens-before) while referring to “operations from the same process with independent causal chains”? I’m having a hard time thinking of a case where any two operations from the same machine are independent (concurrent) under potential causality, since one must have happened before the other.
Aphyr, on
However it could be made to work correctly only if you enforced a sane quorum definition, such as mainly with “min-slaves-to-write 2”.
I have serious doubts about Redis Sentinel independent of the quorum definition.
flo, on
Hey,
your posts are my first dive into clojure and you’re doing a really good job at explaining. Keep up the great work!!!
flo
tobi, on
This is quite funny: I just re-read this article, not remembering I had read it before. And again, I had to delete the animated pictures. Not remembering I had done exactly that before. And now I see my old comment. Till next time, tobi!
Tom, on
This is brilliantly written. Thank you very much for your effort in producing it.
Fran Burstall, on
Great tutorial but a mistake in the delay/deref example: the string that is printed should match the string “computing a really big number!” in the definition of x.
Russell Brown, on
Hey, very very late to respond to Aphyr’s comment of May 2013 (what’s a year between friends?) I see what you’re getting at, but to correctly mutate CRDT state at the client so that it is idempotent, you must have the last thing you wrote: so either you need RYOW for a fetch-merge-mutate cycle, or some durability at the client. I like this post a lot, by the way, in case you couldn’t tell (reading it again in 2014 :D)
Rahul Saini, on
Applications using MongoDB must be very careful when updating. An update caused data loss; the scenario here is an UPDATE of a document.
I used MongoDB 2.4 for this and ran the example given at http://www.javabeat.net/developing-a-simple-todo-application-using-javafx-java-and-mongodb-part-2/
It is a simple MongoDB Java driver app, but on UPDATE of a document it (MongoDB, or maybe it’s the Java driver) silently updates your document with an empty document!
(Below is an excerpt of comments I left at the above URL)
UPDATE Method causes DATA LOSS !
in the ToDoAppDAO Class, the method
public static void setTodoAsCompleted(ToDoApp todoRef) throws UnknownHostException{ .. .. } does not work as expected.
It searches the ToDo by id and REPLACES IT with A NEW document !!!! which is not what the update should be doing !
So, say for example this document:
{ "_id" : ObjectId("52a0007de57fc7118ca58228"), "task" : "Finish MongoDB Chapter 4", "taskNote" : "Document Oriented Data", "completed" : false, "added" : ISODate("2013-12-05T04:26:37.552Z") }
gets updated to this:
{ "_id" : ObjectId("52a0007de57fc7118ca58228"), "completed" : true }
The method should update the document using the $set operator as follows:
BasicDBObject updateValue = new BasicDBObject();
updateValue.append("$set", new BasicDBObject().append("completed", true));
BasicDBObject searchQuery = new BasicDBObject().append("_id", new ObjectId(todoRef.getId()));
collection.update(searchQuery, updateValue);
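The difference between the two behaviours can be sketched without a database. The following is a plain-Java simulation using maps, not the MongoDB driver; the class and method names are hypothetical and only mimic the semantics described in this comment (an update document with no operators replaces everything except _id, while $set changes only the named fields):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Simulation only: models a stored document as a map to contrast the two update styles.
class UpdateSemantics {
    // Whole-document replacement: only _id survives, plus whatever the update document holds.
    static Map<String, Object> replace(Map<String, Object> stored, Map<String, ?> update) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("_id", stored.get("_id"));
        result.putAll(update);
        return result;
    }

    // $set-style update: existing fields are kept and only the named fields are overwritten.
    static Map<String, Object> applySet(Map<String, Object> stored, Map<String, ?> fields) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        result.putAll(fields);
        return result;
    }

    // The example document from the comment, with values simplified to strings and booleans.
    static Map<String, Object> sampleTodo() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", "52a0007de57fc7118ca58228");
        doc.put("task", "Finish MongoDB Chapter 4");
        doc.put("taskNote", "Document Oriented Data");
        doc.put("completed", false);
        doc.put("added", "2013-12-05T04:26:37.552Z");
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Boolean> completed = Collections.singletonMap("completed", true);
        System.out.println(replace(sampleTodo(), completed));  // task, taskNote, added are gone
        System.out.println(applySet(sampleTodo(), completed)); // all five fields remain
    }
}
```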
Aphyr, on
Just a question, do you have anything against records and protocols?
Protocols are strictly for polymorphism, and since we don’t have different types of records here, there’s no need for a protocol. We’ll be addressing polymorphism in a later chapter, though. :)
Records have two advantages over maps: first, lookup for the basis fields in a record is an order of magnitude faster than a hashmap, and second, records can specify implementations of polymorphic functions (namely, via protocols and interfaces). They don’t provide type safety for their fields (you can happily assign a Cat to a Dog field) and don’t constrain their fields (you can assoc arbitrary keys into a record, just like a map), so they don’t really provide the sort of type safety you’d be looking for in a typed Object or algebraic datatype.
Because Records print differently and have a few subtle API differences, and because I’m trying to give learners a chance to settle in to the core data abstractions before moving up to polymorphism and types, this chapter sticks to maps. :)
Patrick, Brian, sorry ‘bout that! I messed around with a bunch of different tuning parameters while testing, but accidentally committed with these params. Neither prevents this kind of data loss (the queue is still routable on those nodes, so mandatory doesn’t help AFAICT, and more mirrors just means more copies to get out of sync), but I’ve updated the source to bring back these parameter choices. Good catch, thank you! :)
If you’re going for maximum reliability wouldn’t you want to set the :mandatory flag on publishes?
https://github.com/aphyr/jepsen/blob/master/rabbitmq/src/jepsen/system/rabbitmq.clj#L153
Marc: Your version is just fine for this stage of the book. More idiomatically, you can use (mapcat …) in place of (apply concat (map …)). Probably the cleanest solution given the tools we’ve explored so far in the book would be
(defn filter [f coll] (reduce (fn [results x] (if (f x) (conj results x) results)) [] coll))
But if I were writing this in idiomatic Clojure, I’d use a recursive lazy-seq, which we haven’t covered yet. ;-)
“We’ll be using durable, triple-mirrored writes”
It looks like you only have mirroring configured to two nodes:
https://github.com/aphyr/jepsen/blob/master/rabbitmq/src/jepsen/system/rabbitmq.clj#L82
which currently reads: "{\"ha-mode\": \"exactly\", \"ha-params\": 2
This mirrors a queue between the node that it’s created on and one other.
See http://www.ics.forth.gr/tech-reports/2013/2013.TR439_Survey_on_Consistency_Conditions.pdf for a more detailed explanation of why causal consistency is stronger than PRAM. Specifically, causally related operations may be handed off through intermediate nodes; PRAM does not enforce causal consistency’s transitivity between operations.