Do you know of any surveys conducted by the most prominent pollsters on the number of households with guns in them? Of course, criminals, and people afraid of having their household identified as one associated with gun ownership, would not respond. But it still might provide interesting data.
I read that 40% of all American households had guns in them. I wonder where that stat came from?
Surprise123, on
I’m sure not a statistician, but reviewing your graphs makes me want to become one.
Of course, it would be great if all the stats represented here were more accurate, but you work with what you’ve got, and vote for politicians who are interested in acquiring accurate data.
Other factors for study (if stats were available):
Homicide rates per 100K versus % of the population that has been historically enslaved or seriously oppressed (land taken away, insecure property rights, extra-judicial killings, etc.) by the current polity within the past, say, 100, 300, and 500 years.
Homicide rates per 100K versus % of the male population between the ages of 13 and 25 without engaged, law-abiding father figures in their lives (no engaged working fathers, no engaged coaches, no engaged clergy, no engaged mentors, no nothing).
Homicide rates per 100K versus % of the population that has access to television, movies, and media that depict other people in their society with far more resources, material goods, and opportunities (not just wealth inequality, but widespread perception of wealth inequality).
edc, on
I’m assuming your gun ownership data is the 2005 data available on Wikipedia. Is the Gini data from the IMF or the World Bank, also from 2005?
Incidentally, the wiki data does have min, max, and average. No substitute for a five-number summary, of course, but still.
Aphyr, on
Aaron, you’re looking for the “Crime in the United States” reports–the index is on that UCR page. I didn’t link the individual reports because a.) there are a dozen of them, and b.) I had to trawl through old PDFs to get them. The FBI’s published data only goes back to 1992, sadly, but I did submit a request for their full crime statistics, including scanned PDFs back to 1930. That arrived this week; when I get a chance I’ll dig into it and make an expanded chart.
AMB, on
I don’t see the firearm murders per year by weapon type at the UCR link you provided. Can you provide a direct link to the numbers or instructions on where to find them on the UCR website?
Also, why did you choose 1992 (only two years before the AWB) as your start date?
Aphyr, on
“…I’m surprised by the fact that nobody remarks on the place of the US here : it appears to be in between developed countries and developing ones. For the superpower whose GDP per capita is among the highest, whose military is the strongest on earth, isn’t that a bit disquieting?”
Yeah, I’m unsettled too. I think most people should be. Income inequality (by most metrics I’m aware of) has been rising for the past fifty years or so.
There’s an argument that increasingly industrialized markets will tend to have higher levels of income inequality as smaller pools of more skilled individuals take on the workloads that used to occupy many. Since technology amplifies the abilities of individuals, natural variations in those abilities–due to hard work, family background, dumb luck, and so on–are amplified as well, resulting in more and more disparate incomes.
I’m not sure I buy this argument–my naive, unconsidered objection might start by noting that many heavily industrialized societies like Japan and Germany have much lower income and wealth inequalities than ours.
Regardless of whether there’s a causal link between advancing technology and income inequality, the combination of the two suggests an obvious policy. We could decide that as our collective powers increase, we can afford as a society to provide increasing levels of basic needs to every person: free housing, transport, food, healthcare, and education, financed by the generosity of the rich and by a large-scale welfare state. Even highly progressive tax curves retain some financial incentive for work: look at the US from 1914 to 1984.
Aphyr, on
“I’m curious about inequality as a predictor. GINI isn’t great. The folks at the Equality Trust in the UK might have an interesting approach to analyzing these data.”
Quite right. Both income and wealth distributions are measured differently in every country, and even if they were measured consistently, the Lorenz curve in both cases isn’t preserved by the Gini function: quite different distributions can collapse to the same coefficient. We should also note that hypothetical societies with completely even income distribution can show unequal wealth distribution due to savings. At best the Gini coefficient is only a rough approximation of relative access to capital, goods, and services. We could also be measuring a systematic bias: “countries with income reporting schemes which bias towards equality because of social or policy factors also tend to have lower rates of gun crime,” that sort of thing.
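To make that lossiness concrete, here’s a toy Gini computation–my own sketch in Clojure, using the mean-absolute-difference form, not anything from the post–showing two quite different distributions that collapse to the same coefficient:

;; Toy Gini coefficient: G = (sum_i sum_j |x_i - x_j|) / (2 * n^2 * mean).
;; Illustration only; real estimates need survey weighting and care.
(defn gini [xs]
  (let [n    (count xs)
        mean (/ (reduce + xs) n)
        ;; mean absolute difference over all ordered pairs
        mad  (/ (reduce + (for [a xs, b xs] (Math/abs (double (- a b)))))
                (* n n))]
    (/ mad (* 2.0 mean))))

(gini [0 1 1]) ;=> 0.333... (one person has nothing)
(gini [1 1 4]) ;=> 0.333... (one person has quadruple)
;; Same coefficient, different Lorenz curves: the mapping is lossy.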
jim, on
Sorry, you’re way out of my league when it comes to stats, but I would love to see you take these same graphs and apply them to non-firearm crimes. Do rape and assault rates vary with gun ownership overall? How about burglary? Does homicide in general fall when there are fewer guns? It seems the overall question we would all like answered is “do guns prevent or lower physical crime?”
For instance, am I more likely to be the victim of a crime in the UK or Australia, where there are very strict gun control laws?
I would really appreciate any kind of answer.
Regards, Jim
Chaddaï, on
This is really interesting, and those two graphs are indeed much closer to a “simple” correlation… but I’m surprised that nobody remarks on the place of the US here: it appears to be in between developed countries and developing ones. For the superpower whose GDP per capita is among the highest and whose military is the strongest on earth, isn’t that a bit disquieting?
Not that I’m casting stones; my country isn’t much better and has been sliding toward greater inequality along with the rest of the Western world for several decades now. But still, is nobody shocked by these graphs?
Jesse, on
It also at least makes an attempt to analyze data and numbers more deeply than most large media/blog opinion pieces, which have some presumably smart people doing analysis at the level of cavemen (“3 is greater than 2,” without ever mentioning a standard deviation).
Although the gun count for countries probably has issues in itself (I imagine countries with stricter gun regulation don’t have as accurate a count, because they likely have a larger black market).
Jesse, on
This is a much better and more interesting analysis than the one that inspired it, which really is just various ways of rehashing the large number of guns in the US.
Daniel Jordan, on
I’m curious about inequality as a predictor. GINI isn’t great. The folks at the Equality Trust in the UK might have an interesting approach to analyzing these data.
RoMa, on
Here are some figures for Norway:
The National Weapons Registry shows that there are 31 registered firearms per 1,000 persons in Norway, where the total population is 5,038,100.
There’s a total of 1,233,510 firearms in the Weapons Registry (this number doesn’t include military firearms, or privately owned shotguns bought before 1990), owned by a total of 485,170 persons, 438,000 of whom are registered as hunters. Of those, fewer than 200,000 pay the mandatory annual fee for active hunters, which means that most weapons are not being used at all.
Between 1991 and 2010, out of a total of 723 murders, 171 (23.7%) were committed with firearms.
There are no estimates for the number of illegal firearms in Norway, but it is believed that the total number of guns is significantly higher than the Weapons Registry indicates.
Aphyr, on
You’re right, David–there are several variables at play here. Obviously “guns per capita” is a poor predictor of “deaths caused by a firearm”–the graph tells us that much. “Percentage of the population which owns guns” might be a better predictor, but since the fraction of the population which kills people with firearms is so small compared to the larger pool of gun owners, and criminals are less likely to report their gun use in surveys, that metric could have its difficulties too.
I should also be clear that this comparison tells us only about the relationship between gun prevalence and gun deaths across countries, not the policymaking question “How would the US behave if we changed the prevalence of guns?”
There’s a very simple and politically untenable way to answer that question: pick some observables like “amount of gun-related violence” and “amount of non-gun-related violence”, sign and implement a strict gun-control law–say, outlawing possession of handguns nationally–and watch what happens for five years. The more assertive the law is, the easier it’ll be to determine its effects.
Local bans won’t be sufficient: plenty of municipalities have imposed strict firearms-ownership restrictions, but without controlled borders these laws have little impact on firearm density. Surveys of felons who used firearms in an offense indicate that cost is an important factor in weapon selection. A national buyback might raise the price of grey-market weapons used in crime, and cause substitution to other, less deadly weapons or crimes.
This is certainly consistent with the NFA study, and perhaps with Britain’s firearms ban, but there’s a distinct lack of solid data. Since the US has so many guns, a unique mix of cultural factors, and is so large, I suspect a controlled experiment could be the best way to determine the effects of policy measures.
David MacIver, on
One confounding factor is that guns per capita isn’t a terribly good measure of prevalence because it over-counts. Gun owners will often have more than one gun, and the number of guns per gun owner is almost certainly strongly positively correlated with the wealth of the country, which is in turn negatively correlated with violent crime.
I couldn’t seem to find any decent stats for percentage of the population who own guns for different countries unfortunately.
isomorphismes, on
A nice ante up from @mbusigin’s original. Maybe I will play too if I finish what I’m working on in time.
Nihil, on
A great read. Thank you very much!
Claire, on
This doesn’t even tell you what a cape is!!!!!!!
Derek, on
Awesome work, Kyle! I’ll throw this on a server by the weekend!
Aphyr, on
Yeah, folks do tend to best understand the languages they speak. I find Chinese pretty hard to follow, too!
That said, this is nontrivial Clojure code. It uses several parts of the stdlib, including atoms and futures, has no comments at all, uses the same functions to do multiple things at once, and the variable names are extremely short. Why? Because this gist is a transcript of my REPL session–I wrote this code live and never intended it to be read. Production Clojure tends to look a little different. ;-)
Maybe some narration will help.
https://gist.github.com/3200862
The code lives in a namespace, which I’ve called “messagepassing.core”. It’s going to use the standard Java library’s LinkedTransferQueue. We also set a constant ‘m’: the number of messages to exchange.
The main function can do two things: test the use of queues, or the use of atoms. It just looks at the first command-line argument to decide what to do.
queue-test defines a function “bounce”, which takes a number from one queue, increments it, and puts it back into the other queue. It keeps doing this until the number is bigger than m. In the node.js code this function is also called bounce, but it’s not recursive because Node is calling the function as a part of its IO loop. In fact you can’t express any other type of flow control in Node.
Then queue-test creates a pair of queues (q1 and q2) and a pair of workers (w1 and w2). The workers are just independent threads (futures) which bounce a number between queues. @w2 expands to (deref w2), and dereferencing a future means “wait for the future’s return value”. So in the main thread, @w2 waits until the test is complete. Then we clean up the other worker thread.
Atom-test uses an atom rather than a queue, so there’s no real Node analogy here. It defines a function called bump, which increments an atomic reference ‘a’ until it reaches m. swap! is a Clojure core function which changes the value of an atomic reference using a function of its current value. Since swap! also returns the value it set, we can use (when) to determine whether the threshold is reached–without race conditions.
The test creates a single atomic reference x to the number one, then spawns a pair of worker threads w1 and w2 which bump x until it reaches the threshold m. We resolve both futures, which blocks until each has determined that the atom’s value is now equal to m.
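If it helps to see it all in one place, here’s a condensed, commented reconstruction of that session–a sketch based on the narration above rather than the gist verbatim, so names and details may differ:

(ns messagepassing.core
  (:import (java.util.concurrent LinkedTransferQueue)))

;; Number of messages to exchange (the value here is arbitrary).
(def m 10000000)

(defn queue-test []
  (let [q1 (LinkedTransferQueue.)
        q2 (LinkedTransferQueue.)
        ;; Take a number from `in`, increment it, put it on `out`;
        ;; repeat until the number is bigger than m.
        bounce (fn [^LinkedTransferQueue in ^LinkedTransferQueue out]
                 (loop []
                   (let [x (.take in)]
                     (when (<= x m)
                       (.put out (inc x))
                       (recur)))))
        w1 (future (bounce q1 q2))
        w2 (future (bounce q2 q1))]
    (.put q1 0)          ; kick off the exchange
    @w2                  ; block until w2 sees a number past m
    (future-cancel w1))) ; clean up the other worker

(defn atom-test []
  (let [x (atom 1)
        ;; Increment the shared atom until it reaches m; swap! returns
        ;; the value it set, so the check is race-free.
        bump (fn [a] (loop [] (when (< (swap! a inc) m) (recur))))
        w1   (future (bump x))
        w2   (future (bump x))]
    @w1
    @w2))

(defn -main [& args]
  (case (first args)
    "queue" (time (queue-test))
    "atom"  (time (atom-test))))

With a standard Leiningen project in place, something like “lein run queue” or “lein run atom” would exercise each path.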
Hope that helps. :)
Aphyr, on
Great point, Joe. Storing timestamped transaction logs does not provide CP semantics; I was thinking purely of the transactional serializability. Readers can still miss an arbitrary amount of history, so the operational semantics are not CP.
One of the reasons I wrote this post, in addition to your talk, is that I’ve tried to build systems based on last-write-wins, and advocated using external coordinators for stronger consistency to others on the mailing list! I hadn’t considered that since Riak doesn’t yet provide negative guarantees around failures, you can’t control when your writes succeed and destroy other information! I’m excited to see your work on 2PC develop; even with the possibility of an unattainable quorum, there are some practical partition cases where this won’t arise and serializable semantics can continue operating. And ultimately, safety is preserved–the most important property for CP structures. :)
jtuple, on
Good post. Especially great at showing the perils of using client-side locking for building stronger semantics into Riak. More on that later.
On your conclusion, queuing timestamped operations which are then processed in order does allow for essentially arbitrary types of data transformation that appear serializably consistent. The main problem is that you never know when you have read the final value. All partitions must heal and all data propagate before you’ll have an entire history to process. It’s still eventual consistency at the end of the day. This approach is also susceptible to write failures. As you know, Riak only provides positive guarantees. If a write succeeds, you know it was written. If a write fails, the write may or may not actually have been written.
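To sketch what I mean for other readers, here’s a toy model in Clojure–the shapes here are hypothetical, not Riak’s actual data model:

;; Toy model of the timestamped-oplog approach. Each sibling carries a
;; log of [timestamp op] entries; a read merges every sibling's log,
;; sorts by timestamp, and replays the ops against an initial value.
(defn resolve-value [siblings]
  (->> siblings
       (mapcat :ops)     ; union the op logs
       (sort-by first)   ; order by timestamp
       (reduce (fn [state [_ op]] (op state)) 0)))

;; Two writers on opposite sides of a partition, each with ops the
;; other never saw:
(def sibling-a {:ops [[1 inc] [3 (partial * 2)]]})
(def sibling-b {:ops [[2 inc]]})

(resolve-value [sibling-a sibling-b])
;=> 4, i.e. ((0 + 1) + 1) * 2
;; But if sibling-b hasn't propagated yet, the same read yields 2: you
;; can never tell whether you've seen the final value.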
In any case, a fine approximate solution that works well in many cases. However, I still believe stronger semantics remain necessary. The work I presented at RICON aims at providing immediate serializable consistency along with sane semantics around write failures. Yes, requests will be refused during partitions when necessary – it’s an A vs C tradeoff after all.
Back to locking and quorum issues, I’m glad I’m not the only one trying to educate the world on these issues. I’ve seen way too many attempts at building strong consistency on top of Riak using some sort of client-side coordinator. It’s just not possible.
As you mentioned, my work builds locks directly into Riak, making things aware of cluster topology. That’s step one. The second issue is sloppy quorums. My work uses strict primary-only quorums. This isn’t PW/PR; these are new semantics that had to be added to Riak for this to work. By mapping primary replicas to our consensus ensemble, and requiring two-phase commit to change the membership (e.g. when cluster ownership changes), we can handle writes during a partition–provided a quorum of non-partitioned primaries exists, of course. Yes, we will refuse reads/writes if a quorum can’t be reached; oh well, we’re CP. At least the ring is broken into multiple ensembles, and therefore failures only affect a subset of the overall keyspace.
Again, great post. The more people thinking/writing about this stuff the better.
Pooria, on
That’s a very good analysis. Thank you.
But there’s a problem: the Clojure code looks “literally” like gibberish to me. Now, I happen to know Node, but even if I didn’t, I would very easily understand what the first gist is doing. I read the first two chapters of some Clojure book (or maybe an online tutorial) very recently (and created like 10 simple programs with it), and yet the second gist makes absolutely no sense to me. I can understand fragments of it, but…
I’m not saying the syntax is everything. But the barrier to entry to Node-land is much lower than Erlang’s or Clojure’s.
And good luck with those “health issues”.
derna, on
Were Clojure to cease, I would immediately endeavor to replicate its strengths in another language. That’s a primary language to me. ;-)
Toby DiPasquale, on
Good breakdown, but it assumes a zero-sum game, which the market is not (wealth can be created and destroyed). It’s pretty well known in finance circles that HFTs employ obfuscation methods in order to prevent other traders from arb'ing on their strategies. ETFs don’t do this, which is why they are getting killed of late. Part of HFT is figuring out which arbs to engage in; the other part is figuring out what arbs the other guy is engaging in and beating him to the punch.
Aphyr, on
I’ve been kinda nuked by health issues this year and only got halfway to finishing aphyr.com. :-( Lots of folks have asked for a feed; it’s a high priority for me.
Ashwin Jayaprakash, on
No RSS feed?
Aphyr, on
Bascule, I think this is the best possible outcome. It’ll be really nice for folks who have invested time in learning Node to be able to take advantage of parallelism and keep using the same patterns and libraries. I imagine the work that’s being done on isolates will come into play as well.
Tony Arcieri, on
Oracle will be shipping an InvokeDynamic-based implementation of JavaScript with JDK8. It shouldn’t be too hard to reimplement the Node I/O and HTTP APIs using Java NIO and run Node.js applications on the JVM once that’s complete.
Aphyr, on
TJ, it’s great that folks are working on high-performance socket message passing–but I suspect you may not have read the post.
Regarding node isolates, I think that it would probably be less difficult to port node.js to a better VM than to modify the VM to support real or faked parallelism. Node does a few things right: smart packaging, a rich library of single-purpose tools at hand, a simple-to-understand concurrency model, etc. It also does some important things spectacularly wrong: being single-threaded, inheriting Javascript’s laughable type system, offering syntactically heavy closures as the end-all-be-all of nonlinear programming, etc. Since JS and Dart are single-threaded by specification, I have serious doubts about the semantics of parallel extensions to the core language.
Re: TLB shootdowns, thanks! This is honestly out of my depth, and I’ve read conflicting explanations as to what exactly triggers them. If I understand correctly, modern Linux kernels only need to reach an interprocessor barrier for TLB flushes when memory maps change. I assume that v8 makes relatively few allocations, and watching /proc/interrupts appears to confirm this: there are basically no TLB shootdowns for any of my message-passing examples above.
I agree that stateless servers are boring, but this is a.) good, because boring things are easy to understand, write, and analyze; and b.) workable, because many things can be accomplished with stateless components. Because shared state is complicated and expensive, I try to isolate the stateful parts of my systems from the stateless ones. The stateless parts may well be good candidates for Node.js. That said, you’re correct to observe that stateless servers invariably offload their state-sharing to another system: a database, queue, internal services, third-party APIs, and so on.
TJ, on
You might want to check this out too: https://github.com/visionmedia/super-sockets
You’ll get better throughput than Node’s built-in JSON stuff.