What is a database transaction?
192 points - today at 12:28 PM
I think it’s more tractable to define this problem space starting from the concept of (strict) serializability, which is really a generalization of the concept of thread safety. Every software engineer has an intuitive understanding of it. Lack of serializability can lead to execution-dependent behavior, which usually results in hard-to-diagnose bugs. Thus, all systems should strive towards serializability, and the database can be a tool in achieving it.
Various non-serializable levels of database transaction isolation are relaxations of the serializability guarantee, where the database no longer enforces the guarantee and it’s up to the database user to ensure it through other means.
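To make the "execution-dependent behavior" concrete, here is a toy sketch in plain Python (no real database involved). Two hypothetical transactions each read a shared value and write back an increment. The `run_schedule` helper and the step tuples are made up for illustration; the point is only that the interleaved schedule produces a result that no serial order could.

```python
# Toy model of two read-modify-write "transactions" on one shared value.
# run_schedule and the (txn, op, amount) step format are invented for this sketch.

def run_schedule(schedule, start=100):
    """Execute (txn, op, amount) steps in order against one shared value."""
    shared = start
    local = {}  # what each transaction saw when it last read
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = shared
        elif op == "write":
            # Writes are based on the transaction's own earlier read.
            shared = local[txn] + amount
    return shared

# A runs to completion, then B: equivalent to a serial execution.
serial = [
    ("A", "read", 0), ("A", "write", 10),
    ("B", "read", 0), ("B", "write", 20),
]

# Both read before either writes: B overwrites A's update (a lost update).
interleaved = [
    ("A", "read", 0), ("B", "read", 0),
    ("A", "write", 10), ("B", "write", 20),
]

serial_result = run_schedule(serial)
interleaved_result = run_schedule(interleaved)
print(serial_result, interleaved_result)  # 130 120
```

Either serial order (A then B, or B then A) ends at 130, so 120 is exactly the kind of outcome a serializable system rules out and a weaker isolation level leaves up to you.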
The isolation phenomena are a useful tool for visualizing various corner cases of non-serializability, but they are not inherently tied to it. It's possible to achieve serializability while observing all of the SQL phenomena. For example, a Kubernetes cluster with carefully-written controllers can be serializable.
> Under the SQL standard, the repeatable read level allows phantom reads, though in Postgres they still aren't possible.
This wording is misleading: it could give the impression that a repeatable read may show different values. Values in previously read rows will be the same, but new rows may appear in the second result set. "New rows" is the important part: no previously read row can be changed or deleted, otherwise those rows would not repeat the second time around.
Using read-committed ofc means having to keep locking details in mind. Like, UNIQUE doesn't just guard against bad data entry, it can also be necessary for avoiding race conditions. But now that I know, I'd rather do that than take the serializable performance hit, and also have to retry xacts and deal with the other caveats at the bottom of https://www.postgresql.org/docs/current/transaction-iso.html
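A rough sketch of what I mean by UNIQUE guarding against races, using Python's built-in `sqlite3` as a stand-in (an assumption for the sake of a runnable example; SQLite's locking is not Postgres's, and the `users` table and `register` helper are made up). The pattern is to skip the SELECT-then-INSERT check and let the constraint decide, since under read committed two sessions can both pass the SELECT before either inserts.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT UNIQUE)")

def register(conn, email):
    """Insert and let the UNIQUE constraint reject duplicates.

    There is deliberately no "SELECT first, then INSERT" check here,
    because that check is the part that races.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The constraint fired: someone already holds this email.
        return False

first = register(conn, "a@example.com")
second = register(conn, "a@example.com")
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(first, second, count)  # True False 1
```

In Postgres the same idea applies at read committed: a concurrent insert of the same key waits for the other transaction to finish and then fails with a unique violation, so the application handles one well-defined error instead of reasoning about interleavings.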
And no, I'd never expect people to know the isolation levels by heart, but if you know there are different ones and that they behave differently, that's pretty good and tells me you are curious about how things work under the hood.
It goes into not only different isolation levels, but also some ambiguity in the traditional ACID definition.
I believe a 2nd edition is imminent.
Am I missing something, or is this statement incomplete? Also, I find the visualization of commit weird: it “points to” the header of the table, but then xmax gets updated “behind the scenes”? Isn't xmax/xmin “the mechanism behind how the database knows what is committed/not committed”? Also, there could be subtransactions, which make this statement even more contradictory?
I enjoyed the visualizations and explanations otherwise, thanks!
* the videos should have "pause" and a "step at a time" control *
Even at "half speed", without a deep knowledge of the context, the videos move way too fast for me to read the syntax that's being invoked and line it up with the data on the left side. I (and I'm definitely not the only one) need to be able to sit on one step and stare at the whole thing without the latent anxiety of the state changing before I've had a chance to grok it.
This has nothing to do with familiarity with the concepts (read my profile). I literally need time to read all the words and connect them together mentally (ooh, just noticed this is pseudo-SQL syntax also, e.g. "select id=4", that probably added some load for me) without worrying they're going to change before I've watched things move.
please add a step-at-a-time button!