Title: Master/Slave Replication
Author/s: Unknown
Language/s: SQL
When I was invited to edit a section of the Responsible Communication Style Guide, one of the terms we added to our “Terms to Avoid” list was "master/slave" to describe primary and secondary nodes. In the volume, we suggest using the alternative “primary/replica”; these are terms I continue to use whenever I come across this replication structure.
The web framework Django and the open-source CMS Drupal formally replaced master/slave with primary/replica back in 2014 (Conversation in the Django project and Drupal release notes). Other open-source projects have since opted for similar alternatives.
The relational database MySQL is among a few open-source projects that have retained “master/slave” terms. In MySQL, master/slave replication refers to configuring a database such that the contents of one database (a “master”) can be automatically copied to others (or “slaves”).
Here is the statement, run on a “slave” database, that tells it which “master” database to replicate from:
mysql> CHANGE MASTER TO
-> MASTER_HOST='master_host_name',
-> MASTER_USER='replication_user_name',
-> MASTER_PASSWORD='replication_password',
-> MASTER_LOG_FILE='recorded_log_file_name',
-> MASTER_LOG_POS=recorded_log_position;
Once configured, one can start the “slave” by running the following:
mysql> START SLAVE;
(Source: "Setting Up Replication Slaves", MySQL Documentation)
From what I've gathered, arguments for and against changing the terminology come down to context: one group argues that since these terms do not refer to the legacy of American slavery nor to an oppressive relationship between humans, these terms should not change (See: “On Redis master-slave terminology” by Redis creator Salvatore Sanfilippo). Another group argues that as our understanding of language evolves, so should the terms we use, especially when alternatives may be both inclusive and more accurate (See: “Let’s Stop Saying Master/Slave” by Mike Roberts).
With Django and Drupal, community-driven conversations helped precipitate the terminology change. I’ve searched for examples of community conversations from MySQL users where this has been debated, but have not found any as of yet.
Questions to consider:
I welcome your thoughts!
Comments
Great questions!
I'm curious about how this relates to the world of noSQL databases and others.
For example, I see that MongoDB deprecated "Master-Slave" by its 3.4 documentation and then removed it completely in its 4.0 release notes -- but (I believe...?) this was primarily framed as moving away from the feature / architecture, and towards peer replica sets, not about changing the vocabulary. CouchDB and PouchDB etc. could say that they did not use "master-slave" then as a question of design -- but they still used (still use?) those terms rather than "primary/replica" to refer to the architecture they don't have: "CouchDB is not master-slave." Some of these also use derived phrases like "multi-master" to refer to implicitly "slave"-less clusters (as opposed to multi-primary, multi-lead, etc.) -- so the word slave is gone, but the concept may remain a part of the form.
It seems like documentation for large centralized projects is one great intervention point, but that things like Wikipedia and StackOverflow are important as well -- we learn to speak about and think about software socially.
APIs play an especially important role here, as the terms get bound into protocol, and those protocols can become infrastructure and tradition.
For example, I'm assuming that MariaDB could not (or would not) rename most of its Master / Slave documentation until MySQL does, as they are attempting to support API compatibility including the same flavor of SQL that MySQL uses -- and that necessitates the same vocabulary / keywords. Of course, projects could extend their APIs with synonyms so that the underlying forms were the same, but those terms were available only as a question of legacy support....
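To make the synonym idea concrete, here is a minimal sketch of a toy command dispatcher in which a legacy spelling and a preferred spelling resolve to the very same handler. Everything in it is an assumption for illustration: the `COMMANDS` table, the `run` helper, and the choice of "START REPLICA" as the preferred spelling are hypothetical, and this is not how MySQL or MariaDB actually parse statements.

```python
def start_replication() -> str:
    """Stand-in for whatever the server does when replication starts."""
    return "replication started"


# Hypothetical command table: the preferred spelling owns the handler.
COMMANDS = {"START REPLICA": start_replication}

# The legacy spelling is kept only as an alias for the same function object,
# so the underlying behavior cannot drift between the two vocabularies.
COMMANDS["START SLAVE"] = COMMANDS["START REPLICA"]


def run(statement: str) -> str:
    """Normalize case, whitespace, and a trailing semicolon, then dispatch."""
    key = " ".join(statement.strip().rstrip(";").upper().split())
    return COMMANDS[key]()


print(run("start slave;"))    # legacy spelling
print(run("START REPLICA;"))  # preferred spelling, same handler
```

Because the alias is a second key for the same function, documentation could lead with the preferred term while old scripts keep working.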
it seems like we are getting back to the notion that communication is collaborative. a text both reflects and creates its cultural context. language we are already familiar with facilitates fast understanding, but only because it bootstraps us with context -- and that context comes with its own baggage. couchdb relies on a problematic metaphor to ground their framework in a model that the developer already understands; instead of building a picture of a completely new model, they import an existing one and modify it. updating the language around a metaphor seems like a similar process to me: take a model you already understand and modify it to arrive at a new model.
m/s model => multi-m model
m/s model => primary/replica model
from this perspective, i understand why they fail to use updated terms. it adds another round of model modification, which anyone writing documentation would shun.
m/s model => primary/replica model => multi-primary model
their imagined user can only bootstrap the context on the very far left of these chains.
"it would complicate the documentation" does not to me feel like a justification for relying on problematic language. i would much rather have complicated documentation that hurts no one than documentation that is straightforward and problematic. i see this as an example of how the culture around [code, but only as an example of an insular group] self-reinforces, -amplifies, and -normalizes problematic attitudes that already exist.
this discussion also reminds me of glimpse, which is a fork of the gnu image manipulation program that was started to gain traction for updating a problematic acronym: https://glimpse-editor.org/about/#what-is-wrong-with-the-gimp-name
some questions:
1/ how can we leverage group dynamics to self-critique and break down problematic structures instead of reinforcing them?
2/ to what extent is problematic language symptomatic of a problematic approach to problem solving? e.g. how would/does our approach to networking change if we abandoned the concept of node hierarchy in addition to its language, as in the couchdb multi-primary model?
I'm very interested in this discussion. Working in audio these terms are still prevalent (talking about master clock, etc.).
I'm curious as to the origin of "Master/Slave" in this technical context. Were these terms (in regard to code) coined in the US? Are they originally from English? Is there a way to trace their history? (How) Does the history matter to CCS?
@SimonHutchinson,
A quick search in Google Books turns up the following from 1951:
That's in Electrical Engineering, Volume 70, part 1, page 424.
Contemporaneous to that are a few other technical uses:
From Diesel-electric Locomotive Handbook: Mechanical equipment, p100, 1951.
Here's the Google n-Gram take on those two:
I'm confident the technological use preceded computers, but without a deeper dive that's all I'll commit to.
@barry.rountree Thanks!
Poking around, in this 2007 article, the earliest mention the author found was 1904. I haven't dug into this whole article yet, but hope to do so later in the week.
I bring up the history and origins of this usage in regard to the original question of how to best encourage change. Perhaps a way to refute "That's the way we've always done it" is to demonstrate that this usage was a decision that was made at some point, and we have the capacity to make a new decision to change.
From Ron Eglash's article above, p 368:
@SimonHutchinson: nice catch!
Thank you for bringing up Glimpse! I'd been considering proposing the code diff of the first release as a discussion thread, but with over 1800 files edited, it's difficult to figure out a good way to explore what all was changed.
This also highlighted for me an adjacent concern I had regarding discussing Glimpse: shouldn't Disability Studies (which I have no background in) be centered in the discussion thread, were CCSWG20 to have one?
Regarding initiating terminology changes, it seems important to balance a desire to be proactively considerate, i.e. not to place the entire burden of highlighting the need for or initiating change on members of the vulnerable groups that are often targeted with these kinds of microaggressions; yet it is certainly important that the experiences of those targeted are centered. @smorillo it seems like this is a balance you may have navigated when producing "The Responsible Communication Style Guide"?
At the other end of the spectrum could be a bot that blindly finds code on GitHub with the words to be updated and proposes pull requests replacing them with the same capitalization. There is a likelihood of false positives, such as code that contains descriptions of historical events, or even the code for the bot itself. On the other hand, it is non-binding and the cost is the time a human must spend to discard the request. I've seen similar bots for spelling and grammar making unsolicited requests in the past, both correctly and incorrectly. Right or wrong, in part I would frame using a bot to update terminology as an attempt to avoid the emotional discomfort of conflict. However, it also has an element of normalizing, or standardizing, language like a spell check or a style manual.
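For what it's worth, the "same capitalization" part of such a bot is simple to sketch. The following is only an illustration under stated assumptions: the `REPLACEMENTS` map and the function names are hypothetical, and nothing here talks to GitHub or opens a pull request. It just rewrites a string while copying the case pattern of each match.

```python
import re

# Hypothetical term map (lowercase keys); a real bot would need a reviewed list.
REPLACEMENTS = {"master": "primary", "slave": "replica"}

# Match whole words only, in any capitalization.
_PATTERN = re.compile(r"\b(" + "|".join(REPLACEMENTS) + r")\b", re.IGNORECASE)


def match_case(original: str, replacement: str) -> str:
    """Copy the coarse case pattern of `original` onto `replacement`."""
    if original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement.lower()


def replace_terms(text: str) -> str:
    """Blindly replace every mapped term, preserving its capitalization."""
    return _PATTERN.sub(
        lambda m: match_case(m.group(1), REPLACEMENTS[m.group(1).lower()]),
        text,
    )


print(replace_terms("START SLAVE; the Master node and its slave"))
print(replace_terms("She holds a Master of Arts."))  # a false positive
```

The second call shows the false-positive problem: a purely textual match has no idea that "Master of Arts" is a different sense of the word, which is exactly why a human would still have to review each proposed change.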
Or @smorillo were you getting at a question more of 'who has a right' to make these kinds of requests to a project? Particularly in an Open Source context it would seem hard to justify a claim that one must be a direct user or a developer who has previously contributed. But speaking more generally, there is often a question of stake when change requests are made.
The developers of Glimpse do make an effort to point out that the fork was a result of an unwillingness to change on the part of the original project, and they uphold both end users and "educational institutions and public libraries" as the parties for whom this change is being made (rather than personal taste). This likely tweaks a deeper discomfort in the original community regarding why their free software hasn't been more successful at displacing the expensive Adobe Photoshop. They also go to great lengths to point out that there will continue to be code sharing between the projects (in both directions) and that a portion of their donations get re-contributed to the original project.
While this certainly does a lot to downplay conflict, it also means that the original project will be receiving a quarterly check reminding them what value the community places on this issue (among others; there are some broader user-friendliness goals to Glimpse).
So bringing it back to 'who has a right', there is always the open source right-to-fork (start another project sharing the same code), and often forks are a place where changes get floated that eventually get upstreamed. This can sometimes address reluctance on the part of a project because the changes are already available to be applied. @jeremydouglass's mention of MariaDB emphasizes the extra burden that the fork project can bear in the meantime, however.
I think this attention to language @smorillo is so important. It also reminds me of structuring social interaction and protocol in general. The Recurse Center social rules come to mind: https://www.recurse.com/social-rules
The best case would be to have a standards committee that maintained a public repo on these terminologies that would allow people to add issues, PRs, etc. I think this would be amazing if it were adopted.
Hegel put an influential (see e.g. Recognition in SEP) spin on the dialectic of master and slave in the early 19th century, but more topically for CCSWG the philosopher of technology Mark Coeckelbergh picked it up in the paper The tragedy of the master: automation, vulnerability, and distance published in Ethics and Information Technology (2015), and discussed it in episode 21 of Philosophical Disquisitions.
Master is of course a lot of things (e.g. an achievement in formal education (MSc, MA)), and mastery is not an issue – slavery is. It's in the dialectic.
Anyway, the way @smorillo shares the question of if, when and how to bring up these terminological changes is super interesting, and it is absolutely right that more accurate vocabulary is available. I think in this database case "master" and "slave" are even misleading. I can imagine how "master" might have arrived from e.g. audio like @SimonHutchinson says, but in the problematic notion of master/slave it is not so that the slave is like the master – exactly the opposite is the case.