Private Name Mangling
Python's approach to name collisions, private attributes, etc.

Introduction
I came across a post on the DE subreddit about private name mangling in Python; the link is here.
Python doesn't support strictly private variables the way languages such as C++ do. However, Python has a way to approximate private variables through name mangling. It all comes down to the number of leading underscores: _ or __.
| attribute | description | key in `__dict__` |
| --- | --- | --- |
| self.public | Anyone can access it. | public |
| self._private | A single _ is a friendly hint to other programmers that this variable is private, but the rule isn't enforced. You can still access it. | _private |
| self.__protected | A double __ is "very private": name mangling happens. Python stores any attribute written as __variable inside a class body in the form _ClassName__variable, i.e. with the prefix _ClassName. | _ClassName__protected |
The names public, private and protected are an analogy loosely borrowed from C++.
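The rule in the table has a few edge cases that are easy to miss. The sketch below uses a hypothetical class named _Hidden (it is not part of the original post) to show three of them: leading underscores in the class name are dropped before the prefix is built, names that also end in two underscores are left alone, and a single trailing underscore does not prevent mangling.

```python
class _Hidden:  # hypothetical class, used only for illustration
    def __init__(self) -> None:
        self.__a = 1    # mangled to _Hidden__a (the class name's leading _ is stripped first)
        self.__b__ = 2  # ends with two underscores: not mangled
        self.__c_ = 3   # one trailing underscore: still mangled to _Hidden__c_


keys = sorted(vars(_Hidden()))
print(keys)  # ['_Hidden__a', '_Hidden__c_', '__b__']
```

This is why special methods such as __init__ keep their names even though they start with two underscores.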
Example 1: public, private and protected
Feel free to play with the following code snippet to explore the difference between public, _private and __protected variables and how Python handles each of them.
class Test:
    def __init__(self) -> None:
        # use of some c++ lingo
        self.public = 11
        self._private = 23
        self.__protected = 42

    def __private_method(self):
        print("private method")


if __name__ == "__main__":
    t = Test()
    print(t.__dict__)
    print(f"_private variable: {t._private}")
    print(f"__protected variable: {t._Test__protected}")
    t._Test__private_method()
The output of the script is
{'public': 11, '_private': 23, '_Test__protected': 42}
_private variable: 23
__protected variable: 42
private method
You can see there is no __protected attribute in the instance's namespace. However, you can still access it as t._Test__protected.
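As a quick check of that claim, here is a minimal sketch that reuses a trimmed-down Test class. The lookup_failed flag is a name made up for this example; it records whether the unmangled lookup from outside the class raises AttributeError.

```python
class Test:
    def __init__(self) -> None:
        self.__protected = 42  # stored on the instance as _Test__protected


t = Test()

# At module level the name is not mangled, so this lookup asks for an
# attribute literally called "__protected", which does not exist.
try:
    t.__protected
    lookup_failed = False
except AttributeError:
    lookup_failed = True

print(lookup_failed)                   # True
print(hasattr(t, "_Test__protected"))  # True
print(t._Test__protected)              # 42
```

So the attribute is hidden from casual access but not protected from anyone who knows the mangled name.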
Motivation
The reason behind this feature is to avoid name collisions under inheritance. As a project gets larger, or when you work on other people's codebases, name collisions between parent and child classes become hard to avoid.
Example 2: inspect the __dict__
Let's use a class named Class to illustrate the concept.
class Class:
    def __init__(self) -> None:
        self.__student_count = 0

    def get_student_count(self):
        return self.__student_count

    def set_student_count(self, count):
        self.__student_count = count


if __name__ == "__main__":
    c = Class()
    # snapshot 1
    print(c.__dict__)
    # snapshot 2
    c.set_student_count(23)
    print(c.__dict__)
    # snapshot 3
    c.__student_count = 10
    print(c.get_student_count())
    print(c.__dict__)
The output is
{'_Class__student_count': 0}
{'_Class__student_count': 23}
23
{'_Class__student_count': 23, '__student_count': 10}
When you set __student_count through the setter method, it works as expected. However, when you assign c.__student_count directly from outside the class, the getter still returns 23. That's because mangling applies only to names written inside a class body: there, Python rewrites __student_count to _Class__student_count. The assignment made outside the class is not mangled, so it creates a separate attribute literally named __student_count. Both keys show up in the __dict__ of the instance.
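The same behaviour can be seen from another angle: mangling rewrites identifiers in the source code of the class body, not strings. The sketch below adds a hypothetical helper, peek_by_string, which is not in the original example, to show that a string passed to getattr is looked up exactly as written.

```python
class Class:
    def __init__(self) -> None:
        self.__student_count = 0  # compiled as self._Class__student_count

    def get_student_count(self):
        return self.__student_count  # mangled, because it sits inside the class body

    def peek_by_string(self):
        # Hypothetical helper: string arguments are never mangled,
        # even when the call is written inside the class body.
        return getattr(self, "__student_count", None)


c = Class()
c.__student_count = 10         # module level: no mangling, creates a new attribute
print(c.get_student_count())   # 0
print(c.peek_by_string())      # 10

c._Class__student_count = 5    # writing the mangled name reaches the real one
print(c.get_student_count())   # 5
print(sorted(c.__dict__))      # ['_Class__student_count', '__student_count']
```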
Example 3: class and math class
Let's say we have two classes:
- Class: a class with a private variable __count, written by author 1, foo. He wants to keep track of the number of students in the class.
- MathClass: a class that inherits from Class and also has a private variable __count, written by author 2, bar. He wants to keep track of the number of textbooks used for the math class.
Author 1 left the job, and author 2 inherits from Class to build his own class, MathClass. He wants to use __count as well, but to count something completely different, and he writes his own getter and setter for __count too. A code snippet is shown below.
class Class:
    def __init__(self) -> None:
        # author 1: foo
        # number of students in the class
        self.__count = 0

    def get_count(self):
        return self.__count

    def set_count(self, count):
        self.__count = count


class MathClass(Class):
    def __init__(self) -> None:
        super().__init__()
        # author 2: bar
        # number of textbook used for the math class
        self.__count = 10

    def get_count(self):
        return self.__count

    def set_count(self, count):
        self.__count = count


if __name__ == "__main__":
    c = Class()
    math_c = MathClass()
    print(c.__dict__)
    print(math_c.__dict__)
    math_c.set_count(20)
    print(c.__dict__)
    print(math_c.__dict__)
Here is the output. It works fine.
{'_Class__count': 0}
{'_Class__count': 0, '_MathClass__count': 10}
{'_Class__count': 0}
{'_Class__count': 0, '_MathClass__count': 20}
But imagine Python had no name mangling and did not treat __count as _<ClassName>__count. The output would be
{'__count': 0}
{'__count': 10}
{'__count': 0}
{'__count': 20}
You would accidentally overwrite the parent class's __count, even though it means something different there. This is why Python has this feature.
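You don't have to imagine the collision: a single leading underscore is not mangled, so it reproduces the problem today. The sketch below is a variation on the example above, assuming both authors had picked _count instead of __count. The getter name get_student_count is made up here to make the parent's intent explicit.

```python
class Class:
    def __init__(self) -> None:
        # author 1: number of students in the class
        self._count = 0

    def get_student_count(self):
        return self._count


class MathClass(Class):
    def __init__(self) -> None:
        super().__init__()
        # author 2: number of textbooks; this silently replaces the parent's value
        self._count = 10


math_c = MathClass()
print(math_c.__dict__)             # {'_count': 10}
print(math_c.get_student_count())  # 10, although no student was ever counted
```

There is only one _count slot on the instance, so the parent's getter now reports the textbook count.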
Summary
In this section, we touched upon
- private, public and protected variables in Python
- name mangling in Python, with examples
Private name mangling is a somewhat debatable feature. It's Python's attempt to offer something like the private members of other languages without actually enforcing access control. It's not a perfect solution, but it is a solution: a trade-off between flexibility and safety.
This feature acts as a fail-safe against programmer mistakes. It is also an argument for better naming. Suppose we rename
- self.__count in class Class to self.student_count
- self.__count in class MathClass to self.textbook_count
That is clearer and less confusing, and it pays to put more thought into naming things. It echoes the saying that there are two hard things in computer science: cache invalidation and naming things.
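For completeness, here is a rough sketch of what Example 3 might look like with descriptive names instead of two unrelated __count attributes. The getters and setters are left out to keep it short.

```python
class Class:
    def __init__(self) -> None:
        # number of students in the class
        self.student_count = 0


class MathClass(Class):
    def __init__(self) -> None:
        super().__init__()
        # number of textbooks used for the math class
        self.textbook_count = 10


math_c = MathClass()
print(math_c.__dict__)  # {'student_count': 0, 'textbook_count': 10}
```

With distinct names there is nothing to collide, so no mangling is needed to keep the two counters apart.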



