I just finished an online algorithm test for an interview. One of the problems asked me to traverse a tree structure and count how many paths satisfy a condition. It can be solved easily with a traditional DFS. This is what I answered (pseudocode):

counter = 0

def dfs(root, current_state):
    # We should not use a global variable in production;
    # I only use it here for the interview
    global counter
    if current_state satisfies condition:
        counter += 1
    for node in root.children:
        new_state = current_state + something
        dfs(node, new_state)

def main(root):
    dfs(root, init_state)
    return counter

The code works well and passes all the test cases. Sadly, the reviewers from the dev team didn’t like my solution because I used a global variable. I know that global variables are considered evil and I never use them in practice (that is why I left some comments in the code). I mostly agree with them that global and eval() can easily lead to bad code. There are multiple ways to avoid the global variable. The first is to use a class or a nested function (pseudocode):

class Solution:
    def main(self, root):
        self.counter = 0
        self.dfs(root, init_state)
        return self.counter

    def dfs(self, root, current_state):
        if current_state satisfies condition:
            self.counter += 1
        for node in root.children:
            new_state = current_state + something
            self.dfs(node, new_state)
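
The nested-function variant could look like the runnable sketch below. The Node class, the target parameter, and the condition itself (the values on the path from the root sum to target) are all made up for illustration; the real interview condition is not shown here.

```python
class Node:
    """Hypothetical tree node, used only for this illustration."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def count_paths(root, target):
    counter = 0  # lives inside count_paths, not at module level

    def dfs(node, current_sum):
        nonlocal counter  # rebind the enclosing function's variable
        current_sum += node.value
        if current_sum == target:  # assumed condition
            counter += 1
        for child in node.children:
            dfs(child, current_sum)

    dfs(root, 0)
    return counter


tree = Node(1, [Node(2, [Node(3)]), Node(5), Node(2, [Node(-2)])])
print(count_paths(tree, 6))  # 2: the paths 1-2-3 and 1-5
```

Here nonlocal plays the role that global played in my first answer, but the counter disappears as soon as count_paths returns.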

Another solution is to replace the recursion with an explicit stack and a loop.

def main(root):
    counter = 0
    # each stack entry is a (node, state) tuple
    stack = [(root, init_state)]
    while stack:
        node, current_state = stack.pop()
        if current_state satisfies condition:
            counter += 1
        for child in node.children:
            new_state = current_state + something
            stack.append((child, new_state))
    return counter
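
One practical reason to prefer the explicit stack is that CPython caps the recursion depth (sys.getrecursionlimit(), usually 1000), so a recursive DFS can fail on a very deep tree. The sketch below is a runnable version of the loop above. As before, the Node class and the "path sum equals target" condition are assumptions for illustration only.

```python
class Node:
    """Hypothetical tree node, used only for this illustration."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def count_paths_iterative(root, target):
    counter = 0
    # Each entry is (node, sum of the values from the root down to node).
    stack = [(root, root.value)]
    while stack:
        node, current_sum = stack.pop()
        if current_sum == target:  # assumed condition
            counter += 1
        for child in node.children:
            stack.append((child, current_sum + child.value))
    return counter


# A degenerate tree: a single chain of 10,000 nodes, each with value 1.
root = Node(1)
tail = root
for _ in range(9_999):
    child = Node(1)
    tail.children.append(child)
    tail = child

print(count_paths_iterative(root, 5_000))  # 1: only one prefix sums to 5,000
```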

So we know how to avoid it. But should we never use global or eval() in production? I don’t think that is true. If we want to know how to avoid them, we should also learn how to use them correctly. Understand your enemy, and understand him well, right?

What is a global variable in Python

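If you come from C/C++, a global variable in Python is closer to a file-scope static variable than to an extern variable. Its name lives in the namespace of a single .py file (we call it a module in Python); other modules do not see it as a bare name, although they can still reach it as an attribute after importing the module. Here is an example of how to use it.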
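
To make the "one global namespace per module" point concrete, here is a small sketch. It fakes a second file with types.ModuleType and exec purely so the example fits in one snippet; the module name settings and the names limit and get_limit are made up.

```python
import types

# Stand-in for a separate file called settings.py (hypothetical).
settings = types.ModuleType("settings")
source = """
limit = 100

def get_limit():
    return limit  # looked up in the globals of settings, not of the caller
"""
exec(source, settings.__dict__)

limit = 1  # an unrelated global that belongs to *this* module

print(settings.get_limit())  # 100
settings.limit = 5           # reachable from outside as a module attribute
print(settings.get_limit())  # 5
print(limit)                 # 1, the two globals never collide
```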

foo = 100
def bar():
    # foo here is the global variable,
    # even though we didn't use the global keyword
    if foo == 100:
        # the function will return True
        return True
    return False

foo = 100

def bar():
    # foo becomes a local variable
    foo = 10
    if foo == 100:
        return True
    # the function will return False
    return False

foo = 100
def bar():
    # we can't rebind a global variable inside a function without the 'global' keyword,
    # so this raises UnboundLocalError
    foo += 10
    if foo == 100:
        return True
    return False

foo = 100
def bar():
    # Now it works
    global foo
    foo += 10
    if foo == 100:
        return True
    # the function will return False
    return False

As you can see, the global keyword itself helps developers see explicitly which variables are global.

When not to use it

Global variables can mess up thread safety and unit tests, because many functions can modify them. There are a lot of articles that explain why not to use global variables, like Global Variables Are Bad and Why are global variables evil?.

When to use it

I dug into the CPython source code to see how globals are used correctly. There are some situations where we can use them.

1. Cache

In zipfile.py/_ZipDecrypter, the global variable _crctable caches the CRC table, so the time-consuming _gen_crc function does not have to be called repeatedly.

_crctable = None
def _gen_crc(crc):
    for j in range(8):
        if crc & 1:
            crc = (crc >> 1) ^ 0xEDB88320
        else:
            crc >>= 1
    return crc

def _ZipDecrypter(pwd):
    key0 = 305419896
    key1 = 591751049
    key2 = 878082192

    global _crctable
    if _crctable is None:
        _crctable = list(map(_gen_crc, range(256)))
    crctable = _crctable

2. Global State

In shutil.py, the module-level flag _USE_CP_SENDFILE remembers whether the current platform supports sendfile().

_USE_CP_SENDFILE = hasattr(os, "sendfile") and sys.platform.startswith("linux")

def _fastcopy_sendfile(fsrc, fdst):
    # Without this declaration, the assignment below would only create a
    # local variable and the module-level flag would never change.
    global _USE_CP_SENDFILE
    ...
    if err.errno == errno.ENOTSOCK:
        # sendfile() on this platform (probably Linux < 2.6.33)
        # does not support copies between regular files (only
        # sockets).
        _USE_CP_SENDFILE = False
        raise _GiveupOnFastCopy(err)

3. Global data

Globals also work for sharing data that should be known by default. In mimetypes.py/_default_mime_types, suffix_map stores the default suffix mappings for MIME types.

def _default_mime_types():
    global suffix_map, _suffix_map_default
    global encodings_map, _encodings_map_default
    global types_map, _types_map_default
    global common_types, _common_types_default

    suffix_map = _suffix_map_default = {
        '.svgz': '.svg.gz',
        '.tgz': '.tar.gz',
        '.taz': '.tar.gz',
        '.tz': '.tar.gz',
        '.tbz2': '.tar.bz2',
        '.txz': '.tar.xz',
        }

4. Initialization

Just to be clear, most of the time only one function should update the value of a global variable, and other functions should only read it. In gettext/textdomain, only the textdomain function modifies _current_domain.

def textdomain(domain=None):
    global _current_domain
    if domain is not None:
        _current_domain = domain
    return _current_domain
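
From the caller's side, the same function is both the reader and the single writer. A short usage sketch ("myapp" is a made-up domain name):

```python
import gettext

previous = gettext.textdomain()  # no argument: only reads the global
gettext.textdomain("myapp")      # with an argument: updates the global
current = gettext.textdomain()
print(current)                   # myapp

gettext.textdomain(previous)     # restore the old value
```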

Summary

I think global variables are not that evil when you understand how to use them correctly. :D