Writing Your Own Debugger In Python

For a while now I have been working on games written in a framework called RenPy, a visual novel game engine. It is a lot of fun to work with, but everything that includes branching story lines, image and character declarations and such is a bug fest waiting to happen.

So, one day I decided to write my own debugging tool, in addition to the one that already comes with the engine, to notify me of common problems. This tool has saved me hours since I built it, and it isn’t all that complicated, because RenPy scripts are just text files that you can parse and loop over. So today, I want to show you how I built this tool and the common errors it checks for.

Parsing the script files in Python

Python is perfect for this task for two reasons:

  • File handling and consumption is just what Python was made for and it takes few lines of code to get the whole text parsing set up.
  • RenPy itself is built on Python and lets you embed Python in its scripts, so the two work hand in hand.

Loading the text files into memory is very simple:

def get_text_from_script_file(script):
    with open(script, encoding="utf8") as f:
        scriptlines = f.readlines()
        scriptlines = clean_up_script_file(scriptlines)
        return scriptlines

All that happens inside clean_up_script_file is some basic text parsing: removing comments (everything from # to the end of the line) and trailing transition statements. Removing all text between double quotes is a further option, because the actual text that characters say does not interest me for debugging purposes and might mess with the RegEx. In the version shown here that step is commented out.

import re

def clean_up_script_file(text):
    result = []
    for line in text:
        line = re.sub(r'#.*?$', '', line)  # strip comments
        line = re.sub(r'\swith (hpunch|vpunch|fade).*?$', '', line)  # strip "with" transition statements
        line = line.rstrip()
        # optional: blank out dialogue text to prevent false positives
        # line = re.sub(r'".*?"$', '""', line)
        result.append(line)
    return result

With these out of the way, we have loaded all lines of text into a list of strings, and we can now start working on the actual debugger portion.
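A game usually consists of more than one script file, so the loader has to run over all of them. Here is a minimal sketch of how that could look, assuming the scripts use the .rpy extension and live somewhere below the game directory. The names find_script_files and load_scripts are made up for this illustration and are not part of my tool:

```python
from pathlib import Path


def find_script_files(game_dir):
    """Return every .rpy file below game_dir, sorted for stable output."""
    return sorted(Path(game_dir).rglob('*.rpy'))


def load_scripts(game_dir, loader):
    """Map each script path to whatever loader(path) returns for it.

    loader is any callable that takes a path and returns a list of lines,
    for example the get_text_from_script_file function shown earlier.
    """
    return {path: loader(path) for path in find_script_files(game_dir)}
```

Calling load_scripts('game', get_text_from_script_file) would then give one cleaned list of lines per file, and the values of that dictionary can be handed to any function that expects a list of script files.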

Loading all declared and referenced variables

The next step is to load all variables, images, audio files and character names into memory, and we need to do this twice. A name can be either defined or referenced, and we need to find all mismatches between the two. Put simply: an image file that exists in my folder but is never used in the game files is as much of an error as an image referenced in the code that does not exist as a file.

So, let’s stick with variable declarations for now because it is raw text parsing and doesn’t require us to access the file system just now.

def get_all_defined_variables(script_files):
    result = []
    for file in script_files:
        for line in file:
            # captures the name between "default" and the first "="
            match = re.match(r'default\s(.*?)=', line)
            if match:
                result.append(match.group(1).strip())
    return result

The regular expression checks each line of text for variable declarations, which look like this in RenPy:

default chapter_3_stole_food_from_kitchen = False

In the same way, we can collect the images a script refers to, which are shown with the scene statement:

# image reference: scene chapter_1_go_back_to_your_room_1_c3

def get_referenced_images_in_label(label_text):
    result = []
    for line in label_text.split('\n'):
        # everything after "scene" is the image name
        match = re.match(r'\s*scene\s(.*?$)', line)
        if match:
            result.append(match.group(1).strip())
    return result

Character names work in a similar way. A character is declared once and then referenced at the start of every dialogue line it speaks:

# character declaration: define man = Character("Man")

def get_referenced_characters_in_label(label_text):
    result = []
    for line in label_text.split('\n'):
        match = re.match(r'\s*?(\w*?)\s"', line)  # todo: check for dialog
        # an empty name is the narrator, which is always "defined"
        if match and match.group(1) != '':
            result.append(match.group(1))
    return result
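Coming back to variables for a moment: once the defined names are known, finding the mismatches mentioned at the start of this section comes down to set arithmetic. The following is only a sketch under a few assumptions of mine: a variable counts as used if its name appears as a whole word on any line that is not a default declaration, and it counts as undefined if it is assigned on a one-line Python statement ($ name = ...) without a matching default. The function name find_variable_mismatches is hypothetical.

```python
import re


def find_variable_mismatches(defined, script_lines):
    """Return (unused, undefined) variable names as sorted lists.

    unused:    declared with `default` but never mentioned on another line.
    undefined: assigned via `$ name = ...` without a matching `default`.
    """
    defined = set(defined)
    mentioned = set()
    assigned = set()
    for line in script_lines:
        stripped = line.strip()
        if stripped.startswith('default '):
            continue  # a declaration does not count as usage
        # collect every identifier-like word on the line
        mentioned.update(re.findall(r'\b[A-Za-z_]\w*\b', stripped))
        # plain assignments only; "==" comparisons are excluded
        assignment = re.match(r'\$\s*([A-Za-z_]\w*)\s*=(?!=)', stripped)
        if assignment:
            assigned.add(assignment.group(1))
    unused = sorted(defined - mentioned)
    undefined = sorted(assigned - defined)
    return unused, undefined
```

Because the word scan is deliberately crude, a variable name that also appears inside dialogue text would count as used. That is one more reason to blank out quoted text during clean-up.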

So now we have all the referenced and defined names. The only thing left is to loop through all the images inside the file system folder and grab the image names:

import os

def get_all_unused_images(path_to_game_directory, referenced_images):
    unused = []
    for image_file in os.listdir(os.path.join(path_to_game_directory, 'images')):
        image_name = re.sub(r'\..*?$', '', image_file)  # drop the file extension
        if image_name not in referenced_images:
            unused.append(image_name)
    return unused

The function above is already part of the actual debugger tool, and it shows how easy these preventable errors are to catch: run the two lists against each other and report whatever does not match.
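The same comparison works in the other direction as well, which catches images that the script refers to but that are missing on disk. A minimal sketch, assuming that an image name is simply the file name without its extension; compare_images is a hypothetical helper, not the function from my tool:

```python
import os


def compare_images(image_dir, referenced_images):
    """Return (unused, missing) image names as sorted lists.

    unused:  files in image_dir that no script refers to.
    missing: names used in the scripts that have no file in image_dir.
    """
    # os.path.splitext only removes the last extension ("a.b.png" -> "a.b")
    on_disk = {os.path.splitext(name)[0] for name in os.listdir(image_dir)}
    referenced = set(referenced_images)
    unused = sorted(on_disk - referenced)
    missing = sorted(referenced - on_disk)
    return unused, missing
```

Using sets instead of lists keeps the lookups fast even with a few thousand renders in the folder, and it removes duplicate references for free.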

More complicated debugging hints

So far, everything was pretty straightforward, but I did not want to stop there, since this basic functionality is for the most part already covered by the engine's built-in linter.

Some other common problems can slow down development as well, and I wanted to make sure that I could catch them too. To keep this post a bit shorter, I’ll restrict myself to listing the test cases here so that you can get an idea of the problems. The actual workflow behind each one is always just "loop this, check that".

  • Images whose resolution differs from the game resolution. RenPy does not auto-scale (on purpose), which can lead to some annoying problems when you render your images in 4k and forget to downsize them (a common workflow during rendering called downsampling for higher pixel density).
  • Images that are referenced inside the code but do not exist in the file system.
  • Images that exist in the file system but are never used in the script (unnecessarily bloated file sizes).
  • The same for audio files.
  • Variables that are defined but not referenced, or vice versa. I also added a check for capitalization issues and common misspellings to avoid further errors.
  • Image files that exist in multiple formats. This should not raise errors, but it could lead to using the wrong image if that is not intended.
  • Jumps (scene switches) that have no associated scenes.
  • Jumps that reference scenes that do not exist.
  • Scenes that are defined but never called (usually a logic bug).
  • Menus (player choices) that only have one possible choice (a common slip as you get into the writing flow and forget to add more choices to your ideal path).
  • Menu choices that exist but are empty (this can be valid control flow, but is usually a bug).
  • Variable, image and scene names that don’t follow my naming scheme of chapter_x_variable_name in all lowercase. I might as well get ahead of annoying typos and bugs, since every single one of those can take five to ten minutes to find, figure out and fix.
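To give one of these checks a concrete shape, here is a sketch of how the single-choice menu test could look. It is not the code from my tool, and it rests on a few assumptions: indentation uses spaces only, a menu starts with a line like menu: and a choice is a quoted string that ends in a colon, optionally with an if condition in front of the colon. The name find_single_choice_menus is made up for this example.

```python
import re

MENU_RE = re.compile(r'^\s*menu\b.*:\s*$')
CHOICE_RE = re.compile(r'^\s*".*"\s*(?:if\s.+)?:\s*$')


def indent_of(line):
    """Number of leading spaces on the line."""
    return len(line) - len(line.lstrip(' '))


def find_single_choice_menus(script_lines):
    """Return 1-based line numbers of menus offering fewer than two choices."""
    flagged = []
    # each entry: [menu_indent, line_number, choice_indent, choice_count]
    open_menus = []

    def close_menus(current_indent):
        # a menu ends once a line is indented no deeper than its "menu:" line
        while open_menus and open_menus[-1][0] >= current_indent:
            menu = open_menus.pop()
            if menu[3] < 2:
                flagged.append(menu[1])

    for number, line in enumerate(script_lines, start=1):
        if not line.strip():
            continue  # blank lines never end a block
        indent = indent_of(line)
        close_menus(indent)
        if open_menus:
            menu = open_menus[-1]
            if menu[2] is None:
                menu[2] = indent  # first line inside the menu sets the choice level
            if indent == menu[2] and CHOICE_RE.match(line):
                menu[3] += 1
        if MENU_RE.match(line):
            open_menus.append([indent, number, None, 0])
    close_menus(0)  # close whatever is still open at the end of the file
    return sorted(flagged)
```

The stack is there so that a menu nested inside a choice is counted on its own and does not add to the count of the menu around it. A caption line inside the menu is ignored because it does not end in a colon.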

Takeaway: writing your own debugger can work wonders in the long run

As you can see, there is a lot that can go wrong during the development of these games, and all of these errors can be caught automatically. This script did not take long to build, and every time I use it I save anywhere from minutes to hours of fixing bugs that are easy to miss.

In the worst case, these bugs only surface the moment a player encounters them, and without this tool there is no real way of knowing they exist.

There are more advanced features that I still want to implement at some point, but for now this script is feature complete for my needs and I want to focus on the actual development of the game.

I hope this was interesting to read and maybe even useful for your own use cases. It may feel daunting to write a complete debugging tool for some obscure scripting language that does not already have one, but as this post shows, it mostly comes down to parsing text.
