How I Have Used Python Professionally

I am a full-time back-end developer, and my bread and butter is usually C# and Oracle – but I still get chances to use Python all the time. While I prefer the structural elements and solid type-enforcement and debugging of C#, Python comes in handy every time I need an automated workflow that may or may not arise again, but where creating a whole solution seems like overkill.

Writing a Python script can take as little as five minutes, and often enough these tasks involve file system access where Python is just lovely and convenient in my opinion. So, let me talk a bit about the use-cases I have found for Python applications in a professional setting.

And another little benefit is that Python plays very nicely with VS Code, unlike C# where you keep missing ReSharper and other niceties of the full-suite Visual Studio quite a bit. Just open up a new document, write a few lines of code, and run it right from the debugger in VS Code to do the things you want.

Parsing file names and extracting IDs

In my previous job, we often had to deal with a large number of exported documents that would end up scattered across folders and network drives. It was confusing, and often enough last-minute changes or simple errors led to the right file ending up in the wrong folder.

Finding even one of these files manually would take five or ten minutes, and Windows search isn’t exactly the poster girl of Microsoft’s OS. So naturally, everyone hated these tasks, enough for me to sit down and write a little Python script. It would take a list of input paths, a destination directory, and a list of IDs that were included in the file names. It would then loop through all the folders, and whenever it found a matching file it would copy it to the destination directory.

This still required a bit of manual setup work, but as you may expect it took minutes instead of hours to find and copy these files.
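A minimal sketch of such a script, not the original one, could look like this. The function name copy_files_by_id and its parameters are made up for illustration, and it assumes an ID counts as found when it appears anywhere in the file name.

```python
import shutil
from pathlib import Path


def copy_files_by_id(input_dirs, ids, destination):
    """Copy every file whose name contains one of the IDs into destination."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in input_dirs:
        # rglob("*") walks the folder and all of its subfolders.
        for path in Path(folder).rglob("*"):
            if path.is_file() and any(file_id in path.name for file_id in ids):
                # Assumption: file names are unique; a duplicate name would
                # overwrite the earlier copy in the destination.
                shutil.copy2(path, destination / path.name)
                copied.append(path)
    return copied
```

Keep the destination outside the input folders, otherwise the copies themselves can show up in the search.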

Parsing PDF files and extracting IDs

Simple things can sometimes become complicated. In the case of the prior example, we ended up with not just one or two files, but tens of thousands from a large-scale export. Then, during manual checks, we found errors in the generated text, caused by faulty input data that we had no way of knowing about before it showed up in the final reports.

So now we had to regenerate all affected files after fixing the bug, but how do you find out which of these files need regenerating? We had the affected customer IDs, but those were not part of the file names this time, because the files were grouped by date and, I believe, postal code.

However, regenerating all files would have taken another week, and we had less than a day to spare. So instead, I wrote a little Python script that went in, opened the actual PDF files, searched the text to see if it included the faulty portion, then copied those files to a folder named "to_regenerate". This process was slooooow compared to simply parsing the file names, but we got together and ran the script on every computer and server we had access to, and managed to parse all those files in time to regenerate the affected ones.
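The idea can be sketched roughly as follows. This is not the original script: the names collect_faulty_pdfs and extract_text_with_pypdf are hypothetical, and it assumes the third-party pypdf package for text extraction, which may not be the library that was used back then. The extractor is passed in as a parameter so that a different PDF library can be swapped in.

```python
import shutil
from pathlib import Path


def extract_text_with_pypdf(pdf_path):
    """Return the text of all pages; needs the third-party pypdf package."""
    from pypdf import PdfReader  # imported lazily so the rest runs without it

    reader = PdfReader(str(pdf_path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def collect_faulty_pdfs(source_dir, needle, target_dir,
                        extract_text=extract_text_with_pypdf):
    """Copy every PDF whose text contains needle into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    flagged = []
    # sorted() builds the full list first, so files copied later are not revisited.
    for pdf_path in sorted(Path(source_dir).rglob("*.pdf")):
        try:
            text = extract_text(pdf_path)
        except Exception as error:  # one unreadable file should not stop the run
            print(f"skipped {pdf_path}: {error}")
            continue
        if needle in text:
            shutil.copy2(pdf_path, target_dir / pdf_path.name)
            flagged.append(pdf_path)
    return flagged
```

Splitting the work across machines then only means giving each one a different source_dir.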

Could this have been handled differently? Certainly, and a locally-distributed Python file was probably not the best tool for the job. But we only had hours to spare, the actual script took maybe an hour to write and I was in that weird state of being both tired and panicked – and it worked out fine.

Mass-renaming files

Renaming files is usually simple when you need the same pattern applied to all of them. I often use a Windows tool called batch renamer that is quite powerful with all kinds of prefixes, suffixes and even RegEx support – but it is not very flexible.

If you need dynamic components in this renaming process, with some if conditions and grouping abilities, then Python is clearly your friend. You simply write something like "loop through all files in this folder" and then a series of conditions that determine whether this particular file needs to be renamed and how.

Combined with RegEx magic this allows some very detailed rulesets, all in a matter of minutes.
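As a rough sketch of what such a ruleset can look like: the naming scheme below (a name ending in _dd.mm.yyyy) and the rules themselves are invented for illustration, as are the function names new_name_for and rename_all.

```python
import re
from pathlib import Path

# Hypothetical naming scheme: "<anything>_<dd>.<mm>.<yyyy><.ext>"
DATE_NAME = re.compile(
    r"^(?P<stem>.+)_(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{4})(?P<ext>\.\w+)$"
)


def new_name_for(name):
    """Return the new file name, or None if the file should be left alone."""
    match = DATE_NAME.match(name)
    if match is None:
        return None  # does not follow the scheme
    if match["ext"].lower() != ".pdf":
        return None  # rule: only PDFs get renamed
    if "draft" in match["stem"].lower():
        return None  # rule: drafts stay untouched
    # Move the date to the front in sortable year-month-day order.
    return f"{match['year']}-{match['month']}-{match['day']}_{match['stem']}{match['ext']}"


def rename_all(folder, dry_run=True):
    """Apply new_name_for to every file in folder; return the (old, new) pairs."""
    planned = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        new_name = new_name_for(path.name)
        if new_name is None or new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            continue  # never overwrite an existing file
        planned.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return planned
```

Running it once with dry_run=True and reading the returned pairs is a cheap way to catch a bad rule before any file is touched.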

Mass-deleting files

Here, we have a split between PowerShell and Python as my favorite ways to handle mass-deletion. PowerShell usually does this pretty well, and fast – I just hate dealing with its syntax. PowerShell looks a lot like PHP to me, and god knows I still have wartime flashbacks every time I see a $ sign in front of a variable name. Also, PowerShell has this annoying way of using obscure flags and syntax that you need to Google before you understand them – and Python don’t be like that.

Much like the renaming, you can easily do something like "loop files, check if conditions are met, delete file". Add a try/except for when your coworkers are manually looking at the files you’re trying to delete despite your email to not touch that folder, and you save yourself some grey hair and can fix things later that evening when they shut down their computers.
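A minimal sketch of that loop, with a hypothetical function name delete_matching and a should_delete callback standing in for whatever conditions apply. Files that cannot be deleted, for example because someone has them open, are collected instead of aborting the run.

```python
from pathlib import Path


def delete_matching(folder, should_delete):
    """Delete every file under folder for which should_delete(path) is true.

    Returns (deleted, failed); failed holds (path, error) pairs to retry later.
    """
    deleted, failed = [], []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or not should_delete(path):
            continue
        try:
            path.unlink()
            deleted.append(path)
        except OSError as error:
            # PermissionError is a subclass of OSError; on Windows it is
            # typically raised when another program still has the file open.
            failed.append((path, error))
    return deleted, failed
```

A call such as delete_matching(folder, lambda p: p.suffix == ".tmp") removes all .tmp files, and the failed list tells you what to retry in the evening.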

Ordering and zipping files

Zipping files by hand is pretty painful if you have to do it more than once or twice. It takes forever, it is repetitive, and I break down crying when I have to do it too often for comfort, but not often enough to warrant an actual automation.

But sometimes, you need to zip hundreds of files in one go, and then it starts making sense to write a little "if file.endswith(fileextension): zip.write(file)" – and my sanity is saved.
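Spelled out with the standard library's zipfile module, that one-liner becomes something like the sketch below. The function name zip_by_extension and its parameters are made up for illustration.

```python
import zipfile
from pathlib import Path


def zip_by_extension(folder, extension, archive_path):
    """Add every file under folder whose name ends with extension to a new zip."""
    folder = Path(folder)
    archive_path = Path(archive_path)
    added = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if not path.is_file() or not path.name.endswith(extension):
                continue
            if path.resolve() == archive_path.resolve():
                continue  # do not pack the archive into itself
            # Store paths relative to folder so the zip keeps the subfolder layout.
            arcname = path.relative_to(folder).as_posix()
            archive.write(path, arcname=arcname)
            added.append(arcname)
    return added
```

Note that the "w" mode replaces an existing archive with the same name.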

Takeaway: Python serves me well once in a blue moon

While complicated things can be built with Python, and easy things can be automated using C#, I just love the way each of them leans in a particular direction, and using them for their main strengths gives me warm fuzzy feelings in an otherwise cold, hard reality.


