Thursday 29 January 2015

Released: a Python script for renaming English-subbed AKBingo! files


AKBingo! is a weekly AKB48 variety show hosted by the owarai duo Bad Boys. AKB48's sister groups also take part in the show. And oh, I love Japanese variety shows.

As of writing, the show has more than 300 episodes, and several fansub groups have subtitled different episodes of it.

Here is a list of some of the filenames I have collected so far. Most are hardsubbed files. Some I downloaded via the Hello!Online tracker or by direct download from the fansub groups; others I ripped from streaming sites using youtube-dl or JDownloader when there was no other choice.

081029 AKBINGO! episode 5 eng sub - _H264-1280x720.mp4
100303 [Aidol+H!F] AKBINGO! ep73 SP.mp4
130130 [UKN48 x AIDOL] AKBINGO! ep223.mp4
130320 [Aidol+H!F] AKBINGO!.mp4
140311 Emergency investigation -  Is Miki Nishino really hetare_ - AKBINGO!  ep279 [English subtitles]_H264-1280x720.mp4
AKBINGO! episode 137 eng sub - 2011.06.01_H264-1280x720.mp4

Above are sample names of some of the files I have. As you can see, the problem is that they follow different naming schemes. I also noticed that I had already started renaming some of the files by hand but got the dates wrong. Renaming the files manually is time-consuming and prone to human error.

That's why I decided to write a Python script to rename the files.

Luckily, one of the fansubbers, Half-san, made a master list of the subbed episodes of the AKBingo! variety show. It is a "live" list: it is updated regularly whenever newer subbed episodes become available.

After spending almost a day on it (more or less two nights, to be exact), I finished coding this Python script, which renames the files' chaotic, inconsistent filenames.

From these cluttered, chaotic filenames to...
...these clean and organized filenames
Here is more or less what the Python script does:
  • Scrapes Half-san's masterlist using Beautiful Soup and converts it into JSON objects
  • Matches each video file's name against the scraped list by either date or episode number
  • Renames the file in this format: <Date> - AKBingo! - <ep#> - <Title>.<extension>
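
The matching and renaming steps above could look roughly like the sketch below. This is not the actual script: the regular expressions, the `MASTERLIST` sample entries (with placeholder titles), and the helper names `extract_keys`, `find_entry` and `new_name` are all assumptions made for illustration. In the real script the list would come from the scrape, and the exact form of <Date> and <ep#> may differ.

```python
import os
import re

# Hypothetical stand-in for the scraped masterlist. The dates and episode
# numbers are taken from the sample filenames; the titles are placeholders.
MASTERLIST = [
    {"date": "081029", "episode": 5, "title": "Placeholder title A"},
    {"date": "100303", "episode": 73, "title": "Placeholder title B"},
    {"date": "110601", "episode": 137, "title": "Placeholder title C"},
]

LEADING_DATE_RE = re.compile(r"^(\d{6})\b")                 # e.g. 081029
DOTTED_DATE_RE = re.compile(r"(\d{4})\.(\d{2})\.(\d{2})")   # e.g. 2011.06.01
EPISODE_RE = re.compile(r"(?:episode|ep)\s*(\d+)", re.IGNORECASE)


def extract_keys(filename):
    """Return (date, episode) found in a filename; either may be None."""
    date = None
    leading = LEADING_DATE_RE.match(filename)
    dotted = DOTTED_DATE_RE.search(filename)
    if leading:
        date = leading.group(1)
    elif dotted:
        year, month, day = dotted.groups()
        date = year[2:] + month + day  # normalise to YYMMDD
    episode_match = EPISODE_RE.search(filename)
    episode = int(episode_match.group(1)) if episode_match else None
    return date, episode


def find_entry(filename, masterlist):
    """Look up a masterlist entry by episode number first, then by date."""
    date, episode = extract_keys(filename)
    if episode is not None:
        for entry in masterlist:
            if entry["episode"] == episode:
                return entry
    if date is not None:
        for entry in masterlist:
            if entry["date"] == date:
                return entry
    return None


def new_name(filename, masterlist):
    """Build '<Date> - AKBingo! - <ep#> - <Title>.<extension>' or None."""
    entry = find_entry(filename, masterlist)
    if entry is None:
        return None
    extension = os.path.splitext(filename)[1]
    title = entry["title"].replace("/", "_")  # "/" is not allowed in filenames
    return "{} - AKBingo! - {} - {}{}".format(
        entry["date"], entry["episode"], title, extension)
```

Files that match nothing in the list return `None`, so they can be left untouched and reviewed by hand.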

The Python script has some weaknesses, though, and some of them are probably bugs.
  • I only tested it on my Linux machine - I only have Linux on my computers. The distribution I used is Linux Mint 13 Xfce (an Ubuntu 12.04 derivative).
  • It needs Python, and Beautiful Soup for Python must also be installed. Some extra steps are needed on operating systems other than Linux (most popular Linux distributions come with Python pre-installed).
  • It has to be run at least twice - at lines 102 and 103 of the script you have to switch which line is commented out on each run (i.e. after the first run, comment out line 102 and uncomment line 103).
  • It doesn't seem to rename all of the files in one go even when only one of line 102 or line 103 is active, so I had to run the script a second time.
  • It throws an error (or just mysteriously deletes the file) if the same file exists under different filenames.
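
A likely, though unconfirmed, explanation for the last point: when two source files map to the same new name, `os.rename` on Linux silently replaces an existing destination file, while on Windows it raises an error instead. The sketch below shows one way to guard against that. The helper name `safe_rename` is hypothetical and not part of the script.

```python
import os


def safe_rename(source, target):
    """Rename source to target without overwriting an existing file.

    Returns True if the file was renamed, False if it was skipped.
    """
    if os.path.abspath(source) == os.path.abspath(target):
        return False  # the file already has the desired name
    if os.path.exists(target):
        # A file with the new name is already there (probably a duplicate
        # episode), so leave the source alone instead of clobbering it.
        print("Skipping {!r}: {!r} already exists".format(source, target))
        return False
    os.rename(source, target)
    return True
```

With a check like this, duplicate copies of an episode stay in the folder under their old names and can be sorted out manually.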

Below are some of the things I want to add to the script in the future.
  • Add fansub group tags at the end - I will use this masterlist to get the name of the fansub group per episode file
  • Allow adding link reference keys to the filenames - Hello!Online tracker, DailyMotion and YouTube links have easy-to-use reference keys. Or I could just automate shortening some of the links via the goo.gl URL shortener

I also plan to write another Python script that checks which episodes have already been subbed and are available but are still missing from my collection.
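
A minimal sketch of how such a check might work, assuming the files have already been renamed to the `<Date> - AKBingo! - <ep#> - <Title>.<extension>` format with a plain number as `<ep#>`. The function name `missing_episodes` and its inputs are hypothetical.

```python
import re

# Matches the episode number in an already-renamed file,
# e.g. "081029 - AKBingo! - 5 - Some title.mp4" (assumed format).
RENAMED_RE = re.compile(r" - AKBingo! - (\d+) - ")


def missing_episodes(subbed_episodes, filenames):
    """Return the subbed episode numbers that no local file covers."""
    owned = set()
    for name in filenames:
        match = RENAMED_RE.search(name)
        if match:
            owned.add(int(match.group(1)))
    return sorted(set(subbed_episodes) - owned)
```

Here `subbed_episodes` would be the episode numbers from the masterlist and `filenames` the result of listing the video folder, for example with `os.listdir`.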

I will probably test the script on a Windows machine and update this tutorial for Windows users.

Anyway, here is the Python script I made. Before using it, you have to specify the working directory at line 122. Feel free to edit it to your liking, especially the naming scheme at line 112.

If you have questions, comments or corrections, feel free to post them in the Disqus comments section.