In this tutorial, we will present a very simple application built using the latest Tensorflow Object Detection API. The Object Detection API is, as of now, one of the easiest ways to build computer vision applications for image recognition.
At our company, Redeem Systems, we have extensive expertise in engineering, networking, and analytics. We are constantly building machine learning applications that solve real-world problems, improve people’s lives, and move us towards a better world.
A note for advanced users: you may skip any part of this tutorial where you feel that I have taken it too slowly. The idea of this blog post is to give even beginners a clear understanding of what’s going on and how it all works.
Without further delay, let’s jump straight into defining the problem and what we are going to do. In this tutorial, we are going to use the Tensorflow Object Detection API to create a small application in Python, with a GUI built in Tkinter, Python’s GUI toolkit. Though this blog is quite easy to understand, I would still recommend that you go through the Tensorflow and Object Detection API documentation, and also their GitHub page, to get a clear understanding of how it works.
Python is probably the most important language to learn because of its rich ecosystem. Python’s major advantage is its breadth. For example, R can run Machine Learning algorithms on a preprocessed dataset, but Python is much better at processing the data.
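I have used Python 2.7.*, and the code in this post is written for that version. I would recommend that you do not go for Python 3 while following along, as the syntax differs (for example, print is a statement here) and some modules have different names (Tkinter and tkFileDialog are spelled differently in Python 3).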
Programming IDE (optional; you can also use any code editor): PyCharm Community Edition
This is not strictly needed. If you are just beginning with Python, though, PyCharm will give you a good start by pointing out the syntax errors you are making, and it helps you manage your packages. Download the installer for your OS from the link below
Machine Learning Library: Tensorflow
This used to be a little tricky to install for first-timers, as it asks you to install several dependencies. You might also need the pip installer, which comes by default with Python, although you might need to update it. It was a bit of a hassle (I spent 4-5 hours on the installation, if I remember correctly), but it is definitely worth it once you are done. Update: it is now quite easy to install Tensorflow using the pip package installer. You only need to run one command, and it will do all of the work for you. Still, go through the installation instructions, as some of the dependencies might not get installed properly.
Python libraries: Tkinter, MoviePy, NumPy and matplotlib
You need the Tkinter GUI package. It is usually pre-installed; if not, you can install it using a single command in your terminal. You will also need a few additional packages: ‘moviepy’ for video editing, and NumPy and matplotlib for numerical calculations and plotting.
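# Type these commands in your terminal.
# Tkinter normally ships with Python and is not installed through pip;
# if it is missing on Debian/Ubuntu, install it with apt-get instead.
sudo apt-get install python-tk
sudo pip install moviepy
sudo pip install numpy matplotlib

Install Git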
You will also need git for cloning a few repositories. The installation method differs from OS to OS, but it is quite easy to install.
Since you have gotten this far, let’s get started by cloning the Tensorflow models repository using git.
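Before moving on, it can help to confirm that the prerequisites listed above are actually in place. The following is only a minimal sketch of such a check, not part of the original application: the helper names missing_modules and git_available are hypothetical, and the module list assumes the Python 2 name Tkinter used in this post.

```python
from __future__ import print_function

import importlib
import subprocess


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # A broken install can fail with errors other than ImportError,
            # so any failure at import time is reported as "missing".
            missing.append(name)
    return missing


def git_available():
    """Return True if running `git --version` succeeds."""
    try:
        subprocess.check_output(["git", "--version"])
        return True
    except (OSError, subprocess.CalledProcessError):
        return False


if __name__ == "__main__":
    # "Tkinter" is the Python 2 module name; under Python 3 it is "tkinter".
    wanted = ["tensorflow", "moviepy", "numpy", "matplotlib", "Tkinter"]
    print("Missing modules:", missing_modules(wanted) or "none")
    print("git found:", git_available())
```

If anything shows up as missing, go back to the corresponding installation step before continuing.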
git clone https://github.com/tensorflow/models
Here you will find a folder by the name of object_detection. You will notice that it already contains an IPython notebook (object_detection_tutorial.ipynb). This is a very good example of how to use the Object Detection API.
We are going to modify this same file, adding our own code for object detection in videos, and finally turn it into a very minimal GUI application in Tkinter. Most of the code will therefore be the same as in the notebook, except that we add our own code at the end so that it takes a video and classifies the objects in it.
Please note that you might need a basic understanding of how the Tensorflow Object Detection API works. I’m currently using the lightest model, which is also the one used in the tutorial IPython notebook. For new users, it is better to clone the Tensorflow models repository and place your code there itself, so that you don’t have any problems with the environment setup.
You also need to save the IPython notebook code as a .py file, and in order to do that, you only need to remove a single line from the notebook.
So, here is the code that needs to be added to the .py file we created (please make sure that you clone the Tensorflow models repository and save the .py file to the same folder as the .ipynb notebook, so that you don’t have any dependency nightmares):
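If you would rather do the notebook-to-script step programmatically, here is a rough sketch of one way to go about it. It assumes the notebook is stored in the nbformat 4 JSON layout (a top-level "cells" list), and it drops every line that starts with % or !, since those are IPython-only commands that plain Python cannot run. The function name notebook_to_script and the demo notebook are hypothetical and only there for illustration.

```python
from __future__ import print_function

import json


def notebook_to_script(notebook_json):
    """Return the code cells of an nbformat-4 notebook as one source string.

    Lines starting with '%' or '!' (IPython-only syntax) are dropped.
    """
    notebook = json.loads(notebook_json)
    lines = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", [])
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            if line.lstrip().startswith(("%", "!")):
                continue
            lines.append(line)
        lines.append("")  # blank line between cells
    return "\n".join(lines)


if __name__ == "__main__":
    # A tiny made-up notebook, only to show the behaviour.
    demo = json.dumps({"cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code",
         "source": ["%some_magic\n", "import os\n", "print(os.getcwd())"]},
    ]})
    print(notebook_to_script(demo))
```

To use it on the real notebook, read the .ipynb file as text, pass it to the function, and write the result to your .py file.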
import sys

# You have to give the path of the video file via the command line
arguments = sys.argv[1:]
for x in sys.argv[1:]:
    video_x = x

from moviepy.editor import VideoFileClip


# detection_graph, tf and detect_objects are expected to be defined
# earlier in this .py file
def process_image(image):
    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            image_process = detect_objects(image, sess, detection_graph)
            return image_process


white_output = 'output_video/video_out.mp4'
# subclip here lets you decide the timeline in the video that you want to use.
# subclip(0,8) means 0 to 8 seconds
clip1 = VideoFileClip(video_x).subclip(0,8)
white_clip = clip1.fl_image(process_image)  # NOTE: this function expects color images
white_clip.write_videofile(white_output, audio=False)
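The script above takes whatever comes in on the command line and writes into the output_video folder. As a small sketch of how that input handling could be made a little more defensive, the helper below checks that exactly one existing file was passed and creates the output folder if it is not there yet. The name prepare_paths and its parameters are hypothetical; the folder and file names are simply the ones used in the code above.

```python
import os


def prepare_paths(argv, output_dir="output_video", output_name="video_out.mp4"):
    """Validate the input video path and make sure the output folder exists.

    `argv` is the argument list without the script name (sys.argv[1:]).
    Returns a tuple (input_path, output_path).
    """
    if len(argv) != 1:
        raise ValueError("expected exactly one argument: the path to a video file")
    input_path = argv[0]
    if not os.path.isfile(input_path):
        raise ValueError("not a file: %s" % input_path)
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)
    return input_path, os.path.join(output_dir, output_name)
```

With this in place, video_x and white_output could be obtained with video_x, white_output = prepare_paths(sys.argv[1:]). Unlike the loop above, which silently keeps the last argument, this sketch refuses to continue when more than one argument is given.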
Now, save this file as a .py file; in my case, I have saved it as test2.py. Next, we create another file, tkinter.py, which will be the GUI that we use to upload the video file and run the classification on it. The code for the Tkinter GUI is also very easy to understand. What I’m showing here is only a small snapshot, just enough to get you started. You can contact me directly via the comments if you want to understand more.
from Tkinter import *
# if you are working under Python 3, comment out the previous line and uncomment the following line
#from tkinter import *
import tkFileDialog
import os
import time


class App:
    def __init__(self, master):
        w1 = Label(root, justify=CENTER, text="VIDEO OBJECT CLASSIFIER")
        w1.pack()
        frame = Frame(master)
        frame.pack()
        self.button = Button(frame, text="Quit", fg="red", command=quit)
        self.button.pack(side=LEFT)
        self.slogan = Button(frame, text="Upload", command=self.browse_video)
        self.slogan.pack(side=LEFT)

    def browse_video(self):
        fname = tkFileDialog.askopenfilename(filetypes=(("Template files", "*.mp4"), ("All files", "*")))
        # close the GUI window before the detection script starts
        root.destroy()
        print fname
        os.system('python test2.py ' + fname)


root = Tk()
app = App(root)
root.mainloop()
We have arranged the code so that when a new tab or Python GUI instance starts, the previous one closes. If this isn’t done, the old window will not be operable and will just be stuck on your screen. Another way to fix it would be to open the new GUI instance exactly on top of the old one, so that the old one is not visible. In this case, we don’t really need the older instance, and hence we close it using root.destroy() before the detection script is launched.
This application has to be started by running the tkinter.py file from the terminal every time, and its GUI is very minimal and could be made better. Still, it pretty much serves my purpose and tags the objects in the videos, and the idea was to show a very minimal app. I will come up with a complete GUI in a few days. For now, here is the output for the video that I uploaded
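One thing to watch out for in the Upload handler: it joins the chosen path onto a shell command string, so a file name containing spaces would be split into several arguments, and cancelling the file dialog would still launch the script with an empty path. A possible alternative, sketched below under those assumptions, passes the path as a single list element through subprocess. The helper names build_command and run_detection are hypothetical; test2.py is the script name used earlier in this post.

```python
import subprocess
import sys


def build_command(video_path, script="test2.py"):
    """Return the argument list for the detection script.

    Returns None when no file was chosen (the dialog was cancelled).
    """
    if not video_path:
        return None
    # Each list element reaches the script as one argument, spaces included.
    return [sys.executable, script, video_path]


def run_detection(video_path, script="test2.py"):
    """Run the detection script and return its exit code, or None if skipped."""
    command = build_command(video_path, script)
    if command is None:
        return None
    return subprocess.call(command)
```

Inside browse_video, the os.system line could then be replaced with run_detection(fname). Using sys.executable also means the script runs with the same Python interpreter as the GUI.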