Object Detection with an Interactive Telegram Bot

Still in progress... By the way: does the AI predict that you are a person? 😱

Creating an object detection AI is generally a long process and, for those with little or no experience, a confusing one. Here we will take a few shortcuts to build an interactive chat agent. The aim is to make the bot reply to a message with an attached photo by sending back the detection results.

For that, we will get one (or more) of the pre-trained models available for download from the model zoo, which is a collection of trained neural networks. With a few tailored TensorFlow utility files, they are ready to use for object detection tasks right away.

On top of that, we will write some code for the Telegram bot using the so-called polling method. The other option is a webhook, which is far more efficient but requires more resources, such as a web hosting service. In simple terms, polling regularly asks the server for new messages, while a webhook just waits for the server to push them. Polling is chosen here because it works easily from an internet-connected PC at home.
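To make the idea of polling concrete, here is a minimal sketch of the loop in plain Python. It does not talk to Telegram at all: `FakeServer` and `poll` are hypothetical names invented for this illustration, and the only detail borrowed from the real Bot API is that each update carries an `update_id` and that the next request asks for ids starting one past the last update seen.

```python
import time


class FakeServer:
    """In-memory stand-in for the Telegram server (hypothetical, for illustration)."""

    def __init__(self, updates):
        self._updates = list(updates)

    def get_updates(self, offset):
        # Mimics getUpdates: only updates with update_id >= offset are returned.
        return [u for u in self._updates if u['update_id'] >= offset]


def poll(server, handle, rounds, delay=0.0):
    """Ask the server for new updates `rounds` times and pass each one to `handle`."""
    offset = 0
    for _ in range(rounds):
        for update in server.get_updates(offset):
            handle(update)
            # Confirm the update so that it is not delivered again.
            offset = update['update_id'] + 1
        time.sleep(delay)  # the pause between two checks
    return offset


server = FakeServer([
    {'update_id': 10, 'text': 'hello'},
    {'update_id': 11, 'text': 'a photo'},
])
seen = []
next_offset = poll(server, seen.append, rounds=3)
```

Even though the loop asks three times, each message is handled only once, because the offset moves forward after every update. The library we use later does this bookkeeping for us.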

About Telegram Bots

A bot is an agent that performs the tasks it is assigned, and no more. On an instant messaging service like Telegram, a bot can act as a calculator, a graphics creator, or a search engine, as @wiki does. It can also conduct polls, run some games, or provide weather information. A more sophisticated role would be customer service, if the bot is backed by sufficiently capable NLP.

Creating a Bot

 

Single Shot Detection

The technique used for detecting objects in the image is called Single Shot Detection, commonly referred to as SSD. The process is relatively quick: on my home PC it runs at almost real time on MP4 files, which makes handling a video stream feasible.

However, before going that far, let's start small with some image detection first. 😉
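To give a feel for what comes out of such a detector, here is a small sketch of the post-processing step. It assumes the output convention of the TensorFlow Object Detection API, where every box is given as normalized `[ymin, xmin, ymax, xmax]` values between 0 and 1 with one confidence score per box. The function name `to_pixel_boxes` and the 0.5 threshold are my own choices for this illustration, not part of any library.

```python
def to_pixel_boxes(boxes, scores, image_width, image_height, min_score=0.5):
    """Keep confident detections and convert them to pixel coordinates.

    Returns (left, top, right, bottom) tuples, the order that Pillow's
    ImageDraw.rectangle accepts.
    """
    kept = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # drop detections the model is not confident about
        kept.append((
            round(xmin * image_width),
            round(ymin * image_height),
            round(xmax * image_width),
            round(ymax * image_height),
        ))
    return kept


# Two made-up detections on a 640x480 image; only the first is confident enough.
boxes = [(0.1, 0.2, 0.5, 0.6), (0.0, 0.0, 1.0, 1.0)]
scores = [0.9, 0.3]
pixel_boxes = to_pixel_boxes(boxes, scores, image_width=640, image_height=480)
```

Raising `min_score` gives fewer but more reliable boxes, while lowering it shows more guesses.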

Installation

Installing the Telegram bot library is obviously required, followed by some image processing packages, which are used among other things to draw the bounding boxes on the output image. We can use pip, conda, or whatever works for our purpose.

pip install "python-telegram-bot<12"  # the code below uses the older (bot, update) callback style
pip install pillow
pip install matplotlib
pip install tensorflow
 
Utilities Files

These files will be used for detection. They come from their respective libraries, with some modifications to keep them lightweight for deployment:

  1. visualization_utils.py
  2. label_map_util.py
  3. string_int_label_map_pb2.py
  4. frozen_inference_graph.pb
  5. mscoco_label_map.pbtxt
  6. arial.ttf

The first three files are simplified versions of files from the TensorFlow library. The .pb file is the model file, containing the network's layers and their corresponding weights. Its output only becomes useful to us together with the class labels in the .pbtxt file. The last one, as you might have guessed, is a font file that keeps the results easily readable.
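To see why the label file matters: the model only returns numeric class ids, and the .pbtxt file maps them to readable names. The sketch below reads such a map with plain regular expressions instead of the protobuf-based label_map_util.py. It assumes the usual `item { name, id, display_name }` layout of TensorFlow Object Detection API label maps. The two sample entries are illustrative, and `parse_label_map` and `names_for` are hypothetical helper names.

```python
import re

# Illustrative excerpt in the style of mscoco_label_map.pbtxt.
SAMPLE_PBTXT = '''
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
'''

ITEM_RE = re.compile(r'item\s*\{(.*?)\}', re.DOTALL)


def parse_label_map(text):
    """Return a dict that maps each class id to its display name."""
    labels = {}
    for body in ITEM_RE.findall(text):
        id_match = re.search(r'\bid:\s*(\d+)', body)
        name_match = re.search(r'display_name:\s*"([^"]*)"', body)
        if id_match and name_match:
            labels[int(id_match.group(1))] = name_match.group(1)
    return labels


def names_for(class_ids, labels):
    """Translate the numeric ids returned by the model into readable names."""
    return [labels.get(int(class_id), 'unknown') for class_id in class_ids]


labels = parse_label_map(SAMPLE_PBTXT)
detected = names_for([1, 2, 99], labels)
```

An id that is missing from the map is reported as 'unknown' rather than raising an error, which is convenient while experimenting.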

File number 4 can be downloaded from TensorFlow's site here (142 MB) or here for the lite version (49 MB), while the others are available on my GitHub here. Extract the compressed file from the first link and pick the file with the matching name. Please put all the files in a subdirectory called imgutils (if they aren't there already). With that, this section is finished. 🙂
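Before moving on, it may help to confirm that everything landed in the right place. This is a small sketch using only the standard library. The file names come from the list above, while the helper name `missing_files` is made up for this example.

```python
from pathlib import Path

REQUIRED_FILES = [
    'visualization_utils.py',
    'label_map_util.py',
    'string_int_label_map_pb2.py',
    'frozen_inference_graph.pb',
    'mscoco_label_map.pbtxt',
    'arial.ttf',
]


def missing_files(directory='imgutils'):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling `missing_files()` from the project folder returns an empty list when all six files are in place, and otherwise lists the ones that still need to be copied.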

 

Let's start with the necessary modules. Before anything else, create a main file, let's just call it main.py, and put these lines at the very top of it.

from telegram import Bot
from telegram.ext import MessageHandler, Filters, Updater, CommandHandler
import ssd

Afterwards, we need to initialize the bot along with the updater and dispatcher. To put it simply, an updater regularly checks for incoming messages and a dispatcher then handles them.

mybot = Bot('numbers:random_characters')  # put your bot's token here!
updater = Updater(bot=mybot)
dispatcher = updater.dispatcher

And now for the first big thing: the predict function, which prepares the image file and sends it to the SSD algorithm. In the first part, the user's id is assigned to the variable uid, followed by a cosmetic "typing" indicator.

The second part obtains the image file sent by the user. Finally, the third part passes it to the SSD and sends the annotated photo back to the user. Optionally, both the original and the result files can be removed once the process finishes.

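import os

def predict(bot, update):
    # first part: preparation
    uid = update.message.chat.id
    bot.send_chat_action(uid, 'typing')  # shows "typing..." while we work

    # second part: obtain image (the last entry in photo is the largest size)
    file = bot.get_file(update.message.photo[-1].file_id)
    original = 'image-{}.jpg'.format(uid)
    file.download(original)

    # third part: process and respond
    result = ssd.detect(original)
    with open(result, 'rb') as photo:  # close the file before deleting it
        bot.send_photo(uid, photo=photo)

    # optionally remove the files to keep storage usage low
    os.remove(original)
    os.remove(result)

# route incoming photo messages to predict
dispatcher.add_handler(MessageHandler(Filters.photo, predict))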

Seems easy? 😏

Well, the 🐘 in the room hasn't been mentioned yet. Let's see how we handle the incoming images with a short(ened) SSD algorithm.

Simple Math Bot

At this point, you are already able to deploy a functioning bot. Let's use this simple code to test the bot!

import re
from time import sleep

def math(bot, update):
    uid = update.message.chat.id
    text = update.message.text
    standard_error = 'Not a valid math!'

    # accept only "<integer> <operator> <integer>", e.g. "12 * 3"
    match = re.match(r'^([0-9]+) ?([-+*/]) ?([0-9]+)$', text)
    # division by zero is rejected as well, instead of crashing the handler
    if not match or (match.group(2) == '/' and int(match.group(3)) == 0):
        bot.send_message(uid, text=standard_error)
        return

    bot.send_chat_action(uid, 'typing')
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))

    if op == '+':
        res = a + b
    elif op == '-':
        res = a - b
    elif op == '*':
        res = a * b
    else:
        res = a / b
    sleep(0.5)  # just to make it a little dramatic... ;)
    bot.send_message(uid, text=str(res))

dispatcher.add_handler(MessageHandler(Filters.text, math))

# start polling for messages and keep the script running until interrupted
updater.start_polling()
updater.idle()
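The arithmetic itself can also be checked without a running bot. Here is a sketch of the same idea as a plain function, using a lookup table of operators instead of chained conditions. The name `evaluate` and the choice to return None for invalid input (including division by zero) are assumptions made for this example.

```python
import operator
import re

EXPRESSION = re.compile(r'^([0-9]+) ?([-+*/]) ?([0-9]+)$')
OPERATIONS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}


def evaluate(text):
    """Return the result of '<int> <op> <int>', or None if the text is not valid."""
    match = EXPRESSION.match(text)
    if match is None:
        return None
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == '/' and right == 0:
        return None  # treat division by zero as invalid input
    return OPERATIONS[op](left, right)
```

A handler could then reply with the standard error message whenever `evaluate` returns None, and with `str(result)` otherwise.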