Object Detection with an Interactive Telegram Bot
Still in progress... By the way, does the AI predict that you are a person? 😱
Generally, creating an object detection AI is a long process and, for those with little or no experience, a confusing one. Here we will take some shortcuts to build an interactive chat agent. The aim is to make the bot respond to a photo-attached message with the detection results.
For that, we will get one (or more) of the pretrained models that can be downloaded from the model zoo, which is a collection of trained neural networks. With some tailored TensorFlow library code, they will be instantly usable for object detection tasks.
On top of that, we will write some code for the Telegram bot using the so-called polling method. The other option is a webhook, which is far more efficient but requires more resources, such as a web hosting service. In simple words, polling checks for new messages regularly, while a webhook just waits to be notified. Polling is chosen here since it can easily be done from an internet-connected PC at home.
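To make the polling idea concrete, here is a minimal sketch of a polling loop. It does not talk to Telegram: `fetch_updates` is a hypothetical stand-in for the Bot API's `getUpdates` call, and `poll_once`, `poll`, `fake_fetch`, and `INBOX` are names made up for illustration. The one real detail it mirrors is that each request asks only for updates after the highest `update_id` already seen, so no message is handled twice.

```python
import time


def poll_once(fetch_updates, handle, offset):
    """Fetch updates with id >= offset, handle each, return the next offset."""
    for update in fetch_updates(offset):
        handle(update)
        # Ask for the following update next time, so none is handled twice.
        offset = max(offset, update["update_id"] + 1)
    return offset


def poll(fetch_updates, handle, rounds, interval=0.0):
    """Call poll_once `rounds` times, sleeping `interval` seconds in between."""
    offset = 0
    for _ in range(rounds):
        offset = poll_once(fetch_updates, handle, offset)
        time.sleep(interval)
    return offset


# Hypothetical in-memory "server" standing in for Telegram.
INBOX = [
    {"update_id": 1, "text": "hello"},
    {"update_id": 2, "text": "2 + 2"},
]


def fake_fetch(offset):
    """Return the stored updates whose id is at least `offset`."""
    return [u for u in INBOX if u["update_id"] >= offset]


seen = []
next_offset = poll(fake_fetch, lambda u: seen.append(u["text"]), rounds=3)
```

Most rounds of such a loop find nothing new, which is why polling is the less efficient option. A webhook inverts the flow: Telegram contacts your server when a message arrives, which in turn requires a publicly reachable HTTPS endpoint.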
A bot is an agent that performs the things it is assigned to do, and no more. On an instant messaging service like Telegram, a bot can act as a calculator, a graphics creator, or a search engine, as @wiki does. It can also conduct polls, run some games, or act as a weather info provider. A more sophisticated role would be customer service, if powered by sufficiently smart NLP capabilities.
The technique used for detecting objects in the image is the Single Shot MultiBox Detector, commonly referred to as SSD. The process is relatively quick: on my home PC it runs in almost real time on MP4 files, which makes it possible to handle a video stream.
However, before thinking of going that far, let's start small with some image detection first. 😉
Installing the Telegram bot library is obviously required, followed by some image processing packages, which handle tasks such as drawing the bounding boxes on the output image. We can use pip, conda, or whatever works for our purpose.
pip install python-telegram-bot
pip install pillow
pip install matplotlib
pip install tensorflow
These files will be used for detection purposes. They belong to their respective libraries, with some modifications to make them lightweight for deployment:
- visualization_utils.py
- label_map_util.py
- string_int_label_map_pb2.py
- frozen_inference_graph.pb
- mscoco_label_map.pbtxt
- arial.ttf
The pb file is a model file containing all the layers, the network, and the corresponding weights. Its output requires the classes in the pbtxt file to be useful to us. The last one, as you might have guessed, is a font file that keeps the results easily readable.
The fourth file, frozen_inference_graph.pb, can be downloaded at Tensorflow's site here (142 MB) or here for the lite version (49 MB), while the others are available at my github here. Extract the compressed file from the first link and pick the file with the matching name. Please put all files in a subdirectory called imgutils (if they aren't there already).
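Before moving on, it can help to confirm that everything is in place. The snippet below is a small sketch, assuming the six file names listed above and the imgutils subdirectory; the helper name `missing_files` is made up for illustration.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = [
    "visualization_utils.py",
    "label_map_util.py",
    "string_int_label_map_pb2.py",
    "frozen_inference_graph.pb",
    "mscoco_label_map.pbtxt",
    "arial.ttf",
]


def missing_files(directory="imgutils", required=REQUIRED_FILES):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]
```

Calling `missing_files()` from the project folder returns an empty list when all six files are where the bot expects them.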
With that, this section is finished. 🙂
Let's start with some necessary modules. What you should do before anything else is create a main file, let's just call it main.py, and then put these lines at the very top of the file.
from telegram import Bot
from telegram.ext import MessageHandler, Filters, Updater, CommandHandler
import ssd
Afterwards, we need to initialize the bot along with the updater and dispatcher. Put simply, an updater checks regularly for any incoming messages and a dispatcher then passes them on to the functions that handle them.
mybot = Bot('numbers:random_characters')  # put your bot's token here!
updater = Updater(bot=mybot)
dispatcher = updater.dispatcher
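As a rough mental model of that division of labor, the sketch below shows a toy dispatcher in plain Python. `MiniDispatcher` and the dictionary-shaped updates are made up for illustration and are not part of python-telegram-bot, whose real dispatcher is more elaborate. The updater's half of the job is simply to feed each incoming update to `dispatch`.

```python
class MiniDispatcher:
    """Toy stand-in for a dispatcher: routes an update to the first matching handler."""

    def __init__(self):
        self.handlers = []  # list of (predicate, callback) pairs

    def add_handler(self, predicate, callback):
        """Register a callback to run for updates that satisfy the predicate."""
        self.handlers.append((predicate, callback))

    def dispatch(self, update):
        """Run the first handler whose predicate accepts the update."""
        for predicate, callback in self.handlers:
            if predicate(update):
                return callback(update)
        return None  # no handler was interested in this update


mini = MiniDispatcher()
mini.add_handler(lambda u: "photo" in u, lambda u: "run detection")
mini.add_handler(lambda u: "text" in u, lambda u: "run calculator")

photo_result = mini.dispatch({"photo": ["small.jpg", "large.jpg"]})
text_result = mini.dispatch({"text": "1 + 1"})
other_result = mini.dispatch({"sticker": "🙂"})
```

Here a photo message reaches the detection callback, a text message reaches the calculator callback, and anything else is ignored.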
And now for the first big thing: the predict function, which fetches the image file and sends it to the SSD algorithm. In the first part, the id of the user's chat is assigned to the variable uid, and the bot shows a cosmetic "typing" status. The second part obtains the image file sent by the user. Finally, the third part passes it to the SSD and sends the annotated photo back to the user. Optionally, both the original and the result files can be removed once the process finishes.
import os

def predict(bot, update):
    # first part: preparation
    uid = update.message.chat.id
    bot.send_chat_action(uid, 'typing')

    # second part: obtain the image (photo[-1] is normally the largest size)
    photo_file = bot.get_file(update.message.photo[-1].file_id)
    original = 'image-{}.jpg'.format(uid)
    photo_file.download(original)

    # third part: process and respond
    result = ssd.detect(original)
    with open(result, 'rb') as photo:  # close the file before deleting it
        bot.send_photo(uid, photo=photo)

    # optionally remove the files to keep storage usage low
    os.remove(original)
    os.remove(result)

# run predict for every message that contains a photo
dispatcher.add_handler(MessageHandler(Filters.photo, predict))
Seems easy? 😏
Well, the 🐘 in the room hasn't been mentioned yet. Let's see how we handle the incoming images with a short(ened) SSD algorithm.
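The ssd module itself is not reproduced in this section. As a hedged sketch of one step such a module needs, the snippet below filters raw detections by score and converts box coordinates to pixels. It assumes the output convention of the TensorFlow Object Detection API models, where each box is `[ymin, xmin, ymax, xmax]` normalized to the range 0 to 1. The function name `select_detections`, the 0.5 threshold, and the sample numbers are illustrative choices, not the author's code.

```python
def select_detections(boxes, scores, classes, width, height, min_score=0.5):
    """Keep detections scoring at least min_score; return pixel-space boxes."""
    selected = []
    for (ymin, xmin, ymax, xmax), score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue
        selected.append({
            "class_id": int(cls),
            "score": float(score),
            # (left, top, right, bottom), the order PIL's ImageDraw.rectangle accepts
            "box": (
                round(xmin * width),
                round(ymin * height),
                round(xmax * width),
                round(ymax * height),
            ),
        })
    return selected


# Made-up sample output for a 640x480 image.
boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]]
scores = [0.9, 0.3]
classes = [1, 18]  # in the COCO label map, 1 is "person" and 18 is "dog"

detections = select_detections(boxes, scores, classes, width=640, height=480)
```

With these sample values only the first detection survives the threshold. The class id is what gets looked up in mscoco_label_map.pbtxt to print a readable label next to each box.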
At this point, you are already able to deploy a functioning bot. Let's use this simple code to test the bot!
import re
from time import sleep

def math(bot, update):
    uid = update.message.chat.id
    text = update.message.text
    standard_error = 'Not a valid math!'

    # accept only "<integer> <operator> <integer>", e.g. "12 * 3"
    match = re.match(r'^([0-9]+) ?([-+*/]) ?([0-9]+)$', text)
    # also reject division by zero instead of crashing
    if not match or (match.group(2) == '/' and int(match.group(3)) == 0):
        bot.send_message(uid, text=standard_error)
        return

    bot.send_chat_action(uid, 'typing')
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    if op == '+':
        res = a + b
    elif op == '-':
        res = a - b
    elif op == '*':
        res = a * b
    else:
        res = a / b
    sleep(0.5)  # just to make it a little dramatic... ;)
    bot.send_message(uid, text=str(res))

dispatcher.add_handler(MessageHandler(Filters.text, math))
updater.start_polling()  # start checking for new messages