In the previous blog, we learned about the CTPN algorithm and its architecture in detail. In this blog, we will run the algorithm from its GitHub repository to localize text in an image. We will do this on a Linux operating system.
Clone the Repository
Open a terminal window and clone the CTPN GitHub repository using the following command:
git clone https://github.com/eragonruan/text-detection-ctpn.git
Build the Required Library
The non-maximum suppression (NMS) and bounding box (bbox) utilities are written in Cython. We need to compile them into .so files so that the Python code can import them. First, change the current directory to “text-detection-ctpn/utils/bbox” using the following command:
cd text-detection-ctpn/utils/bbox
Now run the following commands to build the library.
chmod +x make.sh
./make.sh
These commands will generate nms.so and bbox.so in the current directory.
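To give a feel for what the compiled NMS utility is for, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the repository's Cython code: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    # Visit boxes from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first heavily and has a lower score, so it is dropped. The compiled version does the same kind of filtering, only much faster on large numbers of proposals.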
Test the Model
Now we can test the CTPN model. To do so, we first need to download the checkpoints. The GitHub repository links to these checkpoints, and you can download them from Google Drive. Then follow these steps:
- Unzip the downloaded checkpoints.
- Place the unzipped folder “checkpoints_mlt” in the “text-detection-ctpn” directory.
- Put your test images in the data/demo/ folder; the outputs will be generated in the data/res/ folder.
- Your repository folder should now contain checkpoints_mlt/ and data/demo/ (with your test images).
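Before running the demo, the layout described in the steps above can be checked with a short script. This is only a sketch: the helper name `missing_dirs` is hypothetical, and the path `text-detection-ctpn` assumes the script is run from the directory that contains the clone.

```python
from pathlib import Path

# Directories named in the steps above, relative to the repository root.
EXPECTED_DIRS = ["checkpoints_mlt", "data/demo"]


def missing_dirs(repo_root):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Assumption: the clone sits in the current working directory.
    missing = missing_dirs("text-detection-ctpn")
    if missing:
        print("Missing directories:", missing)
    else:
        print("Layout looks complete.")
```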
Now change to the “/text-detection-ctpn” directory (adjust the path to wherever you cloned the repository) and run the demo script from the terminal to test your input images:
cd /text-detection-ctpn
python main/demo.py
Your output will be generated in the data/res folder. Some of the inputs and results are shown below.
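Since the demo localizes text rather than reading it, the results are bounding boxes. If your data/res folder also contains a .txt file per image, the sketch below shows one way to load the boxes for further processing. The line format used here, eight comma-separated corner coordinates followed by a score, is an assumption; open one of your own output files to confirm it before relying on this. The function name `parse_boxes` is hypothetical.

```python
def parse_boxes(text):
    """Parse lines of 'x1,y1,x2,y2,x3,y3,x4,y4,score' into (corners, score) pairs."""
    results = []
    for line in text.splitlines():
        fields = [f for f in line.strip().split(",") if f]
        if len(fields) < 9:
            continue  # skip blank or incomplete lines
        # Assumed layout: four (x, y) corner points, then a confidence score.
        coords = [int(float(v)) for v in fields[:8]]
        corners = list(zip(coords[0::2], coords[1::2]))
        results.append((corners, float(fields[8])))
    return results


if __name__ == "__main__":
    sample = "10,20,110,20,110,60,10,60,0.98\r\n"
    for corners, score in parse_boxes(sample):
        print(corners, score)
```

Each returned pair holds the four corner points of one text region and its score, which you could crop from the image and pass to a separate text recognition step.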
Hope you enjoyed reading.
If you have any doubts or suggestions, please feel free to ask, and I will do my best to help or improve myself. Good-bye until next time.
Referenced Research Paper: Detecting Text in Natural Image with Connectionist Text Proposal Network
Referenced GitHub Code: text-detection-ctpn
Where will I find the predicted text when using this CTPN demo.py? I am only able to see the bounding boxes.
Can we train our own model with CTPN? If yes, then please explain how to do that.