ESP32-CAM Web Server with OpenCV.js: Color Detection and Tracking

This guide introduces OpenCV.js and OpenCV tools for the ESP32 Camera Web Server environment. As an example, we’ll build a simple ESP32 Camera Web Server that includes color detection and tracking of a moving object.


This tutorial is by no means an exhaustive treatment of all that OpenCV can offer to ESP32 camera web servers. It is expected that this introduction will inspire additional OpenCV work with the ESP32 cameras.

This project/tutorial was created based on a project by Andrew R. Sass and edited by Sara Santos.

Introduction

The ESP32 can act as a server for a browser client, and some boards include a camera (for example, the ESP32-CAM), which allows the client to view still images or video in the browser. HTML, JavaScript, and other browser languages can take advantage of the extensive capabilities of the ESP32 and its camera.

Those who have little or no experience with ESP32 camera development boards can start with the following tutorial.

OpenCV.js

As described at docs.opencv.org, “OpenCV.js is a JavaScript binding for a selected subset of OpenCV functions for the web platform”. OpenCV.js uses Emscripten, an LLVM-to-JavaScript compiler, to compile OpenCV functions into an API library that continues to grow.


OpenCV.js runs in the browser, which allows rapid experimentation with OpenCV functions by anyone with a modest background in HTML and JavaScript. Those with experience in ESP32 Camera applications already have that background.

Project Overview

The project we’ll build throughout this tutorial creates a web server that allows color tracking of a moving object. On the web server interface, you can adjust several settings to select the color you want to track. The browser then sends the real-time x and y coordinates of the center of mass of the moving object to the ESP32 board.

ESP32-CAM Color Tracking OpenCVJS Project Overview

Here’s a preview of the web server.

ESP32-CAM Color Tracking Web Server Preview

Prerequisites

Before proceeding with this project, make sure you check the following prerequisites.

Arduino IDE

We’ll program the ESP32 board using Arduino IDE. So, you need the Arduino IDE installed as well as the ESP32 add-on:

VS Code (optional)

If you prefer to use VS Code + PlatformIO to program your board, you can follow the next tutorial to learn how to set up VS Code to work with the ESP32 boards.

Getting an ESP32 Camera

This project is compatible with any ESP32 camera board that features an OV2640 camera. There are several ESP32 camera models out there. For a comparison of the most popular cameras, you can refer to the next article:

Make sure you know the pin assignment for the camera board you’re using. For the pin assignment of the most popular boards, check this article:

Code – ESP32-CAM with OpenCV.js

The program consists of two parts:

  • the server program which runs on the ESP32 Camera
  • the client program which runs on the Chrome browser

The program is split into two files: the OCV_ColorTrack_P.ino file containing the server program and the index_OCV_ColorTrack.h header file containing the client program (HTML, CSS and JavaScript with OpenCV.js).

Create a new Arduino sketch called OCV_ColorTrack_P and copy the following code.

/*********
  The include file, index_OCV_ColorTrack.h, the Client, is an introduction of OpenCV.js to the ESP32 Camera environment. The Client was
  developed and written by Andrew R. Sass. Permission to reproduce the index_OCV_ColorTrack.h file is granted free of charge if this
  entire copyright notice is included in all copies of the index_OCV_ColorTrack.h file.
  
  Complete instructions at https://RandomNerdTutorials.com/esp32-cam-opencv-js-color-detection-tracking/
  
  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files.
  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
*********/

#include <WiFi.h>
#include <WiFiClientSecure.h>
#include "esp_camera.h"
#include "soc/soc.h"
#include "soc/rtc_cntl_reg.h"
#include "index_OCV_ColorTrack.h"

// Replace with your network credentials
const char* ssid = "REPLACE_WITH_YOUR_SSID";
const char* password = "REPLACE_WITH_YOUR_PASSWORD";
 
String Feedback="";
String Command="",cmd="",P1="",P2="",P3="",P4="",P5="",P6="",P7="",P8="",P9="";
byte ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;
//ANN:0
//       AI-Thinker                    
#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

WiFiServer server(80);
//ANN:2
void ExecuteCommand() {
  if (cmd!="colorDetect") {  //Omit printout
    //Serial.println("cmd= "+cmd+" ,P1= "+P1+" ,P2= "+P2+" ,P3= "+P3+" ,P4= "+P4+" ,P5= "+P5+" ,P6= "+P6+" ,P7= "+P7+" ,P8= "+P8+" ,P9= "+P9);
    //Serial.println("");
  }
  
  if (cmd=="resetwifi") {
    WiFi.begin(P1.c_str(), P2.c_str());
    Serial.print("Connecting to ");
    Serial.println(P1);
    long int StartTime=millis();
    while (WiFi.status() != WL_CONNECTED) 
    {
        delay(500);
        if ((StartTime+5000) < millis()) break;
    } 
    Serial.println("");
    Serial.println("STAIP: "+WiFi.localIP().toString());
    Feedback="STAIP: "+WiFi.localIP().toString();
  }    
  else if (cmd=="restart") {
    ESP.restart();
  }
  else if (cmd=="cm"){
    int XcmVal = P1.toInt();
    int YcmVal = P2.toInt();
    Serial.println("cmd= "+cmd+" ,VALXCM= "+XcmVal);
    Serial.println("cmd= "+cmd+" ,VALYCM= "+YcmVal);   
  }
  else if (cmd=="quality") { 
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_quality(s, val);
  }
  else if (cmd=="contrast") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_contrast(s, val);
  }
  else if (cmd=="brightness") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt();  
    s->set_brightness(s, val);  
  }   
  else {
    Feedback="Command is not defined.";
  }
  if (Feedback=="") {
    Feedback=Command;
  }
}

void setup() {
  WRITE_PERI_REG(RTC_CNTL_BROWN_OUT_REG, 0);
  
  Serial.begin(115200);
  Serial.setDebugOutput(true);
  Serial.println();

  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sscb_sda = SIOD_GPIO_NUM;
  config.pin_sscb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_JPEG;
  //init with high specs to pre-allocate larger buffers
  if(psramFound()){
    config.frame_size = FRAMESIZE_UXGA;
    config.jpeg_quality = 10;  //0-63 lower number means higher quality
    config.fb_count = 2;
  } else {
    config.frame_size = FRAMESIZE_SVGA;
    config.jpeg_quality = 12;  //0-63 lower number means higher quality
    config.fb_count = 1;
  }
  
  // camera init
  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init failed with error 0x%x", err);
    delay(1000);
    ESP.restart();
  }

  //drop down frame size for higher initial frame rate
  sensor_t * s = esp_camera_sensor_get();
  s->set_framesize(s, FRAMESIZE_CIF);  //UXGA|SXGA|XGA|SVGA|VGA|CIF|QVGA|HQVGA|QQVGA - set the initial frame size (resolution)
     
  WiFi.mode(WIFI_AP_STA);
  WiFi.begin(ssid, password);   

  delay(1000);

  long int StartTime=millis();
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    if ((StartTime+10000) < millis()) 
      break;   
  } 

  if (WiFi.status() == WL_CONNECTED) {   
    Serial.print("ESP IP Address: http://");
    Serial.println(WiFi.localIP());  
  }
  server.begin();          
}



void loop() {
  Feedback="";Command="";cmd="";P1="";P2="";P3="";P4="";P5="";P6="";P7="";P8="";P9="";
  ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;
  
  WiFiClient client = server.available();

  if (client) { 
    String currentLine = "";

    while (client.connected()) {
      if (client.available()) {
        char c = client.read();             
        
        getCommand(c);
                
        if (c == '\n') {
          if (currentLine.length() == 0) {    
            
            if (cmd=="colorDetect") {
              camera_fb_t * fb = NULL;
              fb = esp_camera_fb_get();  
              if(!fb) {
                Serial.println("Camera capture failed");
                delay(1000);
                ESP.restart();
              }
              //ANN:1
              client.println("HTTP/1.1 200 OK");
              client.println("Access-Control-Allow-Origin: *");              
              client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
              client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
              client.println("Content-Type: image/jpeg");
              client.println("Content-Disposition: form-data; name=\"imageFile\"; filename=\"picture.jpg\""); 
              client.println("Content-Length: " + String(fb->len));             
              client.println("Connection: close");
              client.println();
              
              uint8_t *fbBuf = fb->buf;
              size_t fbLen = fb->len;
              for (size_t n=0;n<fbLen;n=n+1024) {
                if (n+1024<fbLen) {
                  client.write(fbBuf, 1024);
                  fbBuf += 1024;
                }
                else if (fbLen%1024>0) {
                  size_t remainder = fbLen%1024;
                  client.write(fbBuf, remainder);
                }
              }    
              esp_camera_fb_return(fb);                        
            }
            else {
              //ANN:1
              client.println("HTTP/1.1 200 OK");
              client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
              client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
              client.println("Content-Type: text/html; charset=utf-8");
              client.println("Access-Control-Allow-Origin: *");
              client.println("Connection: close");
              client.println();           
              String Data="";
              if (cmd!="")
                Data = Feedback;
              else {
                Data = String((const char *)INDEX_HTML);
              }
              int Index;
              for (Index = 0; Index < Data.length(); Index = Index+1000) {
                client.print(Data.substring(Index, Index+1000));
              }        
              client.println();
            }
                        
            Feedback="";
            break;
          } else {
            currentLine = "";
          }
        } 
        else if (c != '\r') {
          currentLine += c;
        }
        if ((currentLine.indexOf("/?")!=-1)&&(currentLine.indexOf(" HTTP")!=-1)) {
          if (Command.indexOf("stop")!=-1) {  
            client.println();
            client.println();
            client.stop();
          }
          currentLine="";
          Feedback="";
          ExecuteCommand();
        }
      }
    }
    delay(1);
    client.stop();
  }
}

void getCommand(char c){
  if (c=='?') ReceiveState=1;
  if ((c==' ')||(c=='\r')||(c=='\n')) ReceiveState=0;
  
  if (ReceiveState==1) {
    Command=Command+String(c);    
    if (c=='=') cmdState=0;
    if (c==';') strState++;
    if ((cmdState==1)&&((c!='?')||(questionstate==1))) cmd=cmd+String(c);
    if ((cmdState==0)&&(strState==1)&&((c!='=')||(equalstate==1))) P1=P1+String(c);
    if ((cmdState==0)&&(strState==2)&&(c!=';')) P2=P2+String(c);
    if ((cmdState==0)&&(strState==3)&&(c!=';')) P3=P3+String(c);
    if ((cmdState==0)&&(strState==4)&&(c!=';')) P4=P4+String(c);
    if ((cmdState==0)&&(strState==5)&&(c!=';')) P5=P5+String(c);
    if ((cmdState==0)&&(strState==6)&&(c!=';')) P6=P6+String(c);
    if ((cmdState==0)&&(strState==7)&&(c!=';')) P7=P7+String(c);
    if ((cmdState==0)&&(strState==8)&&(c!=';')) P8=P8+String(c);
    if ((cmdState==0)&&(strState>=9)&&((c!=';')||(semicolonstate==1))) P9=P9+String(c);   
    if (c=='?') questionstate=1;
    if (c=='=') equalstate=1;
    if ((strState>=9)&&(c==';')) semicolonstate=1;
  }
}


Save that file.

index_OCV_ColorTrack.h

Then, open a new tab in the Arduino IDE as shown in the following image.

Arduino IDE Create a New Tab

Name it index_OCV_ColorTrack.h.

Arduino IDE Name a New Tab

Copy the following into that file.

/****************************
  This include file, index_OCV_ColorTrack.h, the Client, is an introduction of OpenCV.js to the ESP32 Camera environment. The Client was
  developed and written by Andrew R. Sass. Permission to reproduce the index_OCV_ColorTrack.h file is granted free of charge if this
  entire copyright notice is included in all copies of the index_OCV_ColorTrack.h file. 
  
  Complete instructions at https://RandomNerdTutorials.com/esp32-cam-opencv-js-color-detection-tracking/
*******************************/
static const char PROGMEM INDEX_HTML[] = R"rawliteral(
<!DOCTYPE html>
<html>
<head>
   <title>ESP32-CAMERA COLOR DETECTION</title>
   <meta charset="utf-8">
   <meta name="viewport" content="width=device-width,initial-scale=1">
   <!----ANN:3--->
   <script async src=" https://docs.opencv.org/master/opencv.js" type="text/javascript"></script>
</head>
<style>
html {
    font-family: Arial, Helvetica, sans-serif;
    }
body { 
    background-color: #F7F7F2;
    margin: 0px;
}
h1 {
    font-size: 1.6rem;
    color:white;
    text-align: center;
}
.topnav {
    overflow: hidden;
    background-color: #0A1128;
}
.main-controls{
  padding-top: 5px;
}
h2 {
    color: #0A1128;
    font-size: 1rem;
}    
.section {
    margin: 2px;
    padding: 10px;
}
.column{
    float: left;
    width: 50%
}
table {
    margin: 0;
    width: 90%;
    border-collapse: collapse;
}
th{
    text-align: center;
}
.row{
    margin-right:50px;
    margin-left:50px;
}

#colorDetect{ 
    border: none;
    color: #FEFCFB;
    background-color: #0A1128;
    padding: 15px;
    text-align: center;
    display: inline-block;
    font-size: 16px;
    border-radius: 4px;
}
#restart{
    border: none;
    color: #FEFCFB;
    background-color: #7B0828;
    padding: 15px;
    text-align: center;
    display: inline-block;
    font-size: 16px;
    border-radius: 4px;  
}
button{
    border: none;
    color: #FEFCFB;
    background-color: #0A1128;
    padding: 10px;
    text-align: center;
    display: inline-block;
    border-radius: 4px;    
}

</style>
<body>
    <div class="topnav">
        <h1>ESP32-CAM Color Detection and Tracking</h1>
    </div>
    <div class="main-controls">
        <table>
            <tr>
                <td><center><input type="button" id="colorDetect" value="COLOR DETECTION"></center></td> 
                <td><center><input type="button" id="restart" value="RESET BOARD"></center></td> 
            </tr>      
        </table>
    </div>
<div class="container">
  <div class = "row"> 
    <div class = "column"> 
        <div class="section">
            <div class ="video-container">
                <h2>Video Streaming</h2>   
                <center><img id="ShowImage" src="" style="display:none"></center>
                <center><canvas id="canvas" style="display:none"></canvas></center>
            </div>
        </div>
        <div class="section">
            <table>
              <tr>
                  <td>Quality</td>
                  <td><input type="range" id="quality" min="10" max="63" value="10"></td>
              </tr>
              <tr>
                  <td>Brightness</td>
                  <td><input type="range" id="brightness" min="-2" max="2" value="0"></td>
              </tr>
              <tr>
                  <td>Contrast</td>
                  <td><input type="range" id="contrast" min="-2" max="2" value="0"></td>
              </tr>
            </table>
        </div>
 
      <!-----ANN:5---->
      <div class="section">
        <h2>RGB Color Trackbars</h2>
        <table>
            <tr>
                <td>R min:&#160;&#160;&#160;<span id="RMINdemo"></span></td>
                <td><input type="range" id="rmin" min="0" max="255" value="0" class = "slider"></td>
                <td>R max:&#160;&#160;&#160;<span id="RMAXdemo"></span></td>
                <td><input type="range" id="rmax" min="0" max="255" value="50" class = "slider"></td>
            </tr>
            <tr>
                <td>G min:&#160;&#160;&#160;<span id="GMINdemo"></span></td>
                <td><input type="range" id="gmin" min="0" max="255" value="0" class = "slider"></td>
                <td>G max:&#160;&#160;&#160;<span id="GMAXdemo"></span></td>
                <td><input type="range" id="gmax" min="0" max="255" value="50" class = "slider"></td>
            </tr>
            <tr>
                <td>B min:&#160;&#160;&#160;<span id ="BMINdemo"></span></td>
                <td><input type="range" id="bmin" min="0" max="255" value="0" class = "slider">  </td>
                <td>B max:&#160;&#160;&#160;<span id="BMAXdemo"></span></td>
                <td> <input type="range" id="bmax" min="0" max="255" value="50" class = "slider">   </td>
            </tr>
        </table>
      </div>

      <div class="section">
        <h2>Threshold Minimum-Binary Image</h2>
        <table>
            <tr>
                <td>Minimum Threshold:&#160;&#160;&#160;<span id="THRESH_MINdemo"></span></td>
                <td><input type="range" id="thresh_min" min="0" max="255" value="120" class = "slider">  </td>
            </tr>
        </table>
    </div>
     <!----ANN:9---> 
     <div class="section">
        <h2>Color Probe</h2>
        <table>
            <tr>
                <td>X probe:&#160;&#160;&#160;<span id="X_PROBEdemo"></span></td>
                <td><input type="range" id="x_probe" min="0" max="400" value="200" class = "slider"></td>
                <td>Y probe:&#160;&#160;&#160;<span id="Y_PROBEdemo"></span></td>
                <td> <input type="range" id="y_probe" min="0" max="296" value="148" class = "slider"></td>
            </tr>
        </table>
      </div>
            
    </div>   <!------endfirstcolumn---------------->   
    
    <div class = "column">      
        <div class="section">
            <h2>Image Mask</h2>
            <canvas id="imageMask"></canvas>
        </div>
        <div class="section">
            <h2>Image Canvas</h2>
            <canvas id="imageCanvas"></canvas>
        </div>
        <div class="section">
            <table>
                <tr>
                    <td><button type="button" id="invertButton" class="btn btn-primary">INVERT</button></td>
                    <td><button type="button" id="contourButton" class="btn btn-primary">SHOW CONTOUR</button></td>
                    <td><button type="button" id="trackButton" class="btn btn-primary">TRACKING</button></td>
                </tr>
                <tr>
                    <td>Invert: <span id="INVERTdemo"></span></td>
                    <td>Contour: <span id="CONTOURdemo"></span></td>
                    <td>Track: <span id="TRACKdemo"></span>
                    </td>
                </tr>
            </table>
        </div>
        <div class="section">
            <table>
                <tr>
                    <td><strong>XCM:</strong> <span id="XCMdemo"></span></td>
                    <td><strong>YCM:</strong> <span id="YCMdemo"></span></td>
                </tr>
            </table>
        </div>
        
        <div class="section">
            <canvas id="textCanvas" width="480" height="180" style= "border: 1px solid black;"></canvas>
            <iframe id="ifr" style="display:none"></iframe>
            <div id="message"></div>  
        </div>             
        </div>  <!------end2ndcolumn------------------------>
  </div>   <!-----endrow---------------------->   
</div>   <!------endcontainer-------------->
 <!--------------- </body>----------------->
 <!----------------</html>----------------->
<div class="modal"></div>
<script>
var colorDetect = document.getElementById('colorDetect');
var ShowImage = document.getElementById('ShowImage');
var canvas = document.getElementById("canvas");
var context = canvas.getContext("2d");
var imageMask = document.getElementById("imageMask");
var imageMaskContext = imageMask.getContext("2d"); 
var imageCanvas = document.getElementById("imageCanvas");
var imageContext = imageCanvas.getContext("2d"); 
var txtcanvas = document.getElementById("textCanvas");
var ctx = txtcanvas.getContext("2d");  
var message = document.getElementById('message');
var ifr = document.getElementById('ifr');
var myTimer;
var restartCount=0;
const modelPath = 'https://ruisantosdotme.github.io/face-api.js/weights/';
let currentStream;
let displaySize = { width:400, height: 296 }
let faceDetection;

let b_tracker = false;
let x_cm = 0;
let y_cm = 0;

let b_invert = false;

let b_contour = false;

var RMAX=50;
var RMIN=0;
var GMAX=50;
var GMIN=0;
var BMAX=50;
var BMIN=0;
var THRESH_MIN=120;
var X_PROBE=200;
var Y_PROBE=196;
var R=0;
var G=0;
var B=0;
var A=0;


colorDetect.onclick = function (event) {
  clearInterval(myTimer);  
  myTimer = setInterval(function(){error_handle();},5000);
  ShowImage.src=location.origin+'/?colorDetect='+Math.random();
}

//ANN:READY
var Module = {
  onRuntimeInitialized(){onOpenCvReady();}
}

function onOpenCvReady(){
  //alert("onOpenCvReady");
  console.log("OpenCV IS READY!!!");
  drawReadyText();  
  document.body.classList.remove("loading");
}

    
function error_handle() {
  restartCount++;
  clearInterval(myTimer);
  if (restartCount<=2) {
    message.innerHTML = "Get still error. <br>Restart ESP32-CAM "+restartCount+" times.";
    myTimer = setInterval(function(){colorDetect.click();},10000);
    ifr.src = document.location.origin+'?restart';
  }
  else
    message.innerHTML = "Get still error. <br>Please close the page and check ESP32-CAM.";
}    
colorDetect.style.display = "block";
ShowImage.onload = function (event) {
  //alert("SHOW IMAGE");
  console.log("SHOW iMAGE");
  clearInterval(myTimer);
  restartCount=0;      
  canvas.setAttribute("width", ShowImage.width);
  canvas.setAttribute("height", ShowImage.height);
  canvas.style.display = "block";
  imageCanvas.setAttribute("width", ShowImage.width);
  imageCanvas.setAttribute("height", ShowImage.height);
  imageCanvas.style.display = "block";

  imageMask.setAttribute("width", ShowImage.width);
  imageMask.setAttribute("height", ShowImage.height);
  imageMask.style.display = "block";      
      
  context.drawImage(ShowImage,0,0,ShowImage.width,ShowImage.height);
  
  DetectImage();        
}
restart.onclick = function (event) {
  fetch(location.origin+'/?restart=stop');
}
quality.onclick = function (event) {
  fetch(document.location.origin+'/?quality='+this.value+';stop');
} 
brightness.onclick = function (event) {
  fetch(document.location.origin+'/?brightness='+this.value+';stop');
} 
contrast.onclick = function (event) {
  fetch(document.location.origin+'/?contrast='+this.value+';stop');
}                             
async function DetectImage() {
  //alert("DETECT IMAGE");
  console.log("DETECT IMAGE");

  /***************opencv********************************/
  //ANN:4
  let src = cv.imread(ShowImage);
  arows = src.rows;
  acols = src.cols;
  aarea = arows*acols;
  adepth = src.depth();
  atype = src.type();
  achannels = src.channels();
  console.log("rows = " + arows);
  console.log("cols = " + acols);
  console.log("pic area = " + aarea);
  console.log("depth = " + adepth); 
  console.log("type = " + atype); 
  console.log("channels = " + achannels);
  
  /******************COLOR DETECT******************************/

  //ANN:6
  var RMAXslider = document.getElementById("rmax");
  var RMAXoutput = document.getElementById("RMAXdemo");
  RMAXoutput.innerHTML = RMAXslider.value;
  RMAXslider.oninput = function(){
  RMAXoutput.innerHTML = this.value;
  RMAX = parseInt(RMAXoutput.innerHTML,10);
  console.log("RMAX=" + RMAX);
  }

  console.log("RMAX=" + RMAX);

  var RMINslider = document.getElementById("rmin");
  var RMINoutput = document.getElementById("RMINdemo");
  RMINoutput.innerHTML = RMINslider.value;
  RMINslider.oninput = function(){
    RMINoutput.innerHTML = this.value;
    RMIN = parseInt(RMINoutput.innerHTML,10);
    console.log("RMIN=" + RMIN);
  }
  console.log("RMIN=" + RMIN);

  var GMAXslider = document.getElementById("gmax");
  var GMAXoutput = document.getElementById("GMAXdemo");
  GMAXoutput.innerHTML = GMAXslider.value;
  GMAXslider.oninput = function(){
    GMAXoutput.innerHTML = this.value;
    GMAX = parseInt(GMAXoutput.innerHTML,10);
  }
  console.log("GMAX=" + GMAX);

  var GMINslider = document.getElementById("gmin");
  var GMINoutput = document.getElementById("GMINdemo");
  GMINoutput.innerHTML = GMINslider.value;
  GMINslider.oninput = function(){
    GMINoutput.innerHTML = this.value;
    GMIN = parseInt(GMINoutput.innerHTML,10);
  }
  console.log("GMIN=" + GMIN);

  var BMAXslider = document.getElementById("bmax");
  var BMAXoutput = document.getElementById("BMAXdemo");
  BMAXoutput.innerHTML = BMAXslider.value;
  BMAXslider.oninput = function(){
    BMAXoutput.innerHTML = this.value;
    BMAX = parseInt(BMAXoutput.innerHTML,10);
  }
  console.log("BMAX=" + BMAX);

  var BMINslider = document.getElementById("bmin");
  var BMINoutput = document.getElementById("BMINdemo");
  BMINoutput.innerHTML = BMINslider.value;
  BMINslider.oninput = function(){
  BMINoutput.innerHTML = this.value;
  BMIN = parseInt(BMINoutput.innerHTML,10);
  }
  console.log("BMIN=" + BMIN);



  var THRESH_MINslider = document.getElementById("thresh_min");
  var THRESH_MINoutput = document.getElementById("THRESH_MINdemo");
  THRESH_MINoutput.innerHTML = THRESH_MINslider.value;
  THRESH_MINslider.oninput = function(){
  THRESH_MINoutput.innerHTML = this.value;
  THRESH_MIN = parseInt(THRESH_MINoutput.innerHTML,10);
  }
  console.log("THRESHOLD MIN=" + THRESH_MIN);

  //ANN:9A
  var X_PROBEslider = document.getElementById("x_probe");
  var X_PROBEoutput = document.getElementById("X_PROBEdemo");
  X_PROBEoutput.innerHTML = X_PROBEslider.value;
  X_PROBEslider.oninput = function(){
  X_PROBEoutput.innerHTML = this.value;
  X_PROBE = parseInt(X_PROBEoutput.innerHTML,10);
  }
  console.log("X_PROBE=" + X_PROBE); 

  var Y_PROBEslider = document.getElementById("y_probe");
  var Y_PROBEoutput = document.getElementById("Y_PROBEdemo");
  Y_PROBEoutput.innerHTML = Y_PROBEslider.value;
  Y_PROBEslider.oninput = function(){
  Y_PROBEoutput.innerHTML = this.value;
  Y_PROBE = parseInt(Y_PROBEoutput.innerHTML,10);
  }
  console.log("Y_PROBE=" + Y_PROBE); 


  document.getElementById('trackButton').onclick = function(){
    b_tracker = (true && !b_tracker)  
    console.log("TRACKER = " + b_tracker );
    var TRACKoutput = document.getElementById("TRACKdemo");
    TRACKoutput.innerHTML = b_tracker;
    //var XCMoutput = document.getElementById("XCMdemo");
    //XCMoutput.innerHTML = x_cm;
 
  }  

  document.getElementById('invertButton').onclick = function(){
    b_invert = (true && !b_invert)  
    console.log("TRACKER = " + b_invert );
    var INVERToutput = document.getElementById("INVERTdemo");
    INVERToutput.innerHTML = b_invert;
  }  
/**/
  document.getElementById('contourButton').onclick = function(){
    b_contour = (true && !b_contour)  
    console.log("TRACKER = " + b_contour );
    var CONTOURoutput = document.getElementById("CONTOURdemo");
    CONTOURoutput.innerHTML = b_contour;
  } 
/**/ 

  let tracker = 0;
  
  var TRACKoutput = document.getElementById("TRACKdemo");
  TRACKoutput.innerHTML = b_tracker;
  var XCMoutput = document.getElementById("XCMdemo");
  var YCMoutput = document.getElementById("YCMdemo");

  XCMoutput.innerHTML = 0;
  YCMoutput.innerHTML = 0; 

  var INVERToutput = document.getElementById("INVERTdemo");
  INVERToutput.innerHTML = b_invert;  

  var CONTOURoutput = document.getElementById("CONTOURdemo");
  CONTOURoutput.innerHTML = b_contour;   

  //ANN:8
  let M00Array = [0,];
  let orig = new cv.Mat();
  let mask = new cv.Mat();
  let mask1 = new cv.Mat();
  let mask2 = new cv.Mat();
  let contours = new cv.MatVector();
  let hierarchy = new cv.Mat();
  let rgbaPlanes = new cv.MatVector();
    
  let color = new cv.Scalar(0,0,0);

  clear_canvas();


    
  orig = cv.imread(ShowImage);
  cv.split(orig,rgbaPlanes);  //SPLIT
  let BP = rgbaPlanes.get(2);  // SELECTED COLOR PLANE
  let GP = rgbaPlanes.get(1);
  let RP = rgbaPlanes.get(0);
  cv.merge(rgbaPlanes,orig);
   
    
              //   BLK    BLU   GRN   RED
  let row = Y_PROBE //180//275 //225 //150 //130    
  let col = X_PROBE //100//10 //100 //200 //300
  drawColRowText(acols,arows);


  console.log("ISCONTINUOUS = " + orig.isContinuous());

  //ANN:9C
  R = src.data[row * src.cols * src.channels() + col * src.channels()];
  G = src.data[row * src.cols * src.channels() + col * src.channels() + 1];
  B = src.data[row * src.cols * src.channels() + col * src.channels() + 2];
  A = src.data[row * src.cols * src.channels() + col * src.channels() + 3];
  console.log("RDATA = " + R);
  console.log("GDATA = " + G);
  console.log("BDATA = " + B);
  console.log("ADATA = " + A);

  drawRGB_PROBE_Text();
  
   
    
  //ANN:9b
  //*************draw probe point*********************
  let point4 = new cv.Point(col,row);
  cv.circle(src,point4,5,[255,255,255,255],2,cv.LINE_AA,0);
  //***********end draw probe point*********************

  //ANN:7
  let high = new cv.Mat(src.rows,src.cols,src.type(),[RMAX,GMAX,BMAX,255]);
  let low = new cv.Mat(src.rows,src.cols,src.type(),[RMIN,GMIN,BMIN,0]);

  cv.inRange(src,low,high,mask1);
  //inRange(source image, lower limit, higher limit, destination image)
    
  cv.threshold(mask1,mask,THRESH_MIN,255,cv.THRESH_BINARY);
  //threshold(source image,destination image,threshold,255,threshold method);

  //ANN:9
  if(b_invert==true){
     cv.bitwise_not(mask,mask2);
  }
/********************start contours******************************************/
  //ANN:10
  if(b_tracker == true){
  try{
   if(b_invert==false){
    //ANN:11   
    cv.findContours(mask,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
    //findContours(source image, array of contours found, hierarchy of contours
        // if contours are inside other contours, method of contour data retrieval,
        //algorithm method)
   }
   else{
    cv.findContours(mask2,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
   }
    console.log("CONTOUR_SIZE = " + contours.size());

    //draw contours
    if(b_contour==true){
     for(let i = 0; i < contours.size(); i++){
        cv.drawContours(src,contours,i,[0,0,0,255],2,cv.LINE_8,hierarchy,100)
     }
    }

    //ANN:12
    let cnt;
    let Moments;
    let M00;
    let M10;
    //let x_cm;
    //let y_cm;
    
    //ANN:13
    for(let k = 0; k < contours.size(); k++){
        cnt = contours.get(k); 
        Moments = cv.moments(cnt,false);
        M00Array[k] = Moments.m00;
       // cnt.delete();
    }

    //ANN13A
    let max_area_arg = MaxAreaArg(M00Array);
    console.log("MAXAREAARG = "+max_area_arg);

    //let TestArray = [0,0,0,15,4,15,2];
    //let TestArray0 = [];
    //let max_test_area_arg = MaxAreaArg(TestArray0);
    //console.log("MAXTESTAREAARG = "+max_test_area_arg);



    let ArgMaxArea = MaxAreaArg(M00Array);
    if(ArgMaxArea >= 0){
    cnt = contours.get(MaxAreaArg(M00Array));  //use the contour with biggest MOO
    //cnt = contours.get(54);
    Moments = cv.moments(cnt,false);
    M00 = Moments.m00;
    M10 = Moments.m10;
    M01 = Moments.m01;
    x_cm = M10/M00;    // 75 for circle_9.jpg
    y_cm = M01/M00;    // 41 for circle_9.jpg

    XCMoutput.innerHTML = Math.round(x_cm);
    YCMoutput.innerHTML = Math.round(y_cm);

    console.log("M00 = "+M00);  
    console.log("XCM = "+Math.round(x_cm));
    console.log("YCM = "+Math.round(y_cm)); 

    //fetch(document.location.origin+'/?xcm='+Math.round(x_cm)+';stop');
    fetch(document.location.origin+'/?cm='+Math.round(x_cm)+';'+Math.round(y_cm)+';stop');

    console.log("M00ARRAY = " + M00Array);

    //ANN:14   
    
    //**************min area bounding rect********************
    //let rotatedRect=cv.minAreaRect(cnt);
    //let vertices = cv.RotatedRect.points(rotatedRect);

    //for(let j=0;j<4;j++){
    //    cv.line(src,vertices[j],
    //        vertices[(j+1)%4],[0,0,255,255],2,cv.LINE_AA,0);
    //}
    //***************end min area bounding rect*************************************


    //***************bounding rect***************************
    let rect = cv.boundingRect(cnt);
    let point1 = new cv.Point(rect.x,rect.y);
    let point2 = new cv.Point(rect.x+rect.width,rect.y+rect.height);

    cv.rectangle(src,point1,point2,[0,0,255,255],2,cv.LINE_AA,0);
    //*************end bounding rect***************************


    //*************draw center point*********************
    let point3 = new cv.Point(x_cm,y_cm);
    cv.circle(src,point3,2,[0,0,255,255],2,cv.LINE_AA,0);
    //***********end draw center point*********************

    }//end if(ArgMaxArea >= 0)
    else{
      if(ArgMaxArea==-1){ 
        console.log("ZERO ARRAY LENGTH");
      }
      else{              //ArgMaxArea=-2
        console.log("DUPLICATE MAX ARRAY-ELEMENT");
      }
    }




    cnt.delete();
/******************end contours  note cnt line one up*******************************************/
   drawXCM_YCM_Text();

  }//end try
  catch{
    console.log("ERROR TRACKER NO CONTOUR");
    clear_canvas();
    drawErrorTracking_Text();
  }
    
  }//end b_tracking if statement
  else{
      XCMoutput.innerHTML = 0;
      YCMoutput.innerHTML = 0;
  }    

  if(b_invert==false){
     cv.imshow('imageMask', mask);
  }
  else{
     cv.imshow('imageMask', mask2);
  }
  //cv.imshow('imageMask', R);
  cv.imshow('imageCanvas', src);

  //ANN:8A
  src.delete();
  high.delete();
  low.delete();
  orig.delete();
  mask1.delete();
  mask2.delete();
  mask.delete();
  contours.delete();
  hierarchy.delete();
  //cnt.delete();
  RP.delete();
    
  


 /********************END COLOR DETECT****************************/
  
/***************end opencv******************************/
      

 setTimeout(function(){colorDetect.click();},500);
  
}//end detectimage 


function MaxAreaArg(arr){
    if (arr.length == 0) {
        return -1;
    }

    var max = arr[0];
    var maxIndex = 0;
    var dupIndexCount = 0; //duplicate max elements?

    if(arr[0] >= .90*aarea){
        max = 0;
    }

    for (var i = 1; i < arr.length; i++) {
        if (arr[i] > max && arr[i] < .99*aarea) {
            maxIndex = i;
            max = arr[i];
            dupIndexCount = 0;
        }
        else if(arr[i]==max && arr[i]!=0){
            dupIndexCount++;
        }
    }

    if(dupIndexCount==0){
        return maxIndex;
    }

    else{
        return -2;
    }        
}//end MaxAreaArg    



function clear_canvas(){
    ctx.clearRect(0,0,txtcanvas.width,txtcanvas.height);
    ctx.rect(0,0,txtcanvas.width,txtcanvas.height);
    ctx.fillStyle="red";
    ctx.fill();
}

function drawReadyText(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('OpenCV.JS READY',txtcanvas.width/4,txtcanvas.height/10);
}          

function drawColRowText(x,y){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('ImageCols='+x,0,txtcanvas.height/10);
    ctx.fillText('ImageRows='+y,txtcanvas.width/2,txtcanvas.height/10);
} 

function drawRGB_PROBE_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('Rp='+R,0,2*txtcanvas.height/10);
    ctx.fillText('Gp='+G,txtcanvas.width/4,2*txtcanvas.height/10);
    ctx.fillText('Bp='+B,txtcanvas.width/2,2*txtcanvas.height/10);
    ctx.fillText('Ap='+A,3*txtcanvas.width/4,2*txtcanvas.height/10);
}

function drawXCM_YCM_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('XCM='+Math.round(x_cm),0,3*txtcanvas.height/10); 
    ctx.fillText('YCM='+Math.round(y_cm),txtcanvas.width/4,3*txtcanvas.height/10);    
}

function drawErrorTracking_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('ERROR TRACKING-NO CONTOUR',0,3*txtcanvas.height/10);
}          
         
  </script> 
</body>
</html>  
)rawliteral";


Save the file.

Network Credentials

For the program to work properly, you need to insert your network credentials in the following variables in the OCV_ColorTrack_P.ino file:

const char* ssid = "REPLACE_WITH_YOUR_SSID";
const char* password = "REPLACE_WITH_YOUR_PASSWORD";

Camera Pin Assignment

By default, the code uses the pin assignment for the ESP32-CAM AI-Thinker module.

#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

If you’re using a different camera board, don’t forget to insert the right pin assignment. You can go to the following article to find the pinout for your board:

How the Code Works

Continue reading to learn how the code works, or skip to the next section.

To make the code easier to read and understand, ANNOTATIONS have been added as comments throughout the program.

For example, the camera pin assignment for the ESP32-CAM is listed below the ANN:0 annotation in the .ino file. You can locate ANN:0 (or any other annotation) with the Edit > Find command of the Arduino IDE.

#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

Server Sketch

The server program, OCV_ColorTrack_P.ino, is taken from ESP32-CAM Projects, Module 5 by Rui Santos and Sara Santos. It has a standard ESP32 Camera setup() that configures the camera and connects to your Wi-Fi network with the credentials you provide.

Annotation 1 (ANN:1)

However, what is not standard in this server program are instructions of vital importance that enable cross-origin access control (CORS). See the code at ANN:1.

//ANN:1
client.println("HTTP/1.1 200 OK");
client.println("Access-Control-Allow-Origin: *");              
client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
client.println("Content-Type: image/jpeg");
client.println("Content-Disposition: form-data; name=\"imageFile\"; filename=\"picture.jpg\""); 
client.println("Content-Length: " + String(fb->len));             
client.println("Connection: close");
client.println();

These headers instruct the browser to let the camera image and OpenCV.js, which come from different origins, work together in the page. Without them, the Chrome browser throws cross-origin errors.
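For reference, this is the client-side request that those headers answer. Clicking the COLOR DETECTION button points the hidden ShowImage element at the /?colorDetect endpoint served by the ESP32 (the random query value helps avoid browser caching, and a 5-second timer calls error_handle() if no image arrives):

colorDetect.onclick = function (event) {
  clearInterval(myTimer);
  myTimer = setInterval(function(){error_handle();},5000);
  ShowImage.src=location.origin+'/?colorDetect='+Math.random();
}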

Annotation 2 (ANN:2)

The server loop() monitors client requests, parses them character by character with getCommand(), and acts on them via ExecuteCommand(), found at ANN:2.

//ANN:2
void ExecuteCommand() {
  if (cmd!="colorDetect") {  //Omit printout
    //Serial.println("cmd= "+cmd+" ,P1= "+P1+" ,P2= "+P2+" ,P3= "+P3+" ,P4= "+P4+" ,P5= "+P5+" ,P6= "+P6+" ,P7= "+P7+" ,P8= "+P8+" ,P9= "+P9);
    //Serial.println("");
  }
  
  if (cmd=="resetwifi") {
    WiFi.begin(P1.c_str(), P2.c_str());
    Serial.print("Connecting to ");
    Serial.println(P1);
    long int StartTime=millis();
    while (WiFi.status() != WL_CONNECTED) 
    {
        delay(500);
        if ((StartTime+5000) < millis()) break;
    } 
    Serial.println("");
    Serial.println("STAIP: "+WiFi.localIP().toString());
    Feedback="STAIP: "+WiFi.localIP().toString();
  }    
  else if (cmd=="restart") {
    ESP.restart();
  }
  else if (cmd=="cm"){
    int XcmVal = P1.toInt();
    int YcmVal = P2.toInt();
    Serial.println("cmd= "+cmd+" ,VALXCM= "+XcmVal);
    Serial.println("cmd= "+cmd+" ,VALYCM= "+YcmVal);   
  }
  else if (cmd=="quality") { 
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_quality(s, val);
  }
  else if (cmd=="contrast") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_contrast(s, val);
  }
  else if (cmd=="brightness") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt();  
    s->set_brightness(s, val);  
  }   
  else {
    Feedback="Command is not defined.";
  }
  if (Feedback=="") {
    Feedback=Command;
  }
}

The original program uses this function to receive and execute the client’s slider commands, which control image characteristics (quality, brightness, contrast) and are transmitted by the client via a fetch() instruction.

In our current program, the same mechanism is also used to communicate the “center of mass” of the color target detected by the client to the ESP32 server, a feature vital to robotics applications.

Other than extracting the x and y center-of-mass coordinates and printing them (the cm command), there are no other changes to the server program.
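For example, if the client detects a centroid at roughly x = 120 and y = 80 (values chosen here purely for illustration), it issues the following request, which getCommand() splits into cmd, P1, and P2 before ExecuteCommand() prints the values to the Serial Monitor:

// Sent by the browser (see ANN:13A in index_OCV_ColorTrack.h); 120 and 80 are illustrative values
fetch(document.location.origin + '/?cm=' + 120 + ';' + 80 + ';stop');
// Resulting request: GET /?cm=120;80;stop
// Parsed by getCommand(): cmd = "cm", P1 = "120", P2 = "80"
// The trailing "stop" tells the server to close the connection right away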

Client Sketch (OpenCV.js)

Other than the image-characteristics slider routines, their data transmission to the server via fetch(), and the error routine from the original client program in the eBook referenced above, the client program here is new. It contains code devoted to applying OpenCV.js to the ESP32 Camera image transmitted to the browser (as mentioned previously, a fetch() is used to transmit the color target data to the server).

The client code is sprinkled liberally with console.log() instructions that let you see intermediate results. In Chrome, the console is opened by pressing CTRL+SHIFT+J.

Annotation 3 (ANN:3)

ANN:3 loads the latest version of OpenCV.js into the web page.

<script async src=" https://docs.opencv.org/master/opencv.js" type="text/javascript"></script>

ANN:READY

ANN:READY marks the Module callback that signals that OpenCV.js has finished initializing. Once initialization is complete, the COLOR DETECTION button can be clicked. Faster computers may not need this check, but it is included for the sake of completeness.
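Here is the corresponding code in index_OCV_ColorTrack.h:

//ANN:READY
var Module = {
  onRuntimeInitialized(){onOpenCvReady();}
}

function onOpenCvReady(){
  console.log("OpenCV IS READY!!!");
  drawReadyText();
  document.body.classList.remove("loading");
}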

OpenCVJS ready ESP32-CAM Web Server

Annotation 4 (ANN:4)

The screenshot of the client program running on Chrome shows two columns, as created by the HTML section of the code. The left column shows the original camera image, which is transmitted at approximately 1 fps. This image, with an ID of ShowImage, is the source image for the OpenCV routine in the program.

ANN:4 marks the creation of src and the logging of its characteristics: rows, cols, depth, type, and channels.

//ANN:4
let src = cv.imread(ShowImage);
arows = src.rows;
acols = src.cols;
aarea = arows*acols;
adepth = src.depth();
atype = src.type();
achannels = src.channels();
console.log("rows = " + arows);
console.log("cols = " + acols);
console.log("pic area = " + aarea);
console.log("depth = " + adepth); 
console.log("type = " + atype); 
console.log("channels = " + achannels);

RGB Color Trackbars

Below the source image are the original three image-characteristics sliders (Quality, Brightness, and Contrast), followed by the RGB Color Trackbars.

ESP32-CAM Web Server Color Tracking RGB Trackbars OpenCVJS

These are used to set limits on the range of colors allowed in the “processed” image in the CV application. The code for the trackbars is found at ANN:5 and ANN:6.

<!-----ANN:5---->
<div class="section">
<h2>RGB Color Trackbars</h2>
<table>
  <tr>
    <td>R min:&#160;&#160;&#160;<span id="RMINdemo"></span></td>
    <td><input type="range" id="rmin" min="0" max="255" value="0" class = "slider"></td>
    <td>R max:&#160;&#160;&#160;<span id="RMAXdemo"></span></td>
    <td><input type="range" id="rmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
  <tr>
    <td>G min:&#160;&#160;&#160;<span id="GMINdemo"></span></td>
    <td><input type="range" id="gmin" min="0" max="255" value="0" class = "slider"></td>
    <td>G max:&#160;&#160;&#160;<span id="GMAXdemo"></span></td>
    <td><input type="range" id="gmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
  <tr>
    <td>B min:&#160;&#160;&#160;<span id ="BMINdemo"></span></td>
    <td><input type="range" id="bmin" min="0" max="255" value="0" class = "slider"></td>
    <td>B max:<span id="BMAXdemo"></span></td>
    <td> <input type="range" id="bmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
</table>
</div>
//ANN:6
var RMAXslider = document.getElementById("rmax");
var RMAXoutput = document.getElementById("RMAXdemo");
RMAXoutput.innerHTML = RMAXslider.value;
RMAXslider.oninput = function() {
  RMAXoutput.innerHTML = this.value;
  RMAX = parseInt(RMAXoutput.innerHTML,10);
  console.log("RMAX=" + RMAX);
}
console.log("RMAX=" + RMAX);

var RMINslider = document.getElementById("rmin");
var RMINoutput = document.getElementById("RMINdemo");
RMINoutput.innerHTML = RMINslider.value;
RMINslider.oninput = function(){
RMINoutput.innerHTML = this.value;
  RMIN = parseInt(RMINoutput.innerHTML,10);
  console.log("RMIN=" + RMIN);
}
console.log("RMIN=" + RMIN);

var GMAXslider = document.getElementById("gmax");
var GMAXoutput = document.getElementById("GMAXdemo");
GMAXoutput.innerHTML = GMAXslider.value;
GMAXslider.oninput = function(){
  GMAXoutput.innerHTML = this.value;
  GMAX = parseInt(GMAXoutput.innerHTML,10);
}
console.log("GMAX=" + GMAX);

var GMINslider = document.getElementById("gmin");
var GMINoutput = document.getElementById("GMINdemo");
GMINoutput.innerHTML = GMINslider.value;
GMINslider.oninput = function(){
  GMINoutput.innerHTML = this.value;
  GMIN = parseInt(GMINoutput.innerHTML,10);
}
console.log("GMIN=" + GMIN);

var BMAXslider = document.getElementById("bmax");
var BMAXoutput = document.getElementById("BMAXdemo");
BMAXoutput.innerHTML = BMAXslider.value;
BMAXslider.oninput = function(){
  BMAXoutput.innerHTML = this.value;
  BMAX = parseInt(BMAXoutput.innerHTML,10);
}
console.log("BMAX=" + BMAX);

var BMINslider = document.getElementById("bmin");
var BMINoutput = document.getElementById("BMINdemo");
BMINoutput.innerHTML = BMINslider.value;
BMINslider.oninput = function(){
  BMINoutput.innerHTML = this.value;
  BMIN = parseInt(BMINoutput.innerHTML,10);
}
console.log("BMIN=" + BMIN);

The maximum and minimum values of red, green, and blue (RGB) are applied to the OpenCV inRange() function at ANN:7.

let high = new cv.Mat(src.rows,src.cols,src.type(),[RMAX,GMAX,BMAX,255]);
let low = new cv.Mat(src.rows,src.cols,src.type(),[RMIN,GMIN,BMIN,0]);

cv.inRange(src,low,high,mask1);
//inRange(source image, lower limit, higher limit, destination image)
    
cv.threshold(mask1,mask,THRESH_MIN,255,cv.THRESH_BINARY);
//threshold(source image,destination image,threshold,255,threshold method);

The image has four channels (RGBA), where A is the level of transparency. In this tutorial, A is set to 100% opacity, namely 255. The code is based on the fact that, besides the A plane, the image has three color planes (R, G, B), with each pixel in each plane having a value between 0 and 255. The high/low limits are applied to the corresponding color plane of each pixel.
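Conceptually, the per-pixel test performed by cv.inRange() works like the sketch below (plain JavaScript written for illustration only, not code from the project):

// Returns the mask value for one RGBA pixel, given the trackbar limits
function inRangePixel(r, g, b, a) {
  var inside = (r >= RMIN && r <= RMAX) &&
               (g >= GMIN && g <= GMAX) &&
               (b >= BMIN && b <= BMAX) &&
               (a >= 0 && a <= 255);   // alpha limits are 0 and 255, so alpha always passes
  return inside ? 255 : 0;             // 255 = white in the mask, 0 = black
}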

Note that inRange() writes to a destination image (mask1) that was created earlier in the program (ANN:8).

let M00Array = [0,];
let orig = new cv.Mat();
let mask = new cv.Mat();
let mask1 = new cv.Mat();
let mask2 = new cv.Mat();
let contours = new cv.MatVector();
let hierarchy = new cv.Mat();
let rgbaPlanes = new cv.MatVector();
    
let color = new cv.Scalar(0,0,0);

clear_canvas();

orig = cv.imread(ShowImage);
cv.split(orig,rgbaPlanes);  //SPLIT
let BP = rgbaPlanes.get(2);  // SELECTED COLOR PLANE
let GP = rgbaPlanes.get(1);
let RP = rgbaPlanes.get(0);
cv.merge(rgbaPlanes,orig);

Important: every image (Mat) created in an OpenCV program has to be deleted to avoid memory leaks (ANN:8A).

src.delete();
high.delete();
low.delete();
orig.delete();
mask1.delete();
mask2.delete();
mask.delete();
contours.delete();
hierarchy.delete();
//cnt.delete();
RP.delete();

The destination image, mask1, is not displayed by the program, although it could be. It is, however, used by the threshold() function immediately following inRange().

The threshold() function examines each pixel value of the source image and sets the corresponding destination value to either 0 or 255, depending on whether the source value is below or above the threshold. The top image in the right-hand column shows this binary image.
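A minimal sketch of that per-pixel rule (again plain JavaScript for illustration, not project code):

// cv.THRESH_BINARY: values above THRESH_MIN become 255, everything else becomes 0
function thresholdPixel(value) {
  return value > THRESH_MIN ? 255 : 0;
}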

For the sake of completeness, an invert capability has been added to the binary image. When the INVERT button on the web page is clicked, the binary image is inverted (black becomes white, white becomes black) and subsequent processing is performed on the new image. The button is bistable, so a second push returns the binary image to its original state.
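The inversion itself is a single OpenCV call at ANN:9 in index_OCV_ColorTrack.h:

//ANN:9
if(b_invert==true){
   cv.bitwise_not(mask,mask2);
}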

Target-color Probe

ESP32-CAM Color Tracking Color Probe OpenCVJS

In the screenshot, a red cap is the target in an ordinary room lit by an ordinary 60 W fluorescent lamp. The lamp emits red, green, and blue; the red cap reflects all three, but principally red. The method for measuring how much of each color is reflected is described next. It allows the RGB trackbars to be set with minimal effort, and its use is strongly advised.

ESP32-CAM OpenCVJS Original Image Mask and Color Tracking
Image courtesy of Andrew R. Sass

The method involves using the Color Probe sliders. These two sliders, X Probe and Y Probe, place a small white circular probe at a desired position in the bottom image of the right-hand column. The RGB values at this probe position are measured and used to set the inRange() RGB maximums and minimums described previously.

See ANN:9, 9A, 9B, and 9C for the code associated with this probe.
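In particular, the probe readout at ANN:9C indexes directly into the flat RGBA pixel buffer of src, using the row and col values taken from the Y Probe and X Probe sliders:

//ANN:9C
R = src.data[row * src.cols * src.channels() + col * src.channels()];
G = src.data[row * src.cols * src.channels() + col * src.channels() + 1];
B = src.data[row * src.cols * src.channels() + col * src.channels() + 2];
A = src.data[row * src.cols * src.channels() + col * src.channels() + 3];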

When the optimum values for a desired target are found using the X, Y probe and set with the trackbars, the target in the binary image is white and, ideally, the remainder of the image is black, as shown in the screenshot.

This ideal can typically be realized only when lighting conditions are closely controlled. Standard indoor room lighting is acceptable. Filters can be used for optimal results but were not used here.

Here’s another example:

ESP32-CAM OpenCVJS Original Image Mask and Color Tracking Example

Tracking

Once the binary image is deemed acceptable, the TRACKING button, which is bistable, can be clicked. ANN:10 marks the beginning of the tracking routine.

//ANN:10
if(b_tracker == true){
try{
 if(b_invert==false){

Since, as mentioned above, this article is not primarily concerned with the INVERT capability, only the b_invert == false branch is of interest.

ANN:11 marks the first step in the tracking: findContours(), the OpenCV algorithm that finds the contours of all the white objects in the binary image.

//ANN:11   
    cv.findContours(mask,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
//findContours(source image, array of contours found, hierarchy of contours
// if contours are inside other contours, method of contour data retrieval,
//algorithm method)
}
else{
  cv.findContours(mask2,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
}
console.log("CONTOUR_SIZE = " + contours.size());

//draw contours
if(b_contour==true){
  for(let i = 0; i < contours.size(); i++){
    cv.drawContours(src,contours,i,[0,0,0,255],2,cv.LINE_8,hierarchy,100)
  }
}

If the TRACKING button is pressed when the binary image is fully black, the instructions that depend on the findContours() output will throw exceptions; the try-catch allows the program to continue safely, posting a message in the console log and in the text box.

contours is the output of findContours() and is an array of the contours of the white object(s) found in the binary image. contours.size() returns the number of elements in that array. The hierarchy output (contours inside other contours) is not of concern here, as there will be no white objects (outlined in black) inside other white objects.

ANN:12 marks the beginning of computing the moments of the contours found.

//ANN:12
let cnt;
let Moments;
let M00;
let M10;

M00 is the zeroth moment: the “area” enclosed by a contour. In OpenCV, it is actually the number of pixels enclosed by the contour. M10 and M01 are the x- and y-coordinate-weighted pixel sums over the enclosed region.

As usual, the origin of the x,y coordinate system is at the upper-left corner of the image: x increases horizontally to the right and y increases vertically downward. Therefore, M10/M00 and M01/M00 are the x,y coordinates of the centroid of a contour in the array.
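A quick numeric illustration (the moment values below are made up purely for illustration):

// Hypothetical moments of one contour
let M00 = 400;         // 400 pixels enclosed
let M10 = 48000;
let M01 = 30000;
let x_cm = M10 / M00;  // 120
let y_cm = M01 / M00;  // 75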

ANN:13 and ANN:13A mark finding the largest-area contour in the array of contours using the MaxAreaArg() function and transmitting the centroid (x_cm, y_cm) to the ESP32 via a fetch() request.

//ANN:13
for(let k = 0; k < contours.size(); k++){
  cnt = contours.get(k); 
  Moments = cv.moments(cnt,false);
  M00Array[k] = Moments.m00;
  // cnt.delete();
}

//ANN13A
let max_area_arg = MaxAreaArg(M00Array);
console.log("MAXAREAARG = "+max_area_arg);

//let TestArray = [0,0,0,15,4,15,2];
//let TestArray0 = [];
//let max_test_area_arg = MaxAreaArg(TestArray0);
//console.log("MAXTESTAREAARG = "+max_test_area_arg);

let ArgMaxArea = MaxAreaArg(M00Array);
if(ArgMaxArea >= 0){
cnt = contours.get(MaxAreaArg(M00Array));  //use the contour with biggest MOO
//cnt = contours.get(54);
Moments = cv.moments(cnt,false);
M00 = Moments.m00;
M10 = Moments.m10;
M01 = Moments.m01;
x_cm = M10/M00;    // 75 for circle_9.jpg
y_cm = M01/M00;    // 41 for circle_9.jpg

XCMoutput.innerHTML = Math.round(x_cm);
YCMoutput.innerHTML = Math.round(y_cm);

console.log("M00 = "+M00);  
console.log("XCM = "+Math.round(x_cm));
console.log("YCM = "+Math.round(y_cm)); 

//fetch(document.location.origin+'/?xcm='+Math.round(x_cm)+';stop');
fetch(document.location.origin+'/?cm='+Math.round(x_cm)+';'+Math.round(y_cm)+';stop');

console.log("M00ARRAY = " + M00Array);

While the program is running, the centroid coordinates are printed in the Serial Monitor as well as in the console log and in the text box on the browser screen. The ESP32 can use the centroid data for tracking purposes in robotics applications.

ANN:14 marks the code that draws a blue bounding rectangle around the largest-area contour and draws the centroid of that contour. Both can be seen in the lower image in the right-hand column of the browser screen.

//ANN:14   
    
//**************min area bounding rect********************
//let rotatedRect=cv.minAreaRect(cnt);
//let vertices = cv.RotatedRect.points(rotatedRect);

//for(let j=0;j<4;j++){
//    cv.line(src,vertices[j],
//        vertices[(j+1)%4],[0,0,255,255],2,cv.LINE_AA,0);
//}
//***************end min area bounding rect*************************************


//***************bounding rect***************************
let rect = cv.boundingRect(cnt);
let point1 = new cv.Point(rect.x,rect.y);
let point2 = new cv.Point(rect.x+rect.width,rect.y+rect.height);

cv.rectangle(src,point1,point2,[0,0,255,255],2,cv.LINE_AA,0);
//*************end bounding rect***************************

//*************draw center point*********************
let point3 = new cv.Point(x_cm,y_cm);
cv.circle(src,point3,2,[0,0,255,255],2,cv.LINE_AA,0);
//***********end draw center point*********************

}//end if(ArgMaxArea >= 0)
else{
  if(ArgMaxArea==-1){ 
    console.log("ZERO ARRAY LENGTH");
  }
  else{              //ArgMaxArea=-2
    console.log("DUPLICATE MAX ARRAY-ELEMENT");
  }
}

cnt.delete();

Below the lower image in the right-hand column, a text box contains selected outputs of the program including the X, Y Probe data, the centroid coordinates, and a catch output if an exception is generated as mentioned above.

ESP32-CAM Color Tracking Output Messages X Y coordinates

Uploading the Code

After inserting your network credentials and selecting the pinout for the camera you’re using, you can upload the code.

In the Tools menu, select the following settings before uploading the code to your board.

ESP32-CAM Wrover Upload Options
  • Board: ESP32 Wrover Module
  • Flash Mode: “QIO”
  • Partition Scheme: “Huge APP (3MB No OTA/1MB SPIFFS)”
  • Flash Frequency: “80 MHz”
  • Upload Speed: “115200”
  • Core Debug Level: “None”

Testing the Program

After uploading the code, open the Serial Monitor at a baud rate of 115200. Press the on-board RST button, and the ESP IP address should be printed. In this case, the IP address is 192.168.1.95.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js IP Address

Open a browser on your local network and type the ESP32-CAM IP address.

When the page opens, open the browser console and check that OpenCV.js loads properly. The bottom-right corner of the web page should display “OpenCV.JS READY”.
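If that message never appears, the usual cause is that the opencv.js script has not finished downloading and initializing (it is a large file served from docs.opencv.org). The standard OpenCV.js hook for detecting readiness looks like the snippet below; note that the 'opencvStatus' element id is hypothetical here, not the id used in index_OCV_ColorTrack.h:

// Standard OpenCV.js readiness hook. The 'opencvStatus' id is hypothetical.
cv['onRuntimeInitialized'] = () => {
  document.getElementById('opencvStatus').innerHTML = 'OpenCV.JS READY';
  console.log('OpenCV.js is ready');
};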

Then left-click the Color Detection button in the upper left column of the browser window.

You should see a similar window and no error messages.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js

After choosing the right settings to target a color with the Target-color Probe (as explained previously), click the Tracking button.

At the same time, the centroid coordinates of the target should be displayed on the web page as well as on the ESP32-CAM Serial Monitor.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js Arduino IDE Serial Monitor

Wrapping Up

None of the elements of the project described in this tutorial are new. The ESP32 Camera Web Server and OpenCV have each been described extensively and in detail in the literature.

The novelty here is the combination of these two technologies via OpenCV.js. The ESP32 camera, with its small size, Wi-Fi connectivity, and low cost, promises to be an interesting new front-end image-capture device for OpenCV web server applications.

Learn more about the ESP32-CAM

We hope you liked this project. Learn more about the ESP32-CAM with our tutorials:

About Andrew R. Sass

This project/tutorial was developed by Andrew R. Sass. We’ve edited the tutorial to match our tutorials’ style. Apart from some CSS, the code is the original provided by Andrew.

Author background: Andrew (“DOC”) R. Sass holds a BSEE (MIT) and an MSEE and PhD in EE (Purdue). He is a retired research engineer (integrated circuit components), a second-career retired teacher (AP Physics, Physics, Robotics), and has been a mentor of a local FIRST robotics team.




25 thoughts on “ESP32-CAM Web Server with OpenCV.js: Color Detection and Tracking”

  1. Just a suggestion to change in the code at the beginning of the loop() function:
    From: “ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;”
    To: “ReceiveState = 0; cmdState = 1; strState = 1; questionstate = 0; equalstate = 0; semicolonstate = 0;” (Changing the commas for semicolons).

    By the way, great post. I’m still reading and working through the code, but it has been a great time. Thanks!

  2. Hi Sara, great project, thanks! As far as I know, there is also the TensorFlow framework on top of OpenCV for Computer Vision (CV). Am I wrong?
    Thanks!

  3. Hi Sara,
    to speed up the tracking I changed the timeout in line 710 to 50 ms:
    setTimeout(function(){colorDetect.click();},50);
    The tracking is much closer to real time then.
    Additionally, the values in the Serial Monitor update much faster when I comment out line 265 in the main program ( // delay(1); ). This works well after a fresh start.
    All together this is a very nice project, I am impressed. Thank you for this work.

  4. Great tutorial Sara!

    Is it possible to run this code without the web server and only get the X-axis and Y-axis values? Because I want to try to make something similar to PixyCam object tracking. Thank you.

  5. Great tutorial, and it works very well, although the camera is very slow to detect every movement. Also a quick question: how can I send the color data from the color detection?

  6. Thank you for a great article that is working well on my ESP32-CAM.

    Please can someone tell me how I can modify this line:

    https://docs.opencv.org/master/opencv.js

    So that I can read this js file from my SD card ? I want to run this offline and not have to download a huge file from the net every time.

    I need to ask because this software uses the WiFi server and not a webserver, which has methods to do this.

    Any help sincerely appreciated.

    Jim.

  7. Hi Sara
    I have read your project and it’s amazing. I am wondering if you can help me with the fruit-harvesting robot arm project. Please let me know.

  8. Hi Sara, I’m running your project using VS Code and PlatformIO. I can build and upload the project to the ESP32-CAM, but once I get and use the IP address it doesn’t show any video, only the webpage with all the settings, and it shows that OpenCV is ready. I don’t seem to find the issue; may I please get advice on where to look or possible solutions?

  9. Hi Sara,
    I tried this project and the only picture I got on the local webpage is a still image at the top left instead of a video stream, and the still image gets updated only when I click on Color Detection. Nothing displays in IMAGE MASK and IMAGE CANVAS. Please can you give me a hint on how to resolve this?

    • I have resolved the problem. I had a bad internet connection. When the internet was available and the READY sign displayed, then everything worked fine. Thanks… good project.

  10. Hi, thank you very much!
    I would like, when I detect some color, to output a voltage from one of the pins of the camera. Can I?

  11. Hi Sara
    Thank you for the great project!
    How could I fix the IP address? The third number changes after disconnecting the IP camera.
    Cheers
    Saman

  12. E (487) camera: Camera probe failed with error 0x105(ESP_ERR_NOT_FOUND)
    Camera init failed with error 0x105ets Jun 8 2016 00:22:57

    rst:0xc (SW_CPU_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
    configsip: 0, SPIWP:0xee
    clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
    mode:DIO, clock div:1
    load:0x3fff0030,len:1344
    load:0x40078000,len:13964
    load:0x40080400,len:3600
    entry 0x400805f0

