ESP32-CAM Web Server with OpenCV.js: Color Detection and Tracking

This guide introduces OpenCV.js and OpenCV tools for the ESP32 Camera Web Server environment. As an example, we’ll build a simple ESP32 Camera Web Server that includes color detection and tracking of a moving object.


This tutorial is by no means an exhaustive treatment of all that OpenCV can offer to ESP32 camera web servers. It is expected that this introduction will inspire additional OpenCV work with the ESP32 cameras.

This project/tutorial was created based on a project by Andrew R. Sass and edited by Sara Santos.

Introduction

The ESP32 can act as a server for a browser client, and some boards include a camera (for example, the ESP32-CAM), which allows the client to view still images or video in the browser. HTML, JavaScript, and other browser languages can take advantage of the extensive capabilities of the ESP32 and its camera.

Those who have little or no experience with ESP32 camera development boards can start with the following tutorial.

OpenCV.js

As described at docs.opencv.org, “OpenCV.js is a JavaScript binding for a selected subset of OpenCV functions for the web platform”. OpenCV.js uses Emscripten, an LLVM-to-JavaScript compiler, to compile OpenCV functions into an API library that continues to grow.


OpenCV.js runs in the browser, which allows rapid experimentation with OpenCV functions by anyone with a modest background in HTML and JavaScript. Those with experience in ESP32 Camera applications already have that background.

Project Overview

The project we’ll build throughout this tutorial creates a web server that allows color tracking of a moving object. On the web server interface, you can adjust several settings to select the color you want to track. The browser then sends the real-time x and y coordinates of the center of mass of the moving object to the ESP32 board.

ESP32-CAM Color Tracking OpenCVJS Project Overview

Here’s a preview of the web server.

ESP32-CAM Color Tracking Web Server Preview

Prerequisites

Before proceeding with this project, make sure you check the following prerequisites.

Arduino IDE

We’ll program the ESP32 board using Arduino IDE. So, you need the Arduino IDE installed as well as the ESP32 add-on:

VS Code (optional)

If you prefer to use VS Code + PlatformIO to program your board, you can follow the next tutorial to learn how to set up VS Code to work with the ESP32 boards.

Getting an ESP32 Camera

This project is compatible with any ESP32 camera board that features an OV2640 camera. There are several ESP32 camera models out there. For a comparison of the most popular cameras, you can refer to the next article:

Make sure you know the pin assignment for the camera board you’re using. For the pin assignment of the most popular boards, check this article:

Code – ESP32-CAM with OpenCV.js

The program consists of two parts:

  • the server program which runs on the ESP32 Camera
  • the client program which runs on the Chrome browser

The program is split into two files: the OCV_ColorTrack_P.ino file containing the server program and the index_OCV_ColorTrack.h header file containing the client program (HTML, CSS and JavaScript with OpenCV.js).

Create a new Arduino sketch called OCV_ColorTrack_P and copy the following code.

/*********
  The include file, index_OCV_ColorTrack.h, the Client, is an introduction of OpenCV.js to the ESP32 Camera environment. The Client was
  developed and written by Andrew R. Sass. Permission to reproduce the index_OCV_ColorTrack.h file is granted free of charge if this
  entire copyright notice is included in all copies of the index_OCV_ColorTrack.h file.
  
  Complete instructions at https://RandomNerdTutorials.com/esp32-cam-opencv-js-color-detection-tracking/
  
  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files.
  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
*********/

#include <WiFi.h>
#include <WiFiClientSecure.h>
#include "esp_camera.h"
#include "soc/soc.h"
#include "soc/rtc_cntl_reg.h"
#include "index_OCV_ColorTrack.h"

// Replace with your network credentials
const char* ssid = "REPLACE_WITH_YOUR_SSID";
const char* password = "REPLACE_WITH_YOUR_PASSWORD";
 
String Feedback="";
String Command="",cmd="",P1="",P2="",P3="",P4="",P5="",P6="",P7="",P8="",P9="";
byte ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;
//ANN:0
//       AI-Thinker                    
#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

WiFiServer server(80);
//ANN:2
void ExecuteCommand() {
  if (cmd!="colorDetect") {  //Omit printout
    //Serial.println("cmd= "+cmd+" ,P1= "+P1+" ,P2= "+P2+" ,P3= "+P3+" ,P4= "+P4+" ,P5= "+P5+" ,P6= "+P6+" ,P7= "+P7+" ,P8= "+P8+" ,P9= "+P9);
    //Serial.println("");
  }
  
  if (cmd=="resetwifi") {
    WiFi.begin(P1.c_str(), P2.c_str());
    Serial.print("Connecting to ");
    Serial.println(P1);
    long int StartTime=millis();
    while (WiFi.status() != WL_CONNECTED) 
    {
        delay(500);
        if ((StartTime+5000) < millis()) break;
    } 
    Serial.println("");
    Serial.println("STAIP: "+WiFi.localIP().toString());
    Feedback="STAIP: "+WiFi.localIP().toString();
  }    
  else if (cmd=="restart") {
    ESP.restart();
  }
  else if (cmd=="cm"){
    int XcmVal = P1.toInt();
    int YcmVal = P2.toInt();
    Serial.println("cmd= "+cmd+" ,VALXCM= "+XcmVal);
    Serial.println("cmd= "+cmd+" ,VALYCM= "+YcmVal);   
  }
  else if (cmd=="quality") { 
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_quality(s, val);
  }
  else if (cmd=="contrast") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_contrast(s, val);
  }
  else if (cmd=="brightness") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt();  
    s->set_brightness(s, val);  
  }   
  else {
    Feedback="Command is not defined.";
  }
  if (Feedback=="") {
    Feedback=Command;
  }
}

void setup() {
  WRITE_PERI_REG(RTC_CNTL_BROWN_OUT_REG, 0);
  
  Serial.begin(115200);
  Serial.setDebugOutput(true);
  Serial.println();

  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sscb_sda = SIOD_GPIO_NUM;
  config.pin_sscb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_JPEG;
  //init with high specs to pre-allocate larger buffers
  if(psramFound()){
    config.frame_size = FRAMESIZE_UXGA;
    config.jpeg_quality = 10;  //0-63 lower number means higher quality
    config.fb_count = 2;
  } else {
    config.frame_size = FRAMESIZE_SVGA;
    config.jpeg_quality = 12;  //0-63 lower number means higher quality
    config.fb_count = 1;
  }
  
  // camera init
  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init failed with error 0x%x", err);
    delay(1000);
    ESP.restart();
  }

  //drop down frame size for higher initial frame rate
  sensor_t * s = esp_camera_sensor_get();
  s->set_framesize(s, FRAMESIZE_CIF);  //UXGA|SXGA|XGA|SVGA|VGA|CIF|QVGA|HQVGA|QQVGA - set the initial frame size (resolution)
     
  WiFi.mode(WIFI_AP_STA);
  WiFi.begin(ssid, password);   

  delay(1000);

  long int StartTime=millis();
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    if ((StartTime+10000) < millis()) 
      break;   
  } 

  if (WiFi.status() == WL_CONNECTED) {   
    Serial.print("ESP IP Address: http://");
    Serial.println(WiFi.localIP());  
  }
  server.begin();          
}



void loop() {
  Feedback="";Command="";cmd="";P1="";P2="";P3="";P4="";P5="";P6="";P7="";P8="";P9="";
  ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;
  
  WiFiClient client = server.available();

  if (client) { 
    String currentLine = "";

    while (client.connected()) {
      if (client.available()) {
        char c = client.read();             
        
        getCommand(c);
                
        if (c == '\n') {
          if (currentLine.length() == 0) {    
            
            if (cmd=="colorDetect") {
              camera_fb_t * fb = NULL;
              fb = esp_camera_fb_get();  
              if(!fb) {
                Serial.println("Camera capture failed");
                delay(1000);
                ESP.restart();
              }
              //ANN:1
              client.println("HTTP/1.1 200 OK");
              client.println("Access-Control-Allow-Origin: *");              
              client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
              client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
              client.println("Content-Type: image/jpeg");
              client.println("Content-Disposition: form-data; name=\"imageFile\"; filename=\"picture.jpg\""); 
              client.println("Content-Length: " + String(fb->len));             
              client.println("Connection: close");
              client.println();
              
              uint8_t *fbBuf = fb->buf;
              size_t fbLen = fb->len;
              for (size_t n=0;n<fbLen;n=n+1024) {
                if (n+1024<fbLen) {
                  client.write(fbBuf, 1024);
                  fbBuf += 1024;
                }
                else if (fbLen%1024>0) {
                  size_t remainder = fbLen%1024;
                  client.write(fbBuf, remainder);
                }
              }    
              esp_camera_fb_return(fb);                        
            }
            else {
              //ANN:1
              client.println("HTTP/1.1 200 OK");
              client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
              client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
              client.println("Content-Type: text/html; charset=utf-8");
              client.println("Access-Control-Allow-Origin: *");
              client.println("Connection: close");
              client.println();           
              String Data="";
              if (cmd!="")
                Data = Feedback;
              else {
                Data = String((const char *)INDEX_HTML);
              }
              int Index;
              for (Index = 0; Index < Data.length(); Index = Index+1000) {
                client.print(Data.substring(Index, Index+1000));
              }        
              client.println();
            }
                        
            Feedback="";
            break;
          } else {
            currentLine = "";
          }
        } 
        else if (c != '\r') {
          currentLine += c;
        }
        if ((currentLine.indexOf("/?")!=-1)&&(currentLine.indexOf(" HTTP")!=-1)) {
          if (Command.indexOf("stop")!=-1) {  
            client.println();
            client.println();
            client.stop();
          }
          currentLine="";
          Feedback="";
          ExecuteCommand();
        }
      }
    }
    delay(1);
    client.stop();
  }
}

void getCommand(char c){
  if (c=='?') ReceiveState=1;
  if ((c==' ')||(c=='\r')||(c=='\n')) ReceiveState=0;
  
  if (ReceiveState==1) {
    Command=Command+String(c);    
    if (c=='=') cmdState=0;
    if (c==';') strState++;
    if ((cmdState==1)&&((c!='?')||(questionstate==1))) cmd=cmd+String(c);
    if ((cmdState==0)&&(strState==1)&&((c!='=')||(equalstate==1))) P1=P1+String(c);
    if ((cmdState==0)&&(strState==2)&&(c!=';')) P2=P2+String(c);
    if ((cmdState==0)&&(strState==3)&&(c!=';')) P3=P3+String(c);
    if ((cmdState==0)&&(strState==4)&&(c!=';')) P4=P4+String(c);
    if ((cmdState==0)&&(strState==5)&&(c!=';')) P5=P5+String(c);
    if ((cmdState==0)&&(strState==6)&&(c!=';')) P6=P6+String(c);
    if ((cmdState==0)&&(strState==7)&&(c!=';')) P7=P7+String(c);
    if ((cmdState==0)&&(strState==8)&&(c!=';')) P8=P8+String(c);
    if ((cmdState==0)&&(strState>=9)&&((c!=';')||(semicolonstate==1))) P9=P9+String(c);   
    if (c=='?') questionstate=1;
    if (c=='=') equalstate=1;
    if ((strState>=9)&&(c==';')) semicolonstate=1;
  }
}


Save that file.

index_OCV_ColorTrack.h

Then, open a new tab in the Arduino IDE as shown in the following image.

Arduino IDE Create a New Tab

Name it index_OCV_ColorTrack.h.

Arduino IDE Name a New Tab

Copy the following into that file.

/****************************
  This include file, index_OCV_ColorTrack.h, the Client, is an introduction of OpenCV.js to the ESP32 Camera environment. The Client was
  developed and written by Andrew R. Sass. Permission to reproduce the index_OCV_ColorTrack.h file is granted free of charge if this
  entire copyright notice is included in all copies of the index_OCV_ColorTrack.h file. 
  
  Complete instructions at https://RandomNerdTutorials.com/esp32-cam-opencv-js-color-detection-tracking/
*******************************/
static const char PROGMEM INDEX_HTML[] = R"rawliteral(
<!DOCTYPE html>
<html>
<head>
   <title>ESP32-CAMERA COLOR DETECTION</title>
   <meta charset="utf-8">
   <meta name="viewport" content="width=device-width,initial-scale=1">
   <!----ANN:3--->
   <script async src=" https://docs.opencv.org/master/opencv.js" type="text/javascript"></script>
</head>
<style>
html {
    font-family: Arial, Helvetica, sans-serif;
    }
body { 
    background-color: #F7F7F2;
    margin: 0px;
}
h1 {
    font-size: 1.6rem;
    color:white;
    text-align: center;
}
.topnav {
    overflow: hidden;
    background-color: #0A1128;
}
.main-controls{
  padding-top: 5px;
}
h2 {
    color: #0A1128;
    font-size: 1rem;
}    
.section {
    margin: 2px;
    padding: 10px;
}
.column{
    float: left;
    width: 50%
}
table {
    margin: 0;
    width: 90%;
    border-collapse: collapse;
}
th{
    text-align: center;
}
.row{
    margin-right:50px;
    margin-left:50px;
}

#colorDetect{ 
    border: none;
    color: #FEFCFB;
    background-color: #0A1128;
    padding: 15px;
    text-align: center;
    display: inline-block;
    font-size: 16px;
    border-radius: 4px;
}
#restart{
    border: none;
    color: #FEFCFB;
    background-color: #7B0828;
    padding: 15px;
    text-align: center;
    display: inline-block;
    font-size: 16px;
    border-radius: 4px;  
}
button{
    border: none;
    color: #FEFCFB;
    background-color: #0A1128;
    padding: 10px;
    text-align: center;
    display: inline-block;
    border-radius: 4px;    
}

</style>
<body>
    <div class="topnav">
        <h1>ESP32-CAM Color Detection and Tracking</h1>
    </div>
    <div class="main-controls">
        <table>
            <tr>
                <td><center><input type="button" id="colorDetect" value="COLOR DETECTION"></center></td> 
                <td><center><input type="button" id="restart" value="RESET BOARD"></center></td> 
            </tr>      
        </table>
    </div>
<div class="container">
  <div class = "row"> 
    <div class = "column"> 
        <div class="section">
            <div class ="video-container">
                <h2>Video Streaming</h2>   
                <center><img id="ShowImage" src="" style="display:none"></center>
                <center><canvas id="canvas" style="display:none"></canvas></center>
            </div>
        </div>
        <div class="section">
            <table>
              <tr>
                  <td>Quality</td>
                  <td><input type="range" id="quality" min="10" max="63" value="10"></td>
              </tr>
              <tr>
                  <td>Brightness</td>
                  <td><input type="range" id="brightness" min="-2" max="2" value="0"></td>
              </tr>
              <tr>
                  <td>Contrast</td>
                  <td><input type="range" id="contrast" min="-2" max="2" value="0"></td>
              </tr>
            </table>
        </div>
 
      <!-----ANN:5---->
      <div class="section">
        <h2>RGB Color Trackbars</h2>
        <table>
            <tr>
                <td>R min:&#160;&#160;&#160;<span id="RMINdemo"></span></td>
                <td><input type="range" id="rmin" min="0" max="255" value="0" class = "slider"></td>
                <td>R max:&#160;&#160;&#160;<span id="RMAXdemo"></span></td>
                <td><input type="range" id="rmax" min="0" max="255" value="50" class = "slider"></td>
            </tr>
            <tr>
                <td>G min:&#160;&#160;&#160;<span id="GMINdemo"></span></td>
                <td><input type="range" id="gmin" min="0" max="255" value="0" class = "slider"></td>
                <td>G max:&#160;&#160;&#160;<span id="GMAXdemo"></span></td>
                <td><input type="range" id="gmax" min="0" max="255" value="50" class = "slider"></td>
            </tr>
            <tr>
                <td>B min:&#160;&#160;&#160;<span id ="BMINdemo"></span></td>
                <td><input type="range" id="bmin" min="0" max="255" value="0" class = "slider">  </td>
                <td>B max:&#160;&#160;&#160;<span id="BMAXdemo"></span></td>
                <td> <input type="range" id="bmax" min="0" max="255" value="50" class = "slider">   </td>
            </tr>
        </table>
      </div>

      <div class="section">
        <h2>Threshold Minimum-Binary Image</h2>
        <table>
            <tr>
                <td>Minimum Threshold:&#160;&#160;&#160;<span id="THRESH_MINdemo"></span></td>
                <td><input type="range" id="thresh_min" min="0" max="255" value="120" class = "slider">  </td>
            </tr>
        </table>
    </div>
     <!----ANN:9---> 
     <div class="section">
        <h2>Color Probe</h2>
        <table>
            <tr>
                <td>X probe:&#160;&#160;&#160;<span id="X_PROBEdemo"></span></td>
                <td><input type="range" id="x_probe" min="0" max="400" value="200" class = "slider"></td>
                <td>Y probe:&#160;&#160;&#160;<span id="Y_PROBEdemo"></span></td>
                <td> <input type="range" id="y_probe" min="0" max="296" value="148" class = "slider"></td>
            </tr>
        </table>
      </div>
            
    </div>   <!------endfirstcolumn---------------->   
    
    <div class = "column">      
        <div class="section">
            <h2>Image Mask</h2>
            <canvas id="imageMask"></canvas>
        </div>
        <div class="section">
            <h2>Image Canvas</h2>
            <canvas id="imageCanvas"></canvas>
        </div>
        <div class="section">
            <table>
                <tr>
                    <td><button type="button" id="invertButton" class="btn btn-primary">INVERT</button></td>
                    <td><button type="button" id="contourButton" class="btn btn-primary">SHOW CONTOUR</button></td>
                    <td><button type="button" id="trackButton" class="btn btn-primary">TRACKING</button></td>
                </tr>
                <tr>
                    <td>Invert: <span id="INVERTdemo"></span></td>
                    <td>Contour: <span id="CONTOURdemo"></span></td>
                    <td>Track: <span id="TRACKdemo"></span>
                    </td>
                </tr>
            </table>
        </div>
        <div class="section">
            <table>
                <tr>
                    <td><strong>XCM:</strong> <span id="XCMdemo"></span></td>
                    <td><strong>YCM:</strong> <span id="YCMdemo"></span></td>
                </tr>
            </table>
        </div>
        
        <div class="section">
            <canvas id="textCanvas" width="480" height="180" style= "border: 1px solid black;"></canvas>
            <iframe id="ifr" style="display:none"></iframe>
            <div id="message"></div>  
        </div>             
        </div>  <!------end2ndcolumn------------------------>
  </div>   <!-----endrow---------------------->   
</div>   <!------endcontainer-------------->
 <!--------------- </body>----------------->
 <!----------------</html>----------------->
<div class="modal"></div>
<script>
var colorDetect = document.getElementById('colorDetect');
var ShowImage = document.getElementById('ShowImage');
var canvas = document.getElementById("canvas");
var context = canvas.getContext("2d");
var imageMask = document.getElementById("imageMask");
var imageMaskContext = imageMask.getContext("2d"); 
var imageCanvas = document.getElementById("imageCanvas");
var imageContext = imageCanvas.getContext("2d"); 
var txtcanvas = document.getElementById("textCanvas");
var ctx = txtcanvas.getContext("2d");  
var message = document.getElementById('message');
var ifr = document.getElementById('ifr');
var myTimer;
var restartCount=0;
const modelPath = 'https://ruisantosdotme.github.io/face-api.js/weights/';
let currentStream;
let displaySize = { width:400, height: 296 }
let faceDetection;

let b_tracker = false;
let x_cm = 0;
let y_cm = 0;

let b_invert = false;

let b_contour = false;

var RMAX=50;
var RMIN=0;
var GMAX=50;
var GMIN=0;
var BMAX=50;
var BMIN=0;
var THRESH_MIN=120;
var X_PROBE=200;
var Y_PROBE=196;
var R=0;
var G=0;
var B=0;
var A=0;


colorDetect.onclick = function (event) {
  clearInterval(myTimer);  
  myTimer = setInterval(function(){error_handle();},5000);
  ShowImage.src=location.origin+'/?colorDetect='+Math.random();
}

//ANN:READY
var Module = {
  onRuntimeInitialized(){onOpenCvReady();}
}

function onOpenCvReady(){
  //alert("onOpenCvReady");
  console.log("OpenCV IS READY!!!");
  drawReadyText();  
  document.body.classList.remove("loading");
}

    
function error_handle() {
  restartCount++;
  clearInterval(myTimer);
  if (restartCount<=2) {
    message.innerHTML = "Get still error. <br>Restart ESP32-CAM "+restartCount+" times.";
    myTimer = setInterval(function(){colorDetect.click();},10000);
    ifr.src = document.location.origin+'?restart';
  }
  else
    message.innerHTML = "Get still error. <br>Please close the page and check ESP32-CAM.";
}    
colorDetect.style.display = "block";
ShowImage.onload = function (event) {
  //alert("SHOW IMAGE");
  console.log("SHOW iMAGE");
  clearInterval(myTimer);
  restartCount=0;      
  canvas.setAttribute("width", ShowImage.width);
  canvas.setAttribute("height", ShowImage.height);
  canvas.style.display = "block";
  imageCanvas.setAttribute("width", ShowImage.width);
  imageCanvas.setAttribute("height", ShowImage.height);
  imageCanvas.style.display = "block";

  imageMask.setAttribute("width", ShowImage.width);
  imageMask.setAttribute("height", ShowImage.height);
  imageMask.style.display = "block";      
      
  context.drawImage(ShowImage,0,0,ShowImage.width,ShowImage.height);
  
  DetectImage();        
}
restart.onclick = function (event) {
  fetch(location.origin+'/?restart=stop');
}
quality.onclick = function (event) {
  fetch(document.location.origin+'/?quality='+this.value+';stop');
} 
brightness.onclick = function (event) {
  fetch(document.location.origin+'/?brightness='+this.value+';stop');
} 
contrast.onclick = function (event) {
  fetch(document.location.origin+'/?contrast='+this.value+';stop');
}                             
async function DetectImage() {
  //alert("DETECT IMAGE");
  console.log("DETECT IMAGE");

  /***************opencv********************************/
  //ANN:4
  let src = cv.imread(ShowImage);
  arows = src.rows;
  acols = src.cols;
  aarea = arows*acols;
  adepth = src.depth();
  atype = src.type();
  achannels = src.channels();
  console.log("rows = " + arows);
  console.log("cols = " + acols);
  console.log("pic area = " + aarea);
  console.log("depth = " + adepth); 
  console.log("type = " + atype); 
  console.log("channels = " + achannels);
  
  /******************COLOR DETECT******************************/

  //ANN:6
  var RMAXslider = document.getElementById("rmax");
  var RMAXoutput = document.getElementById("RMAXdemo");
  RMAXoutput.innerHTML = RMAXslider.value;
  RMAXslider.oninput = function(){
  RMAXoutput.innerHTML = this.value;
  RMAX = parseInt(RMAXoutput.innerHTML,10);
  console.log("RMAX=" + RMAX);
  }

  console.log("RMAX=" + RMAX);

  var RMINslider = document.getElementById("rmin");
  var RMINoutput = document.getElementById("RMINdemo");
  RMINoutput.innerHTML = RMINslider.value;
  RMINslider.oninput = function(){
    RMINoutput.innerHTML = this.value;
    RMIN = parseInt(RMINoutput.innerHTML,10);
    console.log("RMIN=" + RMIN);
  }
  console.log("RMIN=" + RMIN);

  var GMAXslider = document.getElementById("gmax");
  var GMAXoutput = document.getElementById("GMAXdemo");
  GMAXoutput.innerHTML = GMAXslider.value;
  GMAXslider.oninput = function(){
    GMAXoutput.innerHTML = this.value;
    GMAX = parseInt(GMAXoutput.innerHTML,10);
  }
  console.log("GMAX=" + GMAX);

  var GMINslider = document.getElementById("gmin");
  var GMINoutput = document.getElementById("GMINdemo");
  GMINoutput.innerHTML = GMINslider.value;
  GMINslider.oninput = function(){
    GMINoutput.innerHTML = this.value;
    GMIN = parseInt(GMINoutput.innerHTML,10);
  }
  console.log("GMIN=" + GMIN);

  var BMAXslider = document.getElementById("bmax");
  var BMAXoutput = document.getElementById("BMAXdemo");
  BMAXoutput.innerHTML = BMAXslider.value;
  BMAXslider.oninput = function(){
    BMAXoutput.innerHTML = this.value;
    BMAX = parseInt(BMAXoutput.innerHTML,10);
  }
  console.log("BMAX=" + BMAX);

  var BMINslider = document.getElementById("bmin");
  var BMINoutput = document.getElementById("BMINdemo");
  BMINoutput.innerHTML = BMINslider.value;
  BMINslider.oninput = function(){
  BMINoutput.innerHTML = this.value;
  BMIN = parseInt(BMINoutput.innerHTML,10);
  }
  console.log("BMIN=" + BMIN);



  var THRESH_MINslider = document.getElementById("thresh_min");
  var THRESH_MINoutput = document.getElementById("THRESH_MINdemo");
  THRESH_MINoutput.innerHTML = THRESH_MINslider.value;
  THRESH_MINslider.oninput = function(){
  THRESH_MINoutput.innerHTML = this.value;
  THRESH_MIN = parseInt(THRESH_MINoutput.innerHTML,10);
  }
  console.log("THRESHOLD MIN=" + THRESH_MIN);

  //ANN:9A
  var X_PROBEslider = document.getElementById("x_probe");
  var X_PROBEoutput = document.getElementById("X_PROBEdemo");
  X_PROBEoutput.innerHTML = X_PROBEslider.value;
  X_PROBEslider.oninput = function(){
  X_PROBEoutput.innerHTML = this.value;
  X_PROBE = parseInt(X_PROBEoutput.innerHTML,10);
  }
  console.log("X_PROBE=" + X_PROBE); 

  var Y_PROBEslider = document.getElementById("y_probe");
  var Y_PROBEoutput = document.getElementById("Y_PROBEdemo");
  Y_PROBEoutput.innerHTML = Y_PROBEslider.value;
  Y_PROBEslider.oninput = function(){
  Y_PROBEoutput.innerHTML = this.value;
  Y_PROBE = parseInt(Y_PROBEoutput.innerHTML,10);
  }
  console.log("Y_PROBE=" + Y_PROBE); 


  document.getElementById('trackButton').onclick = function(){
    b_tracker = (true && !b_tracker)  
    console.log("TRACKER = " + b_tracker );
    var TRACKoutput = document.getElementById("TRACKdemo");
    TRACKoutput.innerHTML = b_tracker;
    //var XCMoutput = document.getElementById("XCMdemo");
    //XCMoutput.innerHTML = x_cm;
 
  }  

  document.getElementById('invertButton').onclick = function(){
    b_invert = (true && !b_invert)  
    console.log("TRACKER = " + b_invert );
    var INVERToutput = document.getElementById("INVERTdemo");
    INVERToutput.innerHTML = b_invert;
  }  
/**/
  document.getElementById('contourButton').onclick = function(){
    b_contour = (true && !b_contour)  
    console.log("TRACKER = " + b_contour );
    var CONTOURoutput = document.getElementById("CONTOURdemo");
    CONTOURoutput.innerHTML = b_contour;
  } 
/**/ 

  let tracker = 0;
  
  var TRACKoutput = document.getElementById("TRACKdemo");
  TRACKoutput.innerHTML = b_tracker;
  var XCMoutput = document.getElementById("XCMdemo");
  var YCMoutput = document.getElementById("YCMdemo");

  XCMoutput.innerHTML = 0;
  YCMoutput.innerHTML = 0; 

  var INVERToutput = document.getElementById("INVERTdemo");
  INVERToutput.innerHTML = b_invert;  

  var CONTOURoutput = document.getElementById("CONTOURdemo");
  CONTOURoutput.innerHTML = b_contour;   

  //ANN:8
  let M00Array = [0,];
  let orig = new cv.Mat();
  let mask = new cv.Mat();
  let mask1 = new cv.Mat();
  let mask2 = new cv.Mat();
  let contours = new cv.MatVector();
  let hierarchy = new cv.Mat();
  let rgbaPlanes = new cv.MatVector();
    
  let color = new cv.Scalar(0,0,0);

  clear_canvas();


    
  orig = cv.imread(ShowImage);
  cv.split(orig,rgbaPlanes);  //SPLIT
  let BP = rgbaPlanes.get(2);  // SELECTED COLOR PLANE
  let GP = rgbaPlanes.get(1);
  let RP = rgbaPlanes.get(0);
  cv.merge(rgbaPlanes,orig);
   
    
              //   BLK    BLU   GRN   RED
  let row = Y_PROBE //180//275 //225 //150 //130    
  let col = X_PROBE //100//10 //100 //200 //300
  drawColRowText(acols,arows);


  console.log("ISCONTINUOUS = " + orig.isContinuous());

  //ANN:9C
  R = src.data[row * src.cols * src.channels() + col * src.channels()];
  G = src.data[row * src.cols * src.channels() + col * src.channels() + 1];
  B = src.data[row * src.cols * src.channels() + col * src.channels() + 2];
  A = src.data[row * src.cols * src.channels() + col * src.channels() + 3];
  console.log("RDATA = " + R);
  console.log("GDATA = " + G);
  console.log("BDATA = " + B);
  console.log("ADATA = " + A);

  drawRGB_PROBE_Text();
  
   
    
  //ANN:9b
  //*************draw probe point*********************
  let point4 = new cv.Point(col,row);
  cv.circle(src,point4,5,[255,255,255,255],2,cv.LINE_AA,0);
  //***********end draw probe point*********************

  //ANN:7
  let high = new cv.Mat(src.rows,src.cols,src.type(),[RMAX,GMAX,BMAX,255]);
  let low = new cv.Mat(src.rows,src.cols,src.type(),[RMIN,GMIN,BMIN,0]);

  cv.inRange(src,low,high,mask1);
  //inRange(source image, lower limit, higher limit, destination image)
    
  cv.threshold(mask1,mask,THRESH_MIN,255,cv.THRESH_BINARY);
  //threshold(source image,destination image,threshold,255,threshold method);

  //ANN:9
  if(b_invert==true){
     cv.bitwise_not(mask,mask2);
  }
/********************start contours******************************************/
  //ANN:10
  if(b_tracker == true){
  try{
   if(b_invert==false){
    //ANN:11   
    cv.findContours(mask,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
    //findContours(source image, array of contours found, hierarchy of contours
        // if contours are inside other contours, method of contour data retrieval,
        //algorithm method)
   }
   else{
    cv.findContours(mask2,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
   }
    console.log("CONTOUR_SIZE = " + contours.size());

    //draw contours
    if(b_contour==true){
     for(let i = 0; i < contours.size(); i++){
        cv.drawContours(src,contours,i,[0,0,0,255],2,cv.LINE_8,hierarchy,100)
     }
    }

    //ANN:12
    let cnt;
    let Moments;
    let M00;
    let M10;
    //let x_cm;
    //let y_cm;
    
    //ANN:13
    for(let k = 0; k < contours.size(); k++){
        cnt = contours.get(k); 
        Moments = cv.moments(cnt,false);
        M00Array[k] = Moments.m00;
       // cnt.delete();
    }

    //ANN13A
    let max_area_arg = MaxAreaArg(M00Array);
    console.log("MAXAREAARG = "+max_area_arg);

    //let TestArray = [0,0,0,15,4,15,2];
    //let TestArray0 = [];
    //let max_test_area_arg = MaxAreaArg(TestArray0);
    //console.log("MAXTESTAREAARG = "+max_test_area_arg);



    let ArgMaxArea = MaxAreaArg(M00Array);
    if(ArgMaxArea >= 0){
    cnt = contours.get(MaxAreaArg(M00Array));  //use the contour with biggest MOO
    //cnt = contours.get(54);
    Moments = cv.moments(cnt,false);
    M00 = Moments.m00;
    M10 = Moments.m10;
    M01 = Moments.m01;
    x_cm = M10/M00;    // 75 for circle_9.jpg
    y_cm = M01/M00;    // 41 for circle_9.jpg

    XCMoutput.innerHTML = Math.round(x_cm);
    YCMoutput.innerHTML = Math.round(y_cm);

    console.log("M00 = "+M00);  
    console.log("XCM = "+Math.round(x_cm));
    console.log("YCM = "+Math.round(y_cm)); 

    //fetch(document.location.origin+'/?xcm='+Math.round(x_cm)+';stop');
    fetch(document.location.origin+'/?cm='+Math.round(x_cm)+';'+Math.round(y_cm)+';stop');

    console.log("M00ARRAY = " + M00Array);

    //ANN:14   
    
    //**************min area bounding rect********************
    //let rotatedRect=cv.minAreaRect(cnt);
    //let vertices = cv.RotatedRect.points(rotatedRect);

    //for(let j=0;j<4;j++){
    //    cv.line(src,vertices[j],
    //        vertices[(j+1)%4],[0,0,255,255],2,cv.LINE_AA,0);
    //}
    //***************end min area bounding rect*************************************


    //***************bounding rect***************************
    let rect = cv.boundingRect(cnt);
    let point1 = new cv.Point(rect.x,rect.y);
    let point2 = new cv.Point(rect.x+rect.width,rect.y+rect.height);

    cv.rectangle(src,point1,point2,[0,0,255,255],2,cv.LINE_AA,0);
    //*************end bounding rect***************************


    //*************draw center point*********************
    let point3 = new cv.Point(x_cm,y_cm);
    cv.circle(src,point3,2,[0,0,255,255],2,cv.LINE_AA,0);
    //***********end draw center point*********************

    }//end if(ArgMaxArea >= 0)
    else{
      if(ArgMaxArea==-1){ 
        console.log("ZERO ARRAY LENGTH");
      }
      else{              //ArgMaxArea=-2
        console.log("DUPLICATE MAX ARRAY-ELEMENT");
      }
    }




    cnt.delete();
/******************end contours  note cnt line one up*******************************************/
   drawXCM_YCM_Text();

  }//end try
  catch{
    console.log("ERROR TRACKER NO CONTOUR");
    clear_canvas();
    drawErrorTracking_Text();
  }
    
  }//end b_tracking if statement
  else{
      XCMoutput.innerHTML = 0;
      YCMoutput.innerHTML = 0;
  }    

  if(b_invert==false){
     cv.imshow('imageMask', mask);
  }
  else{
     cv.imshow('imageMask', mask2);
  }
  //cv.imshow('imageMask', R);
  cv.imshow('imageCanvas', src);

  //ANN:8A
  src.delete();
  high.delete();
  low.delete();
  orig.delete();
  mask1.delete();
  mask2.delete();
  mask.delete();
  contours.delete();
  hierarchy.delete();
  //cnt.delete();
  RP.delete();
    
  


 /********************END COLOR DETECT****************************/
  
/***************end opencv******************************/
      

 setTimeout(function(){colorDetect.click();},500);
  
}//end detectimage 


function MaxAreaArg(arr){
    if (arr.length == 0) {
        return -1;
    }

    var max = arr[0];
    var maxIndex = 0;
    var dupIndexCount = 0; //duplicate max elements?

    if(arr[0] >= .90*aarea){
        max = 0;
    }

    for (var i = 1; i < arr.length; i++) {
        if (arr[i] > max && arr[i] < .99*aarea) {
            maxIndex = i;
            max = arr[i];
            dupIndexCount = 0;
        }
        else if(arr[i]==max && arr[i]!=0){
            dupIndexCount++;
        }
    }

    if(dupIndexCount==0){
        return maxIndex;
    }

    else{
        return -2;
    }        
}//end MaxAreaArg    



function clear_canvas(){
    ctx.clearRect(0,0,txtcanvas.width,txtcanvas.height);
    ctx.rect(0,0,txtcanvas.width,txtcanvas.height);
    ctx.fillStyle="red";
    ctx.fill();
}

function drawReadyText(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('OpenCV.JS READY',txtcanvas.width/4,txtcanvas.height/10);
}          

function drawColRowText(x,y){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('ImageCols='+x,0,txtcanvas.height/10);
    ctx.fillText('ImageRows='+y,txtcanvas.width/2,txtcanvas.height/10);
} 

function drawRGB_PROBE_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('Rp='+R,0,2*txtcanvas.height/10);
    ctx.fillText('Gp='+G,txtcanvas.width/4,2*txtcanvas.height/10);
    ctx.fillText('Bp='+B,txtcanvas.width/2,2*txtcanvas.height/10);
    ctx.fillText('Ap='+A,3*txtcanvas.width/4,2*txtcanvas.height/10);
}

function drawXCM_YCM_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('XCM='+Math.round(x_cm),0,3*txtcanvas.height/10); 
    ctx.fillText('YCM='+Math.round(y_cm),txtcanvas.width/4,3*txtcanvas.height/10);    
}

function drawErrorTracking_Text(){
    ctx.fillStyle = 'black';
    ctx.font = '20px serif';
    ctx.fillText('ERROR TRACKING-NO CONTOUR',0,3*txtcanvas.height/10);
}          
         
  </script> 
</body>
</html>  
)rawliteral";


Save the file.

Network Credentials

For the program to work properly, you need to insert your network credentials in the following variables in the OCV_ColorTrack_P.ino file:

const char* ssid = "REPLACE_WITH_YOUR_SSID";
const char* password = "REPLACE_WITH_YOUR_PASSWORD";

Camera Pin Assignment

By default, the code uses the pin assignment for the ESP32-CAM AI-Thinker module.

#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

If you’re using a different camera board, don’t forget to insert the right pin assignment. You can go to the following article to find the pinout for your board:

How the Code Works

Continue reading to learn how the code works, or skip to the next section.

To make the code easier to read and understand, ANNOTATIONS have been added as comments throughout the program.

For example, the camera pin assignment for the ESP32-CAM is listed below the ANN:0 annotation in the .ino file. You can locate ANN:0 (or any other annotation) with the Edit > Find command of the Arduino IDE.

#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27
#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

Server Sketch

The server program, OCV_ColorTrack_P.ino, is taken from ESP32-CAM Projects, Module 5 by Rui Santos and Sara Santos. It has a standard ESP32 Camera setup() that configures the camera and connects to your Wi-Fi network with the credentials you provide.

Annotation 1 (ANN:1)

However, what is not standard in this server program are instructions of vital importance that enable cross-origin access control (CORS). See the code at ANN:1.

//ANN:1
client.println("HTTP/1.1 200 OK");
client.println("Access-Control-Allow-Origin: *");              
client.println("Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept");
client.println("Access-Control-Allow-Methods: GET,POST,PUT,DELETE,OPTIONS");
client.println("Content-Type: image/jpeg");
client.println("Content-Disposition: form-data; name=\"imageFile\"; filename=\"picture.jpg\""); 
client.println("Content-Length: " + String(fb->len));             
client.println("Connection: close");
client.println();

These headers instruct the browser to let the camera image and OpenCV.js, which come from different origins, work together in the page. Without them, the Chrome browser throws cross-origin errors.
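For reference, this is the client-side request that those headers answer. Clicking the COLOR DETECTION button points the hidden ShowImage element at the /?colorDetect endpoint served by the ESP32 (the random query value helps avoid browser caching, and a 5-second timer calls error_handle() if no image arrives):

colorDetect.onclick = function (event) {
  clearInterval(myTimer);
  myTimer = setInterval(function(){error_handle();},5000);
  ShowImage.src=location.origin+'/?colorDetect='+Math.random();
}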

Annotation 2 (ANN:2)

The server loop() monitors client requests, parses them character by character with getCommand(), and acts on them via ExecuteCommand(), found at ANN:2.

//ANN:2
void ExecuteCommand() {
  if (cmd!="colorDetect") {  //Omit printout
    //Serial.println("cmd= "+cmd+" ,P1= "+P1+" ,P2= "+P2+" ,P3= "+P3+" ,P4= "+P4+" ,P5= "+P5+" ,P6= "+P6+" ,P7= "+P7+" ,P8= "+P8+" ,P9= "+P9);
    //Serial.println("");
  }
  
  if (cmd=="resetwifi") {
    WiFi.begin(P1.c_str(), P2.c_str());
    Serial.print("Connecting to ");
    Serial.println(P1);
    long int StartTime=millis();
    while (WiFi.status() != WL_CONNECTED) 
    {
        delay(500);
        if ((StartTime+5000) < millis()) break;
    } 
    Serial.println("");
    Serial.println("STAIP: "+WiFi.localIP().toString());
    Feedback="STAIP: "+WiFi.localIP().toString();
  }    
  else if (cmd=="restart") {
    ESP.restart();
  }
  else if (cmd=="cm"){
    int XcmVal = P1.toInt();
    int YcmVal = P2.toInt();
    Serial.println("cmd= "+cmd+" ,VALXCM= "+XcmVal);
    Serial.println("cmd= "+cmd+" ,VALYCM= "+YcmVal);   
  }
  else if (cmd=="quality") { 
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_quality(s, val);
  }
  else if (cmd=="contrast") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt(); 
    s->set_contrast(s, val);
  }
  else if (cmd=="brightness") {
    sensor_t * s = esp_camera_sensor_get();
    int val = P1.toInt();  
    s->set_brightness(s, val);  
  }   
  else {
    Feedback="Command is not defined.";
  }
  if (Feedback=="") {
    Feedback=Command;
  }
}

The original program uses this function to receive and execute the client’s slider commands, which control image characteristics (quality, brightness, contrast) and are transmitted by the client via a fetch() instruction.

In our current program, the same mechanism is also used to communicate the “center of mass” of the color target detected by the client to the ESP32 server, a feature vital to robotics applications.

Other than extracting the x and y center-of-mass coordinates and printing them (the cm command), there are no other changes to the server program.
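For example, if the client detects a centroid at roughly x = 120 and y = 80 (values chosen here purely for illustration), it issues the following request, which getCommand() splits into cmd, P1, and P2 before ExecuteCommand() prints the values to the Serial Monitor:

// Sent by the browser (see ANN:13A in index_OCV_ColorTrack.h); 120 and 80 are illustrative values
fetch(document.location.origin + '/?cm=' + 120 + ';' + 80 + ';stop');
// Resulting request: GET /?cm=120;80;stop
// Parsed by getCommand(): cmd = "cm", P1 = "120", P2 = "80"
// The trailing "stop" tells the server to close the connection right away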

Client Sketch (OpenCV.js)

Other than the image-characteristics slider routines, their data transmission to the server via fetch(), and the error routine from the original client program in the eBook referenced above, the client program here is new. It contains code devoted to applying OpenCV.js to the ESP32 Camera image transmitted to the browser (as mentioned previously, a fetch() is used to transmit the color target data to the server).

The client code is sprinkled liberally with console.log() instructions that let you see intermediate results. In Chrome, the console is opened by pressing CTRL+SHIFT+J.

Annotation 3 (ANN:3)

ANN:3 loads the latest version of OpenCV.js into the web page.

<script async src=" https://docs.opencv.org/master/opencv.js" type="text/javascript"></script>

ANN:READY

ANN:READY marks the Module callback that signals that OpenCV.js has finished initializing. Once initialization is complete, the COLOR DETECTION button can be clicked. Faster computers may not need this check, but it is included for the sake of completeness.
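Here is the corresponding code in index_OCV_ColorTrack.h:

//ANN:READY
var Module = {
  onRuntimeInitialized(){onOpenCvReady();}
}

function onOpenCvReady(){
  console.log("OpenCV IS READY!!!");
  drawReadyText();
  document.body.classList.remove("loading");
}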

OpenCVJS ready ESP32-CAM Web Server

Annotation 4 (ANN:4)

The screenshot of the client program running on Chrome shows two columns, as created by the HTML section of the code. The left column shows the original camera image, which is transmitted at approximately 1 fps. This image, with an ID of ShowImage, is the source image for the OpenCV routine in the program.

ANN:4 marks the creation of src and the logging of its characteristics: rows, cols, depth, type, and channels.

//ANN:4
let src = cv.imread(ShowImage);
arows = src.rows;
acols = src.cols;
aarea = arows*acols;
adepth = src.depth();
atype = src.type();
achannels = src.channels();
console.log("rows = " + arows);
console.log("cols = " + acols);
console.log("pic area = " + aarea);
console.log("depth = " + adepth); 
console.log("type = " + atype); 
console.log("channels = " + achannels);

RGB Color Trackbars

Below the source image are the original three image-characteristics sliders (Quality, Brightness, and Contrast), followed by the RGB Color Trackbars.

ESP32-CAM Web Server Color Tracking RGB Trackbars OpenCVJS

These are used to set limits on the range of colors allowed in the “processed” image in the CV application. The code for the trackbars is found at ANN:5 and ANN:6.

<!-----ANN:5---->
<div class="section">
<h2>RGB Color Trackbars</h2>
<table>
  <tr>
    <td>R min:&#160;&#160;&#160;<span id="RMINdemo"></span></td>
    <td><input type="range" id="rmin" min="0" max="255" value="0" class = "slider"></td>
    <td>R max:&#160;&#160;&#160;<span id="RMAXdemo"></span></td>
    <td><input type="range" id="rmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
  <tr>
    <td>G min:&#160;&#160;&#160;<span id="GMINdemo"></span></td>
    <td><input type="range" id="gmin" min="0" max="255" value="0" class = "slider"></td>
    <td>G max:&#160;&#160;&#160;<span id="GMAXdemo"></span></td>
    <td><input type="range" id="gmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
  <tr>
    <td>B min:&#160;&#160;&#160;<span id ="BMINdemo"></span></td>
    <td><input type="range" id="bmin" min="0" max="255" value="0" class = "slider"></td>
    <td>B max:<span id="BMAXdemo"></span></td>
    <td> <input type="range" id="bmax" min="0" max="255" value="50" class = "slider"></td>
  </tr>
</table>
</div>
//ANN:6
var RMAXslider = document.getElementById("rmax");
var RMAXoutput = document.getElementById("RMAXdemo");
RMAXoutput.innerHTML = RMAXslider.value;
RMAXslider.oninput = function() {
  RMAXoutput.innerHTML = this.value;
  RMAX = parseInt(RMAXoutput.innerHTML,10);
  console.log("RMAX=" + RMAX);
}
console.log("RMAX=" + RMAX);

var RMINslider = document.getElementById("rmin");
var RMINoutput = document.getElementById("RMINdemo");
RMINoutput.innerHTML = RMINslider.value;
RMINslider.oninput = function(){
RMINoutput.innerHTML = this.value;
  RMIN = parseInt(RMINoutput.innerHTML,10);
  console.log("RMIN=" + RMIN);
}
console.log("RMIN=" + RMIN);

var GMAXslider = document.getElementById("gmax");
var GMAXoutput = document.getElementById("GMAXdemo");
GMAXoutput.innerHTML = GMAXslider.value;
GMAXslider.oninput = function(){
  GMAXoutput.innerHTML = this.value;
  GMAX = parseInt(GMAXoutput.innerHTML,10);
}
console.log("GMAX=" + GMAX);

var GMINslider = document.getElementById("gmin");
var GMINoutput = document.getElementById("GMINdemo");
GMINoutput.innerHTML = GMINslider.value;
GMINslider.oninput = function(){
  GMINoutput.innerHTML = this.value;
  GMIN = parseInt(GMINoutput.innerHTML,10);
}
console.log("GMIN=" + GMIN);

var BMAXslider = document.getElementById("bmax");
var BMAXoutput = document.getElementById("BMAXdemo");
BMAXoutput.innerHTML = BMAXslider.value;
BMAXslider.oninput = function(){
  BMAXoutput.innerHTML = this.value;
  BMAX = parseInt(BMAXoutput.innerHTML,10);
}
console.log("BMAX=" + BMAX);

var BMINslider = document.getElementById("bmin");
var BMINoutput = document.getElementById("BMINdemo");
BMINoutput.innerHTML = BMINslider.value;
BMINslider.oninput = function(){
  BMINoutput.innerHTML = this.value;
  BMIN = parseInt(BMINoutput.innerHTML,10);
}
console.log("BMIN=" + BMIN);

The maximum and minimum values of red, green, and blue (RGB) are applied to the OpenCV inRange() function at ANN:7.

let high = new cv.Mat(src.rows,src.cols,src.type(),[RMAX,GMAX,BMAX,255]);
let low = new cv.Mat(src.rows,src.cols,src.type(),[RMIN,GMIN,BMIN,0]);

cv.inRange(src,low,high,mask1);
//inRange(source image, lower limit, higher limit, destination image)
    
cv.threshold(mask1,mask,THRESH_MIN,255,cv.THRESH_BINARY);
//threshold(source image,destination image,threshold,255,threshold method);

The image has four channels (RGBA), where A is the level of transparency. In this tutorial, A is set to 100% opacity, namely 255. The code is based on the fact that, besides the A plane, the image has three color planes (R, G, B), with each pixel in each plane having a value between 0 and 255. The high/low limits are applied to the corresponding color plane of each pixel.
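Conceptually, the per-pixel test performed by cv.inRange() works like the sketch below (plain JavaScript written for illustration only, not code from the project):

// Returns the mask value for one RGBA pixel, given the trackbar limits
function inRangePixel(r, g, b, a) {
  var inside = (r >= RMIN && r <= RMAX) &&
               (g >= GMIN && g <= GMAX) &&
               (b >= BMIN && b <= BMAX) &&
               (a >= 0 && a <= 255);   // alpha limits are 0 and 255, so alpha always passes
  return inside ? 255 : 0;             // 255 = white in the mask, 0 = black
}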

Note that inRange() writes to a destination image (mask1) that was created earlier in the program (ANN:8).

let M00Array = [0,];
let orig = new cv.Mat();
let mask = new cv.Mat();
let mask1 = new cv.Mat();
let mask2 = new cv.Mat();
let contours = new cv.MatVector();
let hierarchy = new cv.Mat();
let rgbaPlanes = new cv.MatVector();
    
let color = new cv.Scalar(0,0,0);

clear_canvas();

orig = cv.imread(ShowImage);
cv.split(orig,rgbaPlanes);  //SPLIT
let BP = rgbaPlanes.get(2);  // SELECTED COLOR PLANE
let GP = rgbaPlanes.get(1);
let RP = rgbaPlanes.get(0);
cv.merge(rgbaPlanes,orig);

Important: every image (Mat) created in an OpenCV program has to be deleted to avoid memory leaks (ANN:8A).

src.delete();
high.delete();
low.delete();
orig.delete();
mask1.delete();
mask2.delete();
mask.delete();
contours.delete();
hierarchy.delete();
//cnt.delete();
RP.delete();

The destination image, mask1, is not displayed by the program, although it could be. It is, however, used by the threshold() function immediately following inRange().

The threshold() function examines each pixel value of the source image and sets the corresponding destination value to either 0 or 255, depending on whether the source value is below or above the threshold. The top image in the right-hand column shows this binary image.
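A minimal sketch of that per-pixel rule (again plain JavaScript for illustration, not project code):

// cv.THRESH_BINARY: values above THRESH_MIN become 255, everything else becomes 0
function thresholdPixel(value) {
  return value > THRESH_MIN ? 255 : 0;
}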

For the sake of completeness, an invert capability has been added to the binary image. When the INVERT button on the web page is clicked, the binary image is inverted (black becomes white, white becomes black) and subsequent processing is performed on the new image. The button is bistable, so a second push returns the binary image to its original state.
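The inversion itself is a single OpenCV call at ANN:9 in index_OCV_ColorTrack.h:

//ANN:9
if(b_invert==true){
   cv.bitwise_not(mask,mask2);
}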

Target-color Probe

ESP32-CAM Color Tracking Color Probe OpenCVJS

In the screenshot, a red cap is the target in an ordinary room lit by an ordinary 60 W fluorescent lamp. The lamp emits red, green, and blue; the red cap reflects all three, but principally red. The method for measuring how much of each color is reflected is described next. It allows the RGB trackbars to be set with minimal effort, and its use is strongly advised.

ESP32-CAM OpenCVJS Original Image Mask and Color Tracking
Image courtesy of Andrew R. Sass

The method involves using the Color Probe sliders. These two sliders, X Probe and Y Probe, place a small white circular probe at a desired position in the bottom image of the right-hand column. The RGB values at this probe position are measured and used to set the inRange() RGB maximums and minimums described previously.

See ANN:9, 9A, 9B, and 9C for the code associated with this probe.
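In particular, the probe readout at ANN:9C indexes directly into the flat RGBA pixel buffer of src, using the row and col values taken from the Y Probe and X Probe sliders:

//ANN:9C
R = src.data[row * src.cols * src.channels() + col * src.channels()];
G = src.data[row * src.cols * src.channels() + col * src.channels() + 1];
B = src.data[row * src.cols * src.channels() + col * src.channels() + 2];
A = src.data[row * src.cols * src.channels() + col * src.channels() + 3];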

When the optimum values for a desired target are found using the X, Y probe and set with the trackbars, the target in the binary image is white and, ideally, the remainder of the image is black, as shown in the screenshot.

This ideal can typically be realized only when lighting conditions are closely controlled. Standard indoor room lighting is acceptable. Filters can be used for optimal results but were not used here.

Here’s another example:

ESP32-CAM OpenCVJS Original Image Mask and Color Tracking Example

Tracking

Once the binary image is deemed acceptable, the TRACKING button, which is bistable, can be clicked. ANN:10 marks the beginning of the tracking routine.

//ANN:10
if(b_tracker == true){
try{
 if(b_invert==false){

Since, as mentioned above, this article is not primarily concerned with the INVERT capability, only the b_invert == false branch is of interest.

ANN:11 marks the first step in the tracking: findContours(), the OpenCV algorithm that finds the contours of all the white objects in the binary image.

//ANN:11   
    cv.findContours(mask,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
//findContours(source image, array of contours found, hierarchy of contours
// if contours are inside other contours, method of contour data retrieval,
//algorithm method)
}
else{
  cv.findContours(mask2,contours,hierarchy,cv.RETR_CCOMP,cv.CHAIN_APPROX_SIMPLE);
}
console.log("CONTOUR_SIZE = " + contours.size());

//draw contours
if(b_contour==true){
  for(let i = 0; i < contours.size(); i++){
    cv.drawContours(src,contours,i,[0,0,0,255],2,cv.LINE_8,hierarchy,100)
  }
}

If the TRACKING button is pressed when the binary image is fully black, the instructions that depend on the findContours() output will throw exceptions; the try-catch allows the program to continue safely, posting a message in the console log and in the text box.

contours is the output of findContours() and is an array of the contours of the white object(s) found in the binary image. contours.size() returns the number of elements in that array. The hierarchy output (contours inside other contours) is not of concern here, as there will be no white objects (outlined in black) inside other white objects.

ANN:12 marks the beginning of computing the moments of the contours found.

//ANN:12
let cnt;
let Moments;
let M00;
let M10;

M00 is the zeroth moment: the “area” enclosed by a contour. In OpenCV, it is actually the number of pixels enclosed by the contour. M10 and M01 are the x- and y-coordinate-weighted pixel sums over the enclosed region.

As usual, the origin of the x,y coordinate system is at the upper-left corner of the image: x increases horizontally to the right and y increases vertically downward. Therefore, M10/M00 and M01/M00 are the x,y coordinates of the centroid of a contour in the array.
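A quick numeric illustration (the moment values below are made up purely for illustration):

// Hypothetical moments of one contour
let M00 = 400;         // 400 pixels enclosed
let M10 = 48000;
let M01 = 30000;
let x_cm = M10 / M00;  // 120
let y_cm = M01 / M00;  // 75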

ANN:13 and ANN:13A mark finding the largest-area contour in the array of contours using the MaxAreaArg() function and transmitting the centroid (x_cm, y_cm) to the ESP32 via a fetch() request.

//ANN:13
for(let k = 0; k < contours.size(); k++){
  cnt = contours.get(k); 
  Moments = cv.moments(cnt,false);
  M00Array[k] = Moments.m00;
  // cnt.delete();
}

//ANN13A
let max_area_arg = MaxAreaArg(M00Array);
console.log("MAXAREAARG = "+max_area_arg);

//let TestArray = [0,0,0,15,4,15,2];
//let TestArray0 = [];
//let max_test_area_arg = MaxAreaArg(TestArray0);
//console.log("MAXTESTAREAARG = "+max_test_area_arg);

let ArgMaxArea = MaxAreaArg(M00Array);
if(ArgMaxArea >= 0){
cnt = contours.get(MaxAreaArg(M00Array));  //use the contour with biggest MOO
//cnt = contours.get(54);
Moments = cv.moments(cnt,false);
M00 = Moments.m00;
M10 = Moments.m10;
M01 = Moments.m01;
x_cm = M10/M00;    // 75 for circle_9.jpg
y_cm = M01/M00;    // 41 for circle_9.jpg

XCMoutput.innerHTML = Math.round(x_cm);
YCMoutput.innerHTML = Math.round(y_cm);

console.log("M00 = "+M00);  
console.log("XCM = "+Math.round(x_cm));
console.log("YCM = "+Math.round(y_cm)); 

//fetch(document.location.origin+'/?xcm='+Math.round(x_cm)+';stop');
fetch(document.location.origin+'/?cm='+Math.round(x_cm)+';'+Math.round(y_cm)+';stop');

console.log("M00ARRAY = " + M00Array);

While the program is running, the centroid coordinates are printed in the Serial Monitor as well as in the console log and in the text box on the browser screen. The ESP32 can use the centroid data for tracking purposes in robotics applications.

ANN:14 marks the code that draws a blue bounding rectangle around the largest-area contour and draws the centroid of that contour. Both can be seen in the lower image in the right-hand column of the browser screen.

//ANN:14   
    
//**************min area bounding rect********************
//let rotatedRect=cv.minAreaRect(cnt);
//let vertices = cv.RotatedRect.points(rotatedRect);

//for(let j=0;j<4;j++){
//    cv.line(src,vertices[j],
//        vertices[(j+1)%4],[0,0,255,255],2,cv.LINE_AA,0);
//}
//***************end min area bounding rect*************************************


//***************bounding rect***************************
let rect = cv.boundingRect(cnt);
let point1 = new cv.Point(rect.x,rect.y);
let point2 = new cv.Point(rect.x+rect.width,rect.y+rect.height);

cv.rectangle(src,point1,point2,[0,0,255,255],2,cv.LINE_AA,0);
//*************end bounding rect***************************

//*************draw center point*********************
let point3 = new cv.Point(x_cm,y_cm);
cv.circle(src,point3,2,[0,0,255,255],2,cv.LINE_AA,0);
//***********end draw center point*********************

}//end if(ArgMaxArea >= 0)
else{
  if(ArgMaxArea==-1){ 
    console.log("ZERO ARRAY LENGTH");
  }
  else{              //ArgMaxArea=-2
    console.log("DUPLICATE MAX ARRAY-ELEMENT");
  }
}

cnt.delete();

Below the lower image in the right-hand column, a text box contains selected outputs of the program including the X, Y Probe data, the centroid coordinates, and a catch output if an exception is generated as mentioned above.

ESP32-CAM Color Tracking Output Messages X Y coordinates

Uploading the Code

After inserting your network credentials and selecting the pinout for the camera you’re using, you can upload the code.

In the Tools menu, select the following settings before uploading the code to your board.

ESP32-CAM Wrover Upload Options
  • Board: ESP32 Wrover Module
  • Flash Mode: “QIO”
  • Partition Scheme: “Huge APP (3MB No OTA/1MB SPIFFS)”
  • Flash Frequency: “80 MHz”
  • Upload Speed: “115200”
  • Core Debug Level: “None”

Testing the Program

After uploading the code, open the Serial Monitor at a baud rate of 115200. Press the on-board RST button, and the ESP IP address should be printed. In this case, the IP address is 192.168.1.95.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js IP Address

Open a browser on your local network and type the ESP32-CAM IP address.

When the page opens, open the browser console and check that OpenCV.js loads properly. The bottom-right corner of the web page should display “OpenCV.JS READY”.
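If that message never appears, the usual cause is that the opencv.js script has not finished downloading and initializing (it is a large file served from docs.opencv.org). The standard OpenCV.js hook for detecting readiness looks like the snippet below; note that the 'opencvStatus' element id is hypothetical here, not the id used in index_OCV_ColorTrack.h:

// Standard OpenCV.js readiness hook. The 'opencvStatus' id is hypothetical.
cv['onRuntimeInitialized'] = () => {
  document.getElementById('opencvStatus').innerHTML = 'OpenCV.JS READY';
  console.log('OpenCV.js is ready');
};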

Then left-click the Color Detection button in the upper left column of the browser window.

You should see a similar window and no error messages.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js

After choosing the right settings to target a color with the Target-color Probe (as explained previously), click the Tracking button.

At the same time, the centroid coordinates of the target should be displayed on the web page as well as on the ESP32-CAM Serial Monitor.

ESP32-CAM Web Server Color Tracking Demonstration OpenCV.js Arduino IDE Serial Monitor

Wrapping Up

None of the elements of the project described in this tutorial are new. The ESP32 Camera Web Server and OpenCV have each been described extensively and in detail in the literature.

The novelty here is the combination of these two technologies via OpenCV.js. The ESP32 camera, with its small size, Wi-Fi connectivity, and low cost, promises to be an interesting new front-end image-capture device for OpenCV web server applications.

Learn more about the ESP32-CAM

We hope you liked this project. Learn more about the ESP32-CAM with our tutorials:

About Andrew R. Sass

This project/tutorial was developed by Andrew R. Sass. We’ve edited the tutorial to match our tutorials’ style. Apart from some CSS, the code is the original provided by Andrew.

Author background: Andrew (“DOC”) R. Sass holds a BSEE (MIT) and an MSEE and PhD in EE (Purdue). He is a retired research engineer (integrated circuit components), a second-career retired teacher (AP Physics, Physics, Robotics), and has been a mentor of a local FIRST robotics team.




25 thoughts on “ESP32-CAM Web Server with OpenCV.js: Color Detection and Tracking”

  1. Just a suggestion to change in the code at the beginning of the loop() function:
    From: “ReceiveState=0,cmdState=1,strState=1,questionstate=0,equalstate=0,semicolonstate=0;”
    To: “ReceiveState = 0; cmdState = 1; strState = 1; questionstate = 0; equalstate = 0; semicolonstate = 0;” (Changing the commas for semicolons).

    By the way, great post. I’m still reading and working through the code, but it has been a great time. Thanks!

  2. Hi Sara, great project, thanks! As far as I know, there is also the TensorFlow framework on top of OpenCV for Computer Vision (CV). Am I wrong?
    Thanks!

  3. Hi Sara,
    to speed up the tracking I changed the timeout in line 710 to 50 ms:
    setTimeout(function(){colorDetect.click();},50);
    The tracking is much closer to real time then.
    Additionally, the values in the Serial Monitor update much faster when I comment out line 265 in the main program ( // delay(1); ). This works well after a fresh start.
    All together this is a very nice project, I am impressed. Thank you for this work.

  4. Great tutorial Sara!

    Is it possible to run this code without the web server and only get the X-axis and Y-axis values? Because I want to try to make something similar to PixyCam object tracking. Thank you.

  5. Great tutorial, and it works very well, although the camera is very slow to detect every movement. Also a quick question: how can I send the color data from the color detection?

  6. Thank you for a great article that is working well on my ESP32-CAM.

    Please can someone tell me how I can modify this line:

    https://docs.opencv.org/master/opencv.js

    So that I can read this js file from my SD card ? I want to run this offline and not have to download a huge file from the net every time.

    I need to ask because this software uses the WiFi server and not a webserver, which has methods to do this.

    Any help sincerely appreciated.

    Jim.

  7. Hi Sara
    I have read your project and it’s amazing. I am wondering if you can help me with the fruit-harvesting robot arm project. Please let me know.

  8. Hi Sara, I’m running your project using VS Code and PlatformIO. I can build and upload the project to the ESP32-CAM, but once I get and use the IP address it doesn’t show any video, only the webpage with all the settings, and it shows that OpenCV is ready. I don’t seem to find the issue; may I please get advice on where to look or possible solutions?

  9. Hi Sara,
    I tried this project and the only picture I got on the local webpage is a still image at the top left instead of a video stream, and the still image gets updated only when I click on Color Detection. Nothing displays in IMAGE MASK and IMAGE CANVAS. Please can you give me a hint on how to resolve this?

    • I have resolved the problem. I had a bad internet connection. When the internet was available and the READY sign displayed, then everything worked fine. Thanks… good project.

  10. Hi, thank you very much!
    I would like, when I detect some color, to output a voltage from one of the pins of the camera. Can I?

  11. Hi Sara
    Thank you for the great project!
    How could I fix the IP address? The third number changes after disconnecting the IP camera.
    Cheers
    Saman

  12. E (487) camera: Camera probe failed with error 0x105(ESP_ERR_NOT_FOUND)
    Camera init failed with error 0x105ets Jun 8 2016 00:22:57

    rst:0xc (SW_CPU_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
    configsip: 0, SPIWP:0xee
    clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
    mode:DIO, clock div:1
    load:0x3fff0030,len:1344
    load:0x40078000,len:13964
    load:0x40080400,len:3600
    entry 0x400805f0

