DM36x Face Detection

From RidgeRun Developer Connection
Revision as of 09:50, 24 July 2013 by Tfischer (Talk | contribs)


Introduction

A face detect video is the simplest way to see what face detection is all about.

DM36x devices with a part number ending in "F" (e.g. DM368ZCEF) include a face detection hardware module. The face detection module can detect multiple faces in a QVGA image/video frame generated by the DM36x VPFE. It detects faces with an inclination of up to +/- 45°, and a face direction of up to +/- 30° in the vertical direction and +/- 60° in the horizontal direction, with a maximum detection count of 35 faces. For each face, the module reports the size, angle, position, and a confidence level that the area contains a face.

RidgeRun sells Face Detect as an SDK add-on.

Figure 1. Supported positions of face. (Source: DM36x Face Detection)

Face detection is supported using a Linux driver which controls access to the face detect hardware. The face detect driver is built as a loadable module. The driver only accepts Y (luma) input data at 320x240 resolution and provides configurable parameters to define the hardware functionality. The driver handles the interrupts generated by the face detection hardware when frame processing completes.

To simplify face detection, a GStreamer plugin, dm365facedetect, can be used. The GStreamer dm365facedetect element adds face detection functionality to video pipelines, mediating between capture buffers, the face detection driver, and user applications. dm365facedetect provides the following functionality:

  • Initialization: includes face detection driver open/close and memory allocation.
  • Configuration: handles driver configuration and scaling setup.
  • Execution: scales and copies each input buffer's Y data to fit driver requirements, executes the face detection algorithm, and gets the resulting face information.
  • Drawing: as an optional functionality, draws a square around each detected face.

This element emits a signal each time at least one face is detected. The data sent with the signal includes each face's size (width and height), angle, position (top-left point), and confidence level. A confidence level threshold can be defined, allowing automatic discarding of detected faces that do not meet the threshold.

References

Linux driver

The face detect driver uses four structures and supports four IOCTLs. All four IOCTLs are passed the same data structure, namely struct facedetect_params_t.

struct facedetect_inputdata_t

Specifies the configuration of the face detection hardware.

struct facedetect_inputdata_t {
    char enable;                        /* Facedetect: TRUE: Enable, FALSE: Disable */
    char intEnable;                     /* Interrupt: TRUE: Enable, FALSE: Disable */
    unsigned char *inputAddr;           /* Picture data address in SDRAM */
    unsigned char *workAreaAddr;        /* Work area address in SDRAM */
    unsigned char direction;            /* Direction : UP, RIGHT, LEFT */
    unsigned char minFaceSize;          /* Min face size 20, 25, 32, 40 pixels */
    unsigned short inputImageStartX;    /* Image start X */
    unsigned short inputImageStartY;    /* Image start Y */
    unsigned short inputImageWidth;     /* Image Width */
    unsigned short inputImageHeight;    /* Image Height */
    unsigned char ThresholdValue;       /* Face detect Threshold value */
};

struct facedetect_position_t

Describes a single face detect result.

struct facedetect_position_t {
    unsigned short resultX;
    unsigned short resultY;
    unsigned short resultConfidenceLevel;
    unsigned short resultSize;
    unsigned short resultAngle;
};

struct facedetect_outputdata_t

Provides an array of face detect results.

struct facedetect_outputdata_t {
    struct facedetect_position_t face_position[35];
    unsigned char faceCount;
};

struct facedetect_params_t

Passes in the configuration and a buffer to hold results. Used with the FACE_DETECT_SET_HW_PARAM IOCTL.

struct facedetect_params_t {
    struct facedetect_inputdata_t inputdata;
    struct facedetect_outputdata_t outputdata;
};

IOCTL - FACE_DETECT_SET_HW_PARAM

Accepts struct facedetect_params_t to configure the face detection hardware and a buffer to hold the results.

IOCTL - FACE_DETECT_EXECUTE

Accepts struct facedetect_params_t. Fills in the outputdata with the results of running face detection on the buffer. inputdata is unused. You must first call FACE_DETECT_SET_HW_PARAM.

IOCTL - FACE_DETECT_SET_BUFFER

Accepts struct facedetect_params_t to configure just the input buffer handling of the face detection hardware. Useful on the second and following video frames when just new input buffers need to be specified. Internally, FACE_DETECT_SET_HW_PARAM invokes this functionality.

IOCTL - FACE_DETECT_GET_HW_PARAM

Accepts struct facedetect_params_t. Fills in the inputdata with the current settings. outputdata is unused.

GStreamer dm365facedetect element

The gst-inspect tool returns the following information about the dm365facedetect element. Notice that the dm365facedetect element parameters

  • fdwidth
  • fdheight
  • fdstarty
  • fdstartx
  • min-face-size
  • face-orientation
  • threshold

directly match fields in the Linux driver's facedetect_inputdata_t structure.

The draw-square parameter provides a simple mechanism for verifying face detection is working as expected.

Factory Details:
  Long name:	DM365 Face Detection
  Class:	Hardware
  Description:	Elements that detect faces and sends its coordinates
  Author(s):	Melissa Montero <<melissa.montero@ridgerun.com>>
  Rank:		primary (256)

Plugin Details:
  Name:			TICodecPlugin
  Description:		Plugin for TI xDM-Based Codecs
  Filename:		/usr/lib/gstreamer-0.10/libgstticodecplugin.so
  Version:		0.10.0.1
  License:		LGPL
  Source module:	gstticodecplugin
  Binary package:	TI / RidgeRun
  Origin URL:		http://www.ti.com/, http://www.ridgerun.com

GObject
 +----GstObject
       +----GstElement
             +----GstBaseTransform
                   +----GstDm365Facedetect

Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw-yuv
                 format: NV12
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw-yuv
                 format: NV12
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]


Element Flags:
  no flags set

Element Implementation:
  Has change_state() function: gst_dm365facedetect_change_state
  Has custom save_thyself() function: gst_element_save_thyself
  Has custom restore_thyself() function: gst_element_restore_thyself

Element has no clocking capabilities.
Element has no indexing capabilities.
Element has no URI handling capabilities.

Pads:
  SRC: 'src'
    Implementation:
      Has getrangefunc(): gst_base_transform_getrange
      Has custom eventfunc(): gst_base_transform_src_event
      Has custom queryfunc(): gst_base_transform_query
        Provides query types:
		(1):	position (Current position)
      Has custom iterintlinkfunc(): gst_pad_iterate_internal_links_default
      Has getcapsfunc(): gst_base_transform_getcaps
      Has acceptcapsfunc(): gst_base_transform_acceptcaps
    Pad Template: 'src'
  SINK: 'sink'
    Implementation:
      Has chainfunc(): gst_base_transform_chain
      Has custom eventfunc(): gst_base_transform_sink_event
      Has custom queryfunc(): gst_base_transform_query
        Provides query types:
		(1):	position (Current position)
      Has custom iterintlinkfunc(): gst_pad_iterate_internal_links_default
      Has bufferallocfunc(): gst_base_transform_buffer_alloc
      Has getcapsfunc(): gst_base_transform_getcaps
      Has setcapsfunc(): gst_base_transform_setcaps
      Has acceptcapsfunc(): gst_base_transform_acceptcaps
    Pad Template: 'sink'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: null Current: "dm365facedetect0"
  qos                 : Handle Quality-of-Service events
                        flags: readable, writable
                        Boolean. Default: false Current: false
  fdwidth             : Define width of region to apply face detection
			If no width is defined the input image's width will be use
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 Current: 0
  fdheight            : Define height of region to apply face detection
			If no height is defined the input image's height will be use
                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 Current: 0
  fdstarty            : Define the vertical start pixel of face detection's region

                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 Current: 0
  fdstartx            : Define the horizontal start pixel of face detection's region

                        flags: readable, writable
                        Integer. Range: 0 - 2147483647 Default: 0 Current: 0
  min-face-size       : Indicates the minimun face's size that can be detected in the input frame of 320x192
			0 - 20x20 pixels
			1 - 25x25 pixels
			2 - 32x32 pixels
			3 - 40x40 pixels
                        flags: readable, writable
                        Unsigned Integer. Range: 0 - 3 Default: 2 Current: 2
  face-orientation    : Indicates the face's orientation that will be detected, were an angle of
			0 degrees corresponds to the vertical axis:
			0 - Faces in the up direction (around 0 degrees)
			1 - Faces in the right direction (around +90 degrees)
			2 - Faces in the left direction (around -90 degrees)
                        flags: readable, writable
                        Unsigned Integer. Range: 0 - 2 Default: 0 Current: 0
  threshold           : Defines the threshold of detection. Possibility of detecting faces 
			goes higher with setting a lower value but the probability of false
			face detection increases
                        flags: readable, writable
                        Unsigned Integer. Range: 0 - 9 Default: 4 Current: 4
  draw-square         : Draw squares around detected faces
                        flags: readable, writable
                        Boolean. Default: false Current: false

Element Signals:
  "face-detected" :  void user_function (GstElement* object,
                                         guint arg0,
                                         gpointer arg1,
                                         gpointer user_data);

Example application

gst-facedetect-test is an example application created to test the dm365facedetect element. The application allows you to create a preview or a playback pipeline. It handles the face-detected signal emitted by dm365facedetect and prints the information for each face that is detected in the video.

The following command uses the MT9P031 sensor as the video source and directs the video to the component output. You need to make sure you have your SDK configured for component output with both input and output video sizes set to 720p. You can use make config and make the changes in the architecture menu.

The application can be found at $DEVDIR/myapps/gst-dm36x-facedetect-test. The application works for both DM365 and DM368.

After you build and install the application, you can print the built-in help information using:

facedetect_test

An example usage, with the MT9P031 sensor and output going to 720p component, is:

fbset -disable
facedetect_test camera://width=1280,height=720 "min-face-size=0 draw-square=true"

You should see output similar to:

davinci_resizer davinci_resizer.2: RSZ_G_CONFIG:0:1:124
vpfe-capture vpfe-capture: IPIPE Chained
vpfe-capture vpfe-capture: Resizer present
vpfe-capture vpfe-capture: standard not supported
Playing ....

-> Type EXIT to stop the pipeline !!!!

2 face(s) detected!
davinci_v4l2 davinci_v4l2.1: Before finishing with S_FMT:
layer.pix_fmt.bytesperline = 1280,
 layer.pix_fmt.width = 1280,
 layer.pix_fmt.height = 720,
 layer.pix_fmt.sizeimage =1382400
davinci_v4l2 davinci_v4l2.1: pixfmt->width = 1280,
 layer->layer_info.config.line_length= 1280
-------Face #0--------
Location = (800, 96)
Size = 448x480
Angle = 0
Confidence = 1
-------Face #1--------
Location = (0, 0)
Size = 0x0
Angle = 0
Confidence = 0

2 face(s) detected!
-------Face #0--------
Location = (800, 96)
Size = 448x480
Angle = 0
Confidence = 1
-------Face #1--------
Location = (64, 416)
Size = 128x128
Angle = 0
Confidence = 1

2 face(s) detected!
-------Face #0--------
Location = (800, 96)
Size = 480x480
Angle = 0
Confidence = 1
-------Face #1--------
Location = (64, 416)
Size = 128x128
Angle = 0
Confidence = 1

A real application would use the location and size information provided for each detected face. In the example application, those values are just printed to the console.

GStreamer example pipeline

To check GStreamer element functionality you can always use gst-launch.

MT9P031 to component output

The following command uses component output. You need to make sure you have your SDK configured for component output with both input and output video sizes set to 720p. You can use make config and make the changes in the architecture menu.

Here is an MT9P031 camera pipeline directing the video to component output. The dm365facedetect property for drawing squares around the detected faces is enabled:

fbset -disable
gst-launch v4l2src always-copy=false chain-ipipe=true input-src=camera ! video/x-raw-yuv, format=\(fourcc\)NV12, width=640, \
           height=480 ! dmaiaccel ! dm365facedetect draw-square=true  min-face-size=0  ! dmaiperf ! TIDmaiVideoSink \
           enable-last-buffer=false videoStd=720P_60 videoOutput=component sync=false

You can expect output similar to:

Setting pipeline to PAUSED ...
davinci_resizer davinci_resizer.2: RSZ_G_CONFIG:0:1:124
vpfe-capture vpfe-capture: IPIPE Chained
vpfe-capture vpfe-capture: Resizer present
vpfe-capture vpfe-capture: standard not supported
Pipeline is live and does not need PREROLL ...
WARNING: from element /GstPipeline:pipeline0/GstV4l2Src:v4l2src0: Failed to set norm for device '/dev/video0'.
Additional debug info:
../../../src/sys/v4l2/v4l2_calls.c(743): gst_v4l2_set_norm (): /GstPipeline:pipeline0/GstV4l2Src:v4l2src0:
system error: Invalid argument
WARNING: from element /GstPipeline:pipeline0/GstDmaiperf:dmaiperf0: Could not get/set settings from/on resource.
Additional debug info:
../../src/src/gsttidmaiperf.c(273): gst_dmaiperf_start (): /GstPipeline:pipeline0/GstDmaiperf:dmaiperf0:
Engine name not specified, not printing DSP information
WARNING: from element /GstPipeline:pipeline0/GstV4l2Src:v4l2src0: Video input device did not accept new frame rate setting.
davinci_v4l2 davinci_v4l2.1: Before finishing with S_FMT:
layer.pix_fmt.bytesperline = 1280,
 layer.pix_fmt.width = 1280,
 layer.pix_fmt.height = 720,
 layer.pix_fmt.sizeimage =1382400
davinci_v4l2 davinci_v4l2.1: pixfmt->width = 1280,
 layer->layer_info.config.line_length= 1280
Additional debug info:
../../../src/sys/v4l2/v4l2src_calls.c(342): gst_v4l2src_set_capture (): /GstPipeline:pipeline0/GstV4l2Src:v4l2src0:
system error: Invalid argument
Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO:
Timestamp: 0:01:30.801968310; bps: 0; fps: 0; 
INFO:
Timestamp: 0:01:31.827916893; bps: 14385951; fps: 31; 
INFO:
Timestamp: 0:01:32.847987936; bps: 16263529; fps: 35; 
INFO:
Timestamp: 0:01:33.868019269; bps: 16263529; fps: 35; 
INFO:
Timestamp: 0:01:34.888134102; bps: 16263529; fps: 35; 
INFO:
Timestamp: 0:01:35.908318851; bps: 16263529; fps: 35; 
INFO:
Timestamp: 0:01:36.928417643; bps: 16263529; fps: 35;

MT9P031 to H.264 encoded mp4 file

You can also use GStreamer to save the video, with boxes around the detected faces, to a file.

gst-launch v4l2src always-copy=false chain-ipipe=true input-src=camera ! video/x-raw-yuv, format=\(fourcc\)NV12, width=640, \
           height=480 ! dmaiaccel ! dm365facedetect draw-square=true min-face-size=0 ! dmaienc_h264 encodingpreset=2 targetbitrate=3600000 ! \
           dmaiperf ! queue ! qtmux ! filesink location=test.mp4

The following video shows face detection in action, captured with the above gst-launch command. The video is dark (we left auto-exposure off) to show how well the DM36x face detect hardware works.

Common errors and how to fix them

Failed to create display

If the pipeline fails after the first frame due to a Failed to create display error, as shown below, then you need to make sure you have your SDK configured for component output with both input and output video sizes set to 720p. You can use make config and make the changes in the architecture menu.

Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO:
Timestamp: 0:02:11.792950768; bps: 0; fps: 0; 
ERROR: from element /GstPipeline:pipeline0/GstTIDmaiVideoSink:tidmaivideosink0: GStreamer encountered a general stream error.
Additional debug info:
../../src/src/gsttidmaivideosink.c(1003): gst_tidmaivideosink_init_display (): /GstPipeline:pipeline0/GstTIDmaiVideoSink:tidmaivideosink0:
Failed to create display
Execution ended after 85642957 ns.