This patch renames the InferenceItem to LastLevelTaskItem in the
three backends to avoid confusion among the meanings of these structs.
The following are the renames done in this patch:
1. extract_inference_from_task -> extract_lltask_from_task
2. InferenceItem -> LastLevelTaskItem
3. inference_queue -> lltask_queue
4. inference -> lltask
5. inference_count -> lltask_count
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Remove async flag from filter's perspective after the unification
of async and sync modes in the DNN backend.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit unifies the async and sync mode from the DNN filters'
perspective. As of this commit, the Native backend only supports
synchronous execution mode.
Now the user can switch between async and sync mode by using the
'async' option in the backend_configs. The values can be 1 for
async and 0 for sync mode of execution.
This commit affects the following filters:
1. vf_dnn_classify
2. vf_dnn_detect
3. vf_dnn_processing
4. vf_sr
5. vf_derain
This commit also updates the filters vf_dnn_detect and vf_dnn_classify
to send only the input frame and send NULL as output frame instead of
input frame to the DNN backends.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit adds the case handling if the asynchronous execution
of a request fails by checking the exit status of the thread when
joining before starting another execution. On failure, it does the
cleanup as well.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
The frame allocation and filling the TaskItem with execution
parameters is common in the three backends. This commit shifts
this logic to dnn_backend_common.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Since requests are running in parallel, there is inconsistency in
the status of the execution. To resolve it, we avoid using mutex
as it would result in single TF_Session running at a time. So add
TF_Status to the TFRequestItem
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This patch adds error handling for cases where the execute_model_tf
fails, clears the used memory in the TFRequestItem and finally pushes
it back to the request queue.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit enables async execution in the TensorFlow backend
and adds function to flush extra frames.
The async execution mechanism executes the TFInferRequests on
a separate thread which is joined before the next execution of
same TFRequestItem/while freeing the model.
The following is the comparison of this mechanism with the existing
sync mechanism on TensorFlow C API 2.5 CPU variant.
Async Mode: 4m32.846s
Sync Mode: 5m17.582s
The above was performed on super resolution filter using SRCNN model.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit adds a function for execution of TFInferRequest and documentation
for functions related to TFInferRequest.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
The reasons for including them don't exist any longer: ff_tlog() has
been moved to libavutil/internal.h and FF_QSCALE_TYPE_* has been moved
to qp_table.h.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not used here at all; instead, add it where it is used without
including it or any of the arch-specific CPU headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit adds handling for cases where an error may occur, clearing
the allocated memory resources.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit rearranges the existing code to create a separate function
for the completion callback in execute_model_tf.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit rearranges the existing code to create separate function
for filling request with execution data.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit uses TFRequestItem and the existing sync execution
mechanism to use request-based execution. It will help in adding
async functionality to the TensorFlow backend later.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit introduces a typedef TFInferRequest to store
execution parameters for a single call to the TensorFlow C API.
This typedef is used in the TFRequestItem.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit uses the common TaskItem and InferenceItem typedefs
for execution in TensorFlow backend.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
duplicate ff_hex_to_data() function from avformat and rename it to
hex_to_data() as static function.
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Different function type of model requires different parameters, for
example, object detection detects lots of objects (cat/dog/...) in
the frame, and classifcation needs to know which object (cat or dog)
it is going to classify.
The current interface needs to add a new function with more parameters
to support new requirement, with this change, we can just add a new
struct (for example DNNExecClassifyParams) based on DNNExecBaseParams,
and so we can continue to use the current interface execute_model just
with params changed.
please use tools/python/tf_sess_config.py to get the sess_config after that.
note the byte order of session config is in normal order.
bump the MICRO version for the config change.
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
So the backend knows the usage of model is for frame processing,
detect, classify, etc. Each function type has different behavior
in backend when handling the input/output data of the model.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
from proc_from_frame_to_dnn to ff_proc_from_frame_to_dnn, and
from proc_from_dnn_to_frame to ff_proc_from_dnn_to_frame.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
'void *' is too flexible, since we can derive info from
AVFilterContext*, so we just unify the interface with this data
structure.
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
TensorFlow C library accepts config for session options to
set different parameters for the inference. This patch exports
this interface.
The config is a serialized tensorflow.ConfigProto proto, so we need
two steps to use it:
1. generate the serialized proto with python (see script example below)
the output looks like: 0xab...cd
where 0xcd is the least significant byte and 0xab is the most significant byte.
2. pass the python script output into ffmpeg with
dnn_processing=options=sess_config=0xab...cd
The following script is an example to specify one GPU. If the system contains
3 GPU cards, the visible_device_list could be '0', '1', '2', '0,1' etc.
'0' does not mean physical GPU card 0, we need to try and see.
And we can also add more opitions here to generate more serialized proto.
script example to generate serialized proto which specifies one GPU:
import tensorflow as tf
gpu_options = tf.GPUOptions(visible_device_list='0')
config = tf.ConfigProto(gpu_options=gpu_options)
s = config.SerializeToString()
b = ''.join("%02x" % int(ord(b)) for b in s[::-1])
print('0x%s' % b)
for some cases (for example, super resolution), the DNN model changes
the frame size which impacts the filter behavior, so the filter needs
to know the out frame size at very beginning.
Currently, the filter reuses DNNModule.execute_model to query the
out frame size, it is not clear from interface perspective, so add
a new explict interface DNNModel.get_output for such query.
suppose we have a detect and classify filter in the future, the
detect filter generates some bounding boxes (BBox) as AVFrame sidedata,
and the classify filter executes DNN model for each BBox. For each
BBox, we need to crop the AVFrame, copy data to DNN model input and do
the model execution. So we have to save the in_frame at DNNModel.set_input
and use it at DNNModule.execute_model, such saving is not feasible
when we support async execute_model.
This patch sets the in_frame as execution_model parameter, and so
all the information are put together within the same function for
each inference. It also makes easy to support BBox async inference.
Currently, every filter needs to provide code to transfer data from
AVFrame* to model input (DNNData*), and also from model output
(DNNData*) to AVFrame*. Actually, such transfer can be implemented
within DNN module, and so filter can focus on its own business logic.
DNN module also exports the function pointer pre_proc and post_proc
in struct DNNModel, just in case that a filter has its special logic
to transfer data between AVFrame* and DNNData*. The default implementation
within DNN module is used if the filter does not set pre/post_proc.
currently, output is set both at DNNModel.set_input_output and
DNNModule.execute_model, it makes sense that the output name is
provided at model inference time so all the output info is set
at a single place.
and so DNNModel.set_input_output is renamed to DNNModel.set_input
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
different backend might need different options for a better performance,
so, add the parameter into dnn interface, as a preparation.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
to support dnn networks more general, we need to know the input info
of the dnn model.
background:
The data type of dnn model's input could be float32, uint8 or fp16, etc.
And the w/h of input image could be fixed or variable.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
so, we can make a filter more general to accept different network
models, by adding a data type convertion after getting data from network.
After we add dt field into struct DNNData, it becomes the same as
DNNInputData, so merge them with one struct: DNNData.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>