TensorFlow and PyTorch are arguably the most popular deep neural network (DNN) frameworks to date, and they remain the status quo for deploying models in production. Unfortunately, they are not always the best-performing frameworks for running inference on your DNN models. In fact, a plethora of other inference engines have been developed in recent years, such as OpenVINO, CUDA, and TensorRT, all trying to improve the inference speed of your models on various hardware platforms.
Although development along these lines is very welcome and great for the ML community, the caveat is that all of these…
ToriML is a commercial open-source software (COSS) company building tools that help developers and enterprises accelerate inference and deploy ML models.