Computer Vision Engineer

Full time

Responsibilities

Develop computer vision solutions with high-quality, robust, and scalable algorithms and models.
Collaborate with cross-functional teams to coordinate project activities, manage timelines, and ensure milestones are met.
Manage the training, evaluation, and fine-tuning of advanced deep learning models for tasks such as object detection, classification, and object tracking, and work with models such as CLIP and other vision-language models (VLMs).
Dockerize applications and work effectively with Git and CI/CD workflows to support smooth development, integration, and deployment.
Troubleshoot and resolve issues and bugs, while identifying opportunities to improve existing solutions for better scalability and efficiency.

Qualifications & Experience

Experience in Python programming, with proficiency in Git and Docker containerization.
Experience contributing to the development of computer vision systems for surveillance applications, including violation detection, compliance monitoring, and analytics.
Familiarity with object detection models such as YOLO, DETR, and R-CNN, as well as classification algorithms.
Experience with model conversion formats and tools such as ONNX, TensorRT, and OpenVINO, including reduced-precision optimization and quantization (FP16, INT8).
Experience with deep learning and vision frameworks such as PyTorch, TensorFlow, OpenCV, and Pillow, building systems optimized for CPU and GPU.
Ability to implement high-throughput, low-latency inference using tools such as TensorRT, DeepStream, custom CUDA kernels, and Triton Inference Server.
Good problem-solving skills and eagerness to learn new technologies and techniques.
Ability to collaborate effectively within a team and communicate technical concepts clearly.

Good to Have

Experience with fine-tuning vision-language models (VLMs).
Knowledge of advanced model quantization techniques beyond FP16 and INT8.
Familiarity with cloud platforms and deployment tools.
Understanding of data annotation and augmentation strategies for CV tasks.
Exposure to real-time video processing and multi-camera systems.