What is Training Data Labeling for Computer Vision AI?

Computer vision has a very wide range of applications. A computer’s ability to understand a construction site is very different from its ability to understand a bird’s-eye view of a farm. The source used to capture the data plays an important role in the quality of the data. For computers to understand image data, they need to be fed millions of samples of each specific type of data. Unfortunately, feeding in millions of samples is not sufficient on its own. It is also very important to accurately mark the position of the various objects in these very large datasets.

The level of accuracy required in labeled data varies from application to application. Factors include a) the complexity of the scene, b) the number of conflicting objects present in the scene, and c) the volume of data available for training.

Complexity of Scene

Some scenes are very simple, with objects that are easily distinguishable from each other and very little overlap among individual items. For example, in the image shown below, the cars, road, driving lanes, trees, and sky are easily distinguishable from one another.

Low Complexity Scene With Accurate Labels

Number of Conflicting Objects

When an object of one type occludes an object of a different type, the data for the occluded object is incomplete. In this scenario, inaccurate labeling can confuse the neural network. In the image shown below, a walking person is occluding the car, and hence the car’s features cannot be used for training the model. In such cases it is important not to give confusing labels as input to the neural network.

Occlusion of a car by a walking person. (Image subject to copyright)
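One possible way to avoid confusing labels in an occluded scene can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular labeling tool: the label ids, the `min_visible` threshold, and the function names are hypothetical, and objects are simplified to rectangular boxes that lie inside the image. Pixels hidden by the person are labeled as the person, and if too little of the car remains visible, its remaining pixels are marked with an "ignore" label so they can be excluded from training.

```python
# Hypothetical label ids; IGNORE marks pixels that should be skipped in training.
BACKGROUND, CAR, PERSON, IGNORE = 0, 1, 2, 255


def box_pixels(box):
    """Return the set of (x, y) pixels inside a half-open box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return {(x, y) for y in range(y0, y1) for x in range(x0, x1)}


def label_occluded_scene(width, height, car_box, person_box, min_visible=0.5):
    """Build a per-pixel label map in which the person occludes the car.

    Boxes are assumed to lie inside the image. If less than `min_visible`
    of the car is visible, its visible pixels are labeled IGNORE, not CAR.
    """
    car = box_pixels(car_box)
    person = box_pixels(person_box)
    visible_car = car - person  # car pixels not hidden by the person
    visible_fraction = len(visible_car) / len(car) if car else 0.0
    car_label = CAR if visible_fraction >= min_visible else IGNORE

    labels = [[BACKGROUND] * width for _ in range(height)]
    for x, y in visible_car:
        labels[y][x] = car_label
    for x, y in person:  # the occluding object keeps its own label
        labels[y][x] = PERSON
    return labels, visible_fraction
```

The threshold is a design choice that depends on the application; a training loop would then need to skip pixels carrying the ignore label when computing its loss.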

Volume of Available Training Data

The volume of available training data plays a very important role in training neural networks with high prediction accuracy. If the volume of data for a specific scenario is small, as in the case of radiology, it is important to feed accurate labels while training the network. The image below shows a comparison of pixel-accurate labels of liver cancer on the left with a low-accuracy label on the right.
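The gap between a pixel-accurate label and a low-accuracy one can be quantified, for example, with intersection over union (IoU). The sketch below is a minimal illustration, assuming masks are stored as lists of rows of 0/1 values and that the low-accuracy label is simply the bounding box of the true region; the function names are hypothetical.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks of equal size."""
    intersection = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


def bounding_box_mask(mask):
    """Return a coarse label: the filled bounding box of the mask's foreground."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    coarse = [[0] * len(row) for row in mask]
    if not ys:
        return coarse
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            coarse[y][x] = 1
    return coarse


# A small diamond-shaped region stands in for a pixel-accurate label.
precise = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
coarse = bounding_box_mask(precise)
print(iou(precise, coarse))  # 13 foreground pixels out of 25 in the box: 0.52
```

In this toy case, almost half of the pixels in the coarse label do not belong to the region, which is the kind of noise that matters most when only a small number of training samples are available.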

TrainingData.io: AI Assisted Labeling for Computer Vision Training Data @ Scale

The team at TrainingData.io is dedicated to bringing quality control to the labeling workflow. High-precision labeling software helps in creating high-quality annotations. Check out our solution here.

Precision in labeling is enabled with features like:

  1. Superpixel image segmentation with brush and eraser.
  2. Growth tools on a region of interest.
  3. Accurate freehand tool with sculptor.
  4. Polygon tool with advanced editing.
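To give a feel for what a growth tool on a region of interest might do, here is a minimal region-growing sketch. It is not the TrainingData.io implementation, which is not described here; it assumes a grayscale image stored as a list of rows, a hypothetical `tolerance` parameter, and 4-connected neighbors. Starting from a seed pixel clicked by the annotator, it adds neighboring pixels whose intensity is close to the seed's.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a region from `seed` (x, y) over 4-connected pixels whose
    intensity differs from the seed intensity by at most `tolerance`."""
    height, width = len(image), len(image[0])
    seed_x, seed_y = seed
    seed_value = image[seed_y][seed_x]
    region = {(seed_x, seed_y)}
    queue = deque([(seed_x, seed_y)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and (nx, ny) not in region:
                if abs(image[ny][nx] - seed_value) <= tolerance:
                    region.add((nx, ny))
                    queue.append((nx, ny))
    return region


image = [
    [10, 12, 90],
    [11, 13, 95],
    [80, 85, 14],
]
# The pixel with value 14 is similar to the seed but not connected to it,
# so it stays outside the grown region.
region = grow_region(image, seed=(0, 0), tolerance=5)
```

An annotator would typically adjust the tolerance interactively and then refine the result with brush, eraser, or polygon edits.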