A machine learning model learns its features from the contents of the images and videos that make up its Training Data. For deep learning to reach high accuracy on visual data, high-quality curation of Training Data is essential. Because curation is a manual task, it is usually done with outside help, often a freelance workforce.

Training Data as a Competitive Advantage: Training Data is an important part of any AI team’s intellectual property. In a world where machine learning model architectures are becoming standardized, an AI team’s competitive advantage depends on protecting the security of its Training Data. Losing Training Data to rivals is equivalent to a major loss of IP.

Regulation and Compliance: Another reason to protect Training Data is to meet regulatory and compliance requirements. In the case of medical imaging data, HIPAA compliance can make it an important requirement that the data stays within the secure network of an institution.

How to secure Training Data: At TrainingData.io we help AI teams run best-in-class annotation tools using Docker and VPN. Managing hundreds of datasets, projects, and collaborators can be very difficult. TrainingData.io allows AI teams to manage metadata in the cloud while keeping the image data protected within the confines of the local network.
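
One way to picture the split between cloud metadata and local image data is a manifest entry that describes an image and points to an address reachable only inside the local network, without carrying any pixel data. The following is a minimal sketch under assumptions, not TrainingData.io's actual schema: the function `build_manifest_entry`, the field names, and the host name `annotation.internal.example` are all hypothetical.

```python
import hashlib


def build_manifest_entry(file_name: str, data: bytes, dataset: str, local_host: str) -> dict:
    """Describe an image for a cloud-side manifest without including its pixels.

    All field names here are illustrative assumptions, not a real schema.
    """
    return {
        "dataset": dataset,
        "file_name": file_name,
        "size_bytes": len(data),
        # A checksum lets the cloud side detect changes without seeing the image.
        "sha256": hashlib.sha256(data).hexdigest(),
        # The URL resolves only inside the local network (hypothetical host).
        "local_url": f"http://{local_host}/{dataset}/{file_name}",
    }


entry = build_manifest_entry(
    "scan_001.png", b"raw image bytes", "chest-xray", "annotation.internal.example"
)
```

Only `entry` would be sent to the cloud in this sketch; the image bytes themselves never leave the machine that computed the checksum.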

Annotators (freelancers) connect to the AI team’s network over a VPN connection. Once they are connected to the local network, the annotation tool, which is packaged as a Docker image, can be accessed on that network.
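
As an illustration of the access rule described above, the host serving the annotation tool could check that a request comes from an address the VPN assigned before answering it. This is only a sketch of one possible guard, using Python's standard `ipaddress` module; the subnet `10.8.0.0/24` and the function name `is_vpn_client` are assumptions made for the example, not details of TrainingData.io's setup.

```python
import ipaddress

# Assumed address range handed out by the VPN server (hypothetical value).
VPN_SUBNET = ipaddress.ip_network("10.8.0.0/24")


def is_vpn_client(client_ip: str) -> bool:
    """Return True only if the client address belongs to the VPN subnet."""
    try:
        return ipaddress.ip_address(client_ip) in VPN_SUBNET
    except ValueError:
        # Malformed addresses are treated as outside the VPN.
        return False
```

In practice this kind of restriction is often enforced at the firewall or VPN server rather than in application code; the function only makes the rule explicit.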

On-Premises Annotation Infrastructure