How to use the Dataset Shop

29 August 2022

The main purpose of the Dataset Shop is to provide legally-clean datasets to machine-learning developers, data scientists and AI researchers.

We are creating a growing resource of millions of images that have been categorized, tagged and packaged for optimal selection, augmentation and access.

This article will provide you with the key ways to use this marketplace, get support and manage your account.

Human datasets

The core content that inspired the creation of this marketplace is the collection of biometrically-clean, real-life human datasets.

The vAIsual team (creators of the Dataset Shop) has assembled a dataset of over 500,000 images covering more than 1,500 identities, shot in a photographic studio. Each model has signed a biometric model release that is GDPR and PII compliant.

The dataset includes hundreds of photographs of each person, captured from all angles and with multiple expressions.

This content will grow rapidly in variety and quantity now that the models are being shot on video.

You will find statistics on the demographic breakdown of the subjects in the dataset description. Although the dataset does not contain equal representation for all demographic characteristics, we are striving to be broad and inclusive in what we provide. If you have specific needs that are not met by the current content offering, contact us to discuss custom requests.

Pre-made datasets

The collections you encounter on the front page of the Dataset Shop are what we call pre-made datasets. This means our team has assembled specific content, bundled it together, tagged it, and presented it as a dataset product.

This selection will continually grow as new contributors to the Dataset Shop bring their collections to the store.

Each pre-made dataset contains information about the contents, resolution options, quantity of images and additional resources (such as model releases or demographic breakdowns).

Once you have found a dataset that meets your needs, you can choose to augment the dataset and re-tag it (full descriptions of these features are below).

Custom datasets

Most customers have very specific requirements for machine learning training, so the Dataset Shop offers the ability to custom build a dataset that meets those needs.

Follow the link to “Create Dataset” from the front page. 

You will then be able to search for and select the content you are looking for. The search function lets you combine keywords: select AND to compile keywords into a larger selection, or use OR to narrow the selection down.

Custom datasets cannot be purchased as a subscription because, unlike many of the pre-made datasets, a custom dataset is not updated through the year.

Image Resolution

Once you have made your selection, either from the pre-made datasets or from a custom dataset you created, you need to decide on the resolution you wish to purchase.

Resolution ranges from 128 px to 4K. Full resolution is available on request (contact us to let us know your needs).

Dataset Augmentation

You can grow your dataset size significantly by augmenting the images. The current options for augmentation include zoom, re-color and flip. These functions will automatically produce altered versions of the images in the dataset. 
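As a rough illustration of what these three augmentations do, the sketch below applies a flip, a zoom and a re-color to a tiny image held as nested lists of RGB tuples. It is not the Dataset Shop's implementation: the function names and the specific choices (horizontal flip, center-crop zoom with nearest-neighbour upscaling, grayscale re-color) are assumptions made for the example.

```python
# Illustrative only: an image is a list of rows, each row a list of (r, g, b) tuples.

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def zoom_center(image, factor=2):
    """Crop the central 1/factor of the image and scale it back up
    to the original size with nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    crop_h, crop_w = max(1, height // factor), max(1, width // factor)
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    return [
        [
            image[top + (y * crop_h) // height][left + (x * crop_w) // width]
            for x in range(width)
        ]
        for y in range(height)
    ]


def recolor_grayscale(image):
    """Replace each pixel with its average brightness (one possible re-color)."""
    result = []
    for row in image:
        new_row = []
        for r, g, b in row:
            level = (r + g + b) // 3
            new_row.append((level, level, level))
        result.append(new_row)
    return result


def augment(image):
    """Return the original plus one altered version per augmentation."""
    return [
        image,
        flip_horizontal(image),
        zoom_center(image),
        recolor_grayscale(image),
    ]
```

In this sketch each source image yields four images (the original and three variants), which is how augmentation multiplies the size of a dataset.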

Re-Tagging Function 

It’s often important to be able to re-tag images to suit your machine learning purposes. Once you have selected your dataset, you can re-tag the entire dataset with new tags. The tags will be prepared as a text file that is downloaded as part of your dataset package.
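The exact layout of the tag file is not described here, so the following is only a sketch of the idea. It assumes a plain-text file with one line per image: the file name, a tab, then a comma-separated tag list. The function name and the sample image names are hypothetical.

```python
def build_tag_file(image_names, new_tags):
    """Return tag-file text that assigns the same new tags to every image.

    Assumed layout (hypothetical): "<image name><TAB><tag1>,<tag2>,..."
    """
    # Drop blanks and duplicates while keeping the order the tags were given in.
    cleaned = []
    for tag in new_tags:
        tag = tag.strip()
        if tag and tag not in cleaned:
            cleaned.append(tag)
    tag_field = ",".join(cleaned)
    return "".join(f"{name}\t{tag_field}\n" for name in image_names)
```

For example, `build_tag_file(["person_0001.jpg", "person_0002.jpg"], ["portrait", "studio"])` produces two lines, one per image, each carrying the tags `portrait,studio`.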

One-time vs Subscription

Pre-made datasets can be purchased as a subscription so you can benefit from the additional files added throughout the year. This is particularly relevant to the vAIsual people dataset, which is growing every week with new images.

When you purchase a subscription you pay for the dataset, plus the first year of the subscription (at a 30% discount rate from the standard annual fee). After the first year, you pay an annual fee equivalent to 50% of the price of the original dataset.
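To make the arithmetic concrete, the sketch below works through the fees for a hypothetical dataset price. It assumes the "standard annual fee" is the 50% figure quoted for later years, so the discounted first year comes to 35% of the dataset price. The function name and the example price are illustrative, not actual Dataset Shop prices.

```python
from decimal import Decimal

RENEWAL_RATE = Decimal("0.50")         # annual fee: 50% of the dataset price
FIRST_YEAR_DISCOUNT = Decimal("0.30")  # 30% off the annual fee in year one


def subscription_costs(dataset_price, years):
    """Return the amount due in each year of a subscription.

    Year 1 covers the dataset itself plus the discounted first-year fee;
    every later year covers the standard annual fee only.
    """
    if years < 1:
        raise ValueError("years must be at least 1")
    price = Decimal(str(dataset_price))
    annual_fee = price * RENEWAL_RATE
    first_year_fee = annual_fee * (1 - FIRST_YEAR_DISCOUNT)
    return [price + first_year_fee] + [annual_fee] * (years - 1)
```

Under these assumptions, a dataset priced at 1,000 would cost 1,350 in the first year (1,000 plus a 350 subscription fee) and 500 in each following year.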

Custom datasets are not available for subscription because they are not updated.

Extended License

If you intend to use the datasets for generating synthetic media, you need to purchase the Extended License during the purchasing process.

Site Registration

Registering on the site gives you an account that lets you create and purchase datasets. We capture simple contact details when you register and more details during the purchase process. This information can later be edited or changed using My Account.

Account Management

You can edit or delete information from the My Account section of the site, which you will see when you log in.

Help and Support

There are many ways you can find information and support on the site. You can check the Documentation or FAQ section to see if there are answers to your questions.

If you are still unable to resolve your issue, you can contact our team directly through the Live Chat function in the lower right-hand corner of the website.

Alternatively, you can file a Help ticket so your issue is tracked and resolved by our customer support team.