Clothes dataset.
Data collection.

Data is an essential part of any ML project. The question "What data do you have?" is more important than the question "What do you need?" Two months ago, I found Alexey Grigorev's article "Clothing Dataset: Call for Action" on medium.com. He is writing a book about machine learning and asked the community to help him collect a dataset for it. We specialize in collecting CV datasets, so we decided to collaborate and create a dataset that is accessible and useful to everyone.
A few days later, we sent the first batch of 1009 images.

The most important part is the quality assessment of the first batch; only after it do we finalize the dataset's requirements. The feedback we received:
  • Some of the photos were rotated.
  • Several images did not meet the criteria.
  • The pictures had to be sorted by author so that the model could be trained and tested on different sets.
Clothes dataset sample
Feedback from Alexey on some images that did not satisfy the requirements.
The subsequent batches were collected with this feedback taken into account.
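
The request to sort pictures by author exists so that the same person's photos never end up in both the training and the test set. Here is a minimal Python sketch of such a split. The function name `split_by_author`, the `(image, author)` record layout, and the 20% default are assumptions made for illustration; they do not describe the actual tooling used for this dataset.

```python
import random
from collections import defaultdict


def split_by_author(records, test_fraction=0.2, seed=0):
    """Split (image, author) pairs so that no author appears in both sets.

    Hypothetical helper: test_fraction is applied to the number of
    authors, not to the number of images.
    """
    # Group image names by the author who took them.
    by_author = defaultdict(list)
    for image, author in records:
        by_author[author].append(image)

    # Shuffle authors reproducibly, then reserve a share of them for testing.
    authors = sorted(by_author)
    random.Random(seed).shuffle(authors)
    n_test = max(1, round(len(authors) * test_fraction))
    test_authors = set(authors[:n_test])

    train = [img for a in authors if a not in test_authors for img in by_author[a]]
    test = [img for a in authors if a in test_authors for img in by_author[a]]
    return train, test, test_authors


# Usage with made-up file names: 5 authors, 2 images each.
records = [(f"img_{a}_{i}.jpg", a) for a in "abcde" for i in range(2)]
train, test, test_authors = split_by_author(records)
```

Because the split is made over authors, the test share of images only approximates `test_fraction` when authors contribute unequal numbers of photos. With a single author, every image lands in the test set.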

Let me tell you briefly about our process of collecting datasets.


STEP 1. You place an order by describing the task in free form, either through the form on our website or by email.


STEP 2. A manager contacts you to clarify the task, conditions, and terms. We usually collect a small trial batch first, 5-10% of the required volume, and send it to you for feedback and comments. Because this batch is small, only 2-3 people are involved in collecting it.


STEP 3. After receiving feedback on the trial batch, our employees start collecting the full dataset. The number of people involved depends on the specifics and complexity of the task and on the number of images required. The collected pictures are then sent to be checked for quality and compliance with the client's requirements.


STEP 4. Two people independently check the images to make sure they match your requirements. If at least one moderator rejects a picture, it is not included in your dataset.
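
The acceptance rule in Step 4 can be written down in a few lines of Python. This is only an illustrative sketch: the function name `moderate` and the shape of the `verdicts` mapping are assumptions, not a description of our verification page.

```python
def moderate(verdicts):
    """Apply the two-moderator rule to a batch of images.

    verdicts maps an image id to a pair of booleans, one per moderator
    (True means "approved"). This layout is a hypothetical example.
    """
    accepted, rejected = [], []
    for image_id, votes in verdicts.items():
        # An image is kept only if both moderators approved it;
        # a single rejection is enough to exclude it.
        if all(votes):
            accepted.append(image_id)
        else:
            rejected.append(image_id)
    return accepted, rejected


# Usage with made-up ids and votes.
accepted, rejected = moderate({
    "shirt_001.jpg": (True, True),
    "dress_002.jpg": (True, False),
    "hat_003.jpg": (False, False),
})
```

The `rejected` list is what would then be passed back to the authors for review.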


Image verification page
Clothes dataset rejecting picture
STEP 5. A rejected image is returned to its author to show that it does not meet the specified requirements.


In total, we collected 3026 images, of which only six were rejected.

Very often, the images needed to train a model simply do not exist. In most cases, we can help. We can accommodate your requirements for quality, lighting, distance to the object, number of objects in the photo, and so on. It is important to note that you get the rights to use the dataset for both commercial and non-commercial purposes, so you can be sure that you will not face legal issues in the future. If you have any questions or would like to order your own dataset, please contact us via our form or by email at hello@tagias.com.

Please also read the story of how Alexey collected the dataset and our article on how we label datasets.