Learnings from a Machine Learning Engineer — Part 2: The Data Sets

Practical insights for a data-driven approach to model optimization

David Martin

Published in Towards Data Science

8 min read

4 days ago

Photo by Conny Schneider on Unsplash

In Part 1, we discussed the importance of collecting good image data and assigning proper labels for your image classification project to be successful. We also talked about the classes and sub-classes of your data. These may seem like pretty straightforward concepts, but it’s important to have a solid understanding of them going forward. So, if you haven’t read Part 1 yet, please check it out.

Now we will discuss how to build the various data sets and the techniques that have worked well for my application. Then, in the next part, we will dive into evaluating your models beyond simple accuracy.

I will again use the zoo animal image classification app as the example.

Data Sets

As machine learning engineers, we are all familiar with the train-validation-test sets. But when we include the concept of sub-classes discussed in Part 1, incorporate the concepts discussed below to set a minimum and maximum image count per class, and add staged and synthetic data to the mix, the process gets a bit more complicated. I had to create a custom script to handle these options.
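
To make the moving parts concrete, here is a minimal sketch of what such a splitting script might look like. It is not the actual script used for this project. The function name `split_dataset`, the `(path, class_name, subclass_name)` tuple layout, the default limits, and the 80/10/10 fractions are all assumptions chosen for illustration, and staged and synthetic data are left out entirely.

```python
import random
from collections import defaultdict


def split_dataset(items, min_per_class=5, max_per_class=1000,
                  val_frac=0.1, test_frac=0.1, seed=0):
    """Split (path, class_name, subclass_name) tuples into train/val/test.

    Hypothetical helper: classes with fewer than min_per_class images are
    skipped, larger classes are randomly capped at max_per_class, and the
    split is done per sub-class so each set sees every sub-class when
    enough images exist.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible

    # Group all items by their class label.
    by_class = defaultdict(list)
    for item in items:
        by_class[item[1]].append(item)

    splits = {"train": [], "val": [], "test": []}
    skipped = []
    for cls, cls_items in sorted(by_class.items()):
        if len(cls_items) < min_per_class:
            skipped.append(cls)  # too few images to train on
            continue

        # Shuffle, then enforce the per-class maximum.
        rng.shuffle(cls_items)
        cls_items = cls_items[:max_per_class]

        # Split each sub-class separately (a simple stratified split).
        by_sub = defaultdict(list)
        for item in cls_items:
            by_sub[item[2]].append(item)
        for _, sub_items in sorted(by_sub.items()):
            n = len(sub_items)
            n_val = int(n * val_frac)
            n_test = int(n * test_frac)
            n_train = n - n_val - n_test
            splits["train"] += sub_items[:n_train]
            splits["val"] += sub_items[n_train:n_train + n_val]
            splits["test"] += sub_items[n_train + n_val:]
    return splits, skipped
```

Calling `split_dataset(items)` returns the three sets plus the list of classes that fell below the minimum, which is useful as a to-do list for collecting more images. Note that very small sub-classes end up entirely in the training set, because the validation and test counts are rounded down.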
