Last Week in Computer Vision #20

Who Owns the Generative AI Platform?

Can you believe we're already almost a month into 2023?

And it's already been a wild ride. Google just let go of 12K employees, sending shockwaves through the tech community. Meanwhile, generative AI wonder tools (Stability AI, Midjourney, etc.) have been hit with copyright lawsuits. I guess it's true what they say: time flies when you're having fun! 😅

On a more positive note, we've got plenty to look forward to this year. Dozens of new AI tools are being released every day. I've been experimenting, and I must admit some of them are an absolute godsend! I'll be sharing my favorite AI tools with you in the near future, so stay tuned. 🤖

Anyway, let's get to this issue of Ground Truth. 👇

🤓 Author Picks

Generative AI will have a massive impact on the software industry and beyond. A16z tries to map out the dynamics of the market and starts to answer broader questions about generative AI business models.

Google Research published a great overview of their research in 2022 and their vision for 2023.

The hype around artificial intelligence and the arguments for and against using it in products.

🎓 Insights & Learning

A machine learning model is only as good as its data. The idea of a data engine is to continually improve your training data so that you continually improve your model.
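To make the loop concrete, here is a minimal sketch of a data engine on a toy problem. Everything in it is an assumption for illustration: the one-feature "cat vs. dog" data, the nearest-centroid `train`/`predict` helpers, and the `review` function, which is a hypothetical stand-in for a human annotator. It is not taken from the linked article. Each round trains a model, measures validation accuracy, and sends the training examples the model disagrees with back for review.

```python
def train(examples):
    """Toy model: one centroid (mean feature value) per label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the given label."""
    return sum(predict(model, x) == y for x, y in examples) / len(examples)


def review(x):
    """Hypothetical stand-in for a human annotator who knows the true label."""
    return "cat" if x < 0.5 else "dog"


def data_engine(train_set, val_set, rounds=3):
    """Train, evaluate, fix suspicious labels, repeat."""
    history = []
    for _ in range(rounds):
        model = train(train_set)
        history.append(accuracy(model, val_set))
        # Send every example the model disagrees with back for review.
        train_set = [
            (x, review(x)) if predict(model, x) != y else (x, y)
            for x, y in train_set
        ]
    return history


# Noisy training set (assumed): 40% of the cats are mislabeled as dogs.
train_set = [
    (i / 100, "dog" if i % 5 < 2 else review(i / 100)) for i in range(100)
]
# Clean validation set, offset so it does not overlap the training points.
val_set = [((i + 0.5) / 100, review((i + 0.5) / 100)) for i in range(100)]

history = data_engine(train_set, val_set)
print(history)  # validation accuracy per round
```

In this toy setup the model itself never changes; accuracy rises after the first round purely because the labels got cleaner. A real data engine would swap in a real model, a real review queue, and smarter selection of which examples to review.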

Research engineers from Google and Harvard crystallized their approach to deep learning into a detailed guide, covering the important issues they have encountered in their work and their struggles with hyperparameter tuning.

This article is designed to help you prepare for the job market and get yourself noticed in the industry.

Feasible techniques for dealing with large-scale ML classification models (those with many classes).

🔁 MLOps & DataOps

How to design an image acquisition system that boosts your CV model’s accuracy and sensitivity.

Practical tips you can apply to your use case for immediate value, split into maturity levels (beginner, intermediate, and advanced).

The story of, and the takeaways from, launching an ML initiative in a fictional logistics company.

🛠️ Datasets & Libraries

YOLOv8 is the latest version of the YOLO object detection and image segmentation model developed by Ultralytics.

CloudSEN12 is a LARGE dataset (~1 TB) for cloud semantic understanding that consists of 49,400 image patches evenly spread throughout all continents except Antarctica.

A free, open pipeline on Kaggle to the machine-readable arXiv dataset: a repository of 1.7 million articles, with relevant features such as article titles, authors, categories, abstracts, full-text PDFs, and more.

A Python package that helps machine learning engineers import, normalize, save, and open image datasets more efficiently.

🔭 Research

EDICT, from Salesforce Research, is a new algorithm that enables a wide range of image edits, from local and global semantic edits to image stylization, while maintaining fidelity to the original image structure.

Muse is a text-to-image Transformer model that achieves SOTA image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space.
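For readers new to masked modeling in discrete token space, the sketch below shows only the masking step, under stated assumptions. It is not Muse's training code: the `mask_tokens` helper, the `MASK_ID` value, the codebook size of 8192, the 16-token "image", and the 50% mask ratio are all hypothetical choices made for illustration. The idea is that an image is first turned into a sequence of integer tokens, a random subset is hidden, and the model is trained to predict the hidden tokens from the visible ones.

```python
import random

MASK_ID = -1  # hypothetical id for the mask token; real tokens are >= 0


def mask_tokens(tokens, mask_ratio, rng):
    """Hide a random subset of tokens.

    Returns (masked_tokens, targets), where targets maps each masked
    position to the original token the model should predict there.
    """
    n_mask = max(1, round(mask_ratio * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    targets = {}
    for p in positions:
        targets[p] = masked[p]
        masked[p] = MASK_ID
    return masked, targets


rng = random.Random(0)
# Assumed: a 16-token "image" drawn from a codebook of 8192 entries.
image_tokens = [rng.randrange(8192) for _ in range(16)]
masked, targets = mask_tokens(image_tokens, mask_ratio=0.5, rng=rng)

print(masked)   # input the model would see
print(targets)  # positions and tokens the model would be trained to predict
```

A training loss would then be computed only at the positions listed in `targets`.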

In this paper, the authors survey the research landscape for data collection and data quality, primarily for deep learning applications. They study data validation, cleaning, and integration techniques, as well as fairness measures and unfairness mitigation techniques applied before, during, or after model training.

Thank you for reading Ground Truth. If you enjoy it, don’t be greedy, share with friends :)
