Thread
Stories related to "Common Crawl" across the full archive.
You may have heard the term "code smells" lately; it seems it's being talked about frequently again. In this short post I’ll explain what they are and describe a few you may run across.
Here is a link to the [GitHub repo](https://github.com/Inaimathi/cl-notebook).
I had fun hacking this together. I hope someone else finds it useful.
[Blog post with more background](http://gerikson.com/blog/comp/Introducing-HN-LO.html).
Training Data for the Price of a Sandwich: Common Crawl’s Impact on Generative AI
(foundation.mozilla.org)