🦞🌯 Lobster Roll

Thread

Training Data for the Price of a Sandwich: Common Crawl’s Impact on Generative AI (foundation.mozilla.org)

Stories related to "Training Data for the Price of a Sandwich: Common Crawl’s Impact on Generative AI" across the full archive.

12 Common Mistakes while Backing Up Databases (ib-aid.com)
A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database (2009) (sigmod09.org)
Common Rails Idioms that Kill Database Performance (blog.honeybadger.io)
Transactional Data Operations in PostgreSQL Using Common Table Expressions (rob.conery.io)
Wikidata and Mundaneum - The Triumph of the Commons (schmud.de)
Immutable Persistent Data Structures in Common Lisp (blog.thezerobit.com)
Rich Hickey: Deconstructing the Database (66 min. talk) (youtube.com)
Labrador is a SQL and Mongo development database browser (chrismccord.github.com)
Spanner: Google's Globally-Distributed Database (research.google.com)
Handling Database Failover at Craigslist (blog.zawodny.com)
Querying PostgreSQL datatypes in ActiveRecord with postgres_ext (reefpoints.dockyard.com)
Test Driving Database Indexes (myronmars.to)
pgModeler - PostgreSQL Database Modeler (pgmodeler.com.br)
MDCC: Multi-Data Center Consistency (mdcc.cs.berkeley.edu)
Exploring SQLite Internals Part I: the Virtual Database Engine (coderweekly.com)
metamx/druid - Fast, Distributed Column-oriented Datastore Optimized for Analysis (github.com)
How I Learned to Stop Worrying and Love Automated Database Failover (braintreepayments.com)
A vendor-independent comparison of NoSQL databases: Cassandra, HBase, MongoDB, Riak (networkworld.com)
RethinkDB - a distributed document database for scalable web applications (rethinkdb.com)
CodernityDB: pure Python, NoSQL, fast database (labs.codernity.com)
Big Data and the US Presidential Campaign (youtube.com)
Sharding your database - A high level overview (craigkerstiens.com)
Netflix Queue: Data migration for a high volume web application (techblog.netflix.com)
Fixing Your Database Connections in Django (craigkerstiens.com)
Introducing Tabula - Upload a PDF, get back tabular CSV data (source.mozillaopennews.org)
How FriendFeed uses MySQL to store schema-less data (2009) (backchannel.org)
Landmark Steps to Liberate [US Government] Open Data (whitehouse.gov)
PostgreSQL as a Schemaless Database (thebuild.com)
EJDB embedded database (ejdb.org)