Thread
Stories related to "Fast subsets of large datasets with Pandas and SQLite" across the full archive.
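The technique named in the thread title — pushing the row filter down into SQLite so pandas only ever materializes the subset, rather than loading the whole dataset — can be sketched roughly like this (the table name, column names, and in-memory database here are illustrative stand-ins, not taken from the original post):

```python
import sqlite3

import pandas as pd

# Stand-in for a large on-disk .db file; in practice you would pass the
# path to an existing SQLite database instead of ":memory:".
conn = sqlite3.connect(":memory:")

# Seed a tiny example table so the snippet is self-contained.
pd.DataFrame(
    {
        "year": [2016, 2017, 2017, 2018],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
).to_sql("measurements", conn, index=False)

# Let SQLite do the filtering: only the matching rows reach pandas, so
# memory use scales with the subset, not the full table.
subset = pd.read_sql_query(
    "SELECT * FROM measurements WHERE year = ?",
    conn,
    params=(2017,),
)
```

An index on the filtered column (`CREATE INDEX idx_year ON measurements(year)`) makes such subset queries fast even on very large tables, which is the usual point of this pattern.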
Additional information: https://github.com/maxogden/dat/blob/master/what-is-dat.md
Also:
git repo: https://github.com/maxogden/dat
example usage commands: https://github.com/maxogden/dat/blob/master/usage.md
technical notes / supported formats: https://github.com/maxogden/dat/blob/maste...
> [..] In this talk, we present several methods that make *the large scale
> security analyses of embedded devices* a feasible task. We implemented
> those techniques in a scalable framework that we tested on real world data.
> First, we collected a large number of firmware images from Internet
> reposi...
Abstract: "An increasing amount of information today is generated, exchanged, and stored digitally. This also includes long-lived and highly sensitive information (e.g., electronic health records, governmental documents) whose integrity and confidentiality must be protected over decades or even cent...
I decided to share a CUDA kernel I wrote over 5 months ago. Nvidia's hardware and software may surprise you.
Hi HN,

I've recently spoken with two companies that mentioned the high costs of creating embeddings on their datasets for RAG applications. A PE firm shared that generating embeddings for new data rooms could cost up to $5K, limiting how often they do it.

I’m having trouble understanding wh...