Qsv: Efficient CSV CLI Toolkit

qsv

Mentions: 14 | Stars: 2,258 | Activity: 9.9 | Language: Rust

CSVs sliced, diced & analyzed.

Thanks for the detailed feedback @snidane!
As maintainer of qsv, here's my reply:
- Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've made it check only 10% of the time.
- Pivot is in the backlog, and I'll be sure to add unpivot when I implement it (https://github.com/jqnatividad/qsv/issues/799).
- I'll add a dedicated summing command with group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has a basic sum, as @ezequiel-garzon pointed out.
- With the `enum` command, qsv can achieve what you proposed with `laminate`, e.g. `qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv`
- With the `cat rowskey` command, qsv can already concatenate files with mismatched headers.
- Other file formats: qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format is not supported yet, though; it's quite interesting, and I have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515).
- As to "enable embedding outputs of commands": qsv is composable by design, so you can use standard stdin/stdout redirection and piping to have it work with other CLI tools like jq, awk, etc.
Finally, I just released v0.120.0, which already incorporates the less aggressive self-update check: https://github.com/jqnatividad/qsv/releases/tag/0.120.0
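
To make the "concatenate files with mismatched headers" point concrete, here is a minimal Python sketch of the general idea: take the union of the column names in order of first appearance and leave missing cells empty. This is only an illustration under that assumption, not qsv's implementation; the function name `concat_mismatched` and the sample data are hypothetical.

```python
import csv
import io


def concat_mismatched(sources):
    """Concatenate CSV texts row-wise, using the union of their headers.

    Hypothetical helper for illustration; not part of qsv.
    """
    header = []  # union of column names, in order of first appearance
    rows = []
    for text in sources:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in header:
                header.append(name)
        rows.extend(reader)

    out = io.StringIO()
    # restval="" fills in cells for columns a given input file did not have
    writer = csv.DictWriter(out, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample inputs with different headers
a = "id,name\n1,alice\n"
b = "id,city\n2,Lisbon\n"
result = concat_mismatched([a, b])
print(result)
```

Rows from the first input get an empty `city` cell and rows from the second get an empty `name` cell, so no data is dropped even though the headers differ.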

vnlog

Mentions: 24 | Stars: 160 | Activity: 5.7 | Language: Perl

Process labelled tabular ASCII data using normal UNIX tools

For simple analyses (i.e. what most people do most of the time), doing this on the command line gets you there faster. I use vnlog (https://github.com/dkogan/vnlog/). By the time you've fired up your editor to write your Python code, I already have analyses and plots ready.

miller

Mentions: 63 | Stars: 8,614 | Activity: 9.0 | Language: Go

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

xsv

Mentions: 64 | Stars: 10,138 | Activity: 0.0 | Language: Rust

A fast CSV command line toolkit written in Rust.

teip

Mentions: 5 | Stars: 524 | Activity: 7.9 | Language: Rust

Masking tape to help commands "do one thing well"

citation-file-format

Mentions: 8 | Stars: 429 | Activity: 6.7 | Language: Python

The Citation File Format lets you provide citation metadata for software or datasets in plaintext files that are easy to read by both humans and machines.

I am somewhat tickled at the thought of citing everything in a malicious compliance kind of way. Given a Nix environment, it should be possible to pull down a list of every bit of code that was used to construct the OS. Would we have to differentiate between installed and executed code? My LaTeX environment probably has thousands of packages, though I might directly include only a handful of them. Even if I include a LaTeX package, it might not get executed.
The CITATION.cff format[0] is a newish format meant to solve the machine identification of citable works, but I suspect it is too new to have seen widespread adoption. It is going to take some backbreaking regexes to extract "How to Cite" sections embedded in READMEs and buried in the source.
[0] https://citation-file-format.github.io/
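
As a rough illustration of the regex approach mentioned above, the sketch below pulls a "How to Cite" section out of a Markdown README. It assumes the section is introduced by a `#`-style heading whose title starts with "How to Cite", "Citation", or "Citing"; real READMEs vary far more than that, which is the brittleness the comment is pointing at. The pattern, the function name `extract_cite_section`, and the sample README are all hypothetical.

```python
import re

# Assumed convention: a Markdown heading that mentions citing, followed by
# body text that runs until the next heading or the end of the file.
CITE_SECTION = re.compile(
    r"^#{1,6}[ \t]*(?:how to cite|citation|citing)\b[^\n]*\n(.*?)(?=^#{1,6}[ \t]|\Z)",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def extract_cite_section(readme):
    """Return the body of the first citation-like section, or None."""
    match = CITE_SECTION.search(readme)
    return match.group(1).strip() if match else None


# Made-up README used only to exercise the pattern
readme = """# mytool

## Install
pip install mytool

## How to Cite
Doe, J. (2020). mytool.

## License
MIT
"""
section = extract_cite_section(readme)
print(section)
```

A README that phrases its heading differently, or that uses reStructuredText or plain text, would slip past this pattern, which is the problem a machine-readable CITATION.cff file is meant to avoid.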

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Anyone else feel like they are using Pandas as a crutch?
1 project | /r/dataengineering | 5 Mar 2023

xsv
1 project | /r/ITProTuesday | 3 Mar 2023

Using Commandline To Process CSV files
1 project | /r/programming | 14 Dec 2022

How do I delete lines in a CSV using Sed based on condition?
2 projects | /r/commandline | 26 Jul 2022

Write a program in Rust to read a CSV file and create two output CSV files – one file with odd rows and the other file with even rows from the input file
1 project | /r/rust | 17 Jun 2022

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Tags: Command-line, CSV, CLI, Tsv, Rust
Post date: 22 Dec 2023
