Node's package manager npm is a wonderful tool.
It handles dependencies and versions the right way. It requires simple, easy-to-write package metadata. It uses a central registry (by default), which makes installing modules easier. The central registry is CouchDB, which basically makes it completely transparent and available to everyone.
It does many things right.
But it doesn't do search that well.

    9134 % npm search orm
    npm http GET https://registry.npmjs.org/-/all/since?stale=update_after&startkey=1353539108378
    npm http 200 https://registry.npmjs.org/-/all/since?stale=update_after&startkey=1353539108378
    NAME          DESCRIPTION
    2csv          A pluggable file format converter into Co...
    abnf          Augmented Backus-Naur Form (ABNF) parsing.
    accounting    number, money and currency parsing/formatt..
    activerecord  An ORM that supports multiple database sys..
    addressit     Freeform Street Address Parser
    ... [snip] ...

What just happened here?
Here is what happened: npm search gave us all packages that contain the substring "orm". Anywhere.
You might argue that this works well with bigger words. It's true that results are slightly better with bigger words, but they're still not sorted in any meaningful way (sorting search results alphabetically isn't very meaningful).

    9144 % npm search mysql
    NAME                   DESCRIPTION
    Accessor_MySQL         A MySQL database wrapper, provide ...
    any-db                 Database-agnostic connection pool ...
    autodafe               mvc framework for node with mysql ...
    connect-mysql          a MySQL session store for connect ...
    connect-mysql-session  A MySQL session store for node.js ...
    cormo                  ORM framework for Node.js...
    ... [snip] ...

Hence one of the common activities when researching node modules is to go to the #node.js IRC channel and ask the people there for a good library that does X.
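To see why a search for "orm" returns a CSV converter and an address parser, here is a minimal sketch that compares substring matching with whole-word matching. The package names and descriptions are copied from the `npm search orm` output above. The two functions are illustrative stand-ins rather than npm's actual code, and the case-insensitive comparison is an assumption that merely agrees with that output.

```javascript
// Names and descriptions copied from the `npm search orm` output above.
const packages = [
  { name: "2csv", description: "A pluggable file format converter into Co..." },
  { name: "abnf", description: "Augmented Backus-Naur Form (ABNF) parsing." },
  { name: "accounting", description: "number, money and currency parsing/formatt.." },
  { name: "activerecord", description: "An ORM that supports multiple database sys.." },
  { name: "addressit", description: "Freeform Street Address Parser" },
];

// Substring match (assumed case-insensitive): "orm" hits "format", "Form", "Freeform", ...
function substringSearch(term, pkgs) {
  const needle = term.toLowerCase();
  return pkgs.filter((p) =>
    `${p.name} ${p.description}`.toLowerCase().includes(needle)
  );
}

// Whole-word match: split the text into words and compare each word exactly.
function wordSearch(term, pkgs) {
  const needle = term.toLowerCase();
  return pkgs.filter((p) =>
    `${p.name} ${p.description}`
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .includes(needle)
  );
}

console.log(substringSearch("orm", packages).map((p) => p.name));
console.log(wordSearch("orm", packages).map((p) => p.name));
```

The substring version returns all five packages, while the whole-word version returns only activerecord, the one package that is actually about ORMs.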
I decided to make a package that helps with this, called npmsearch. It's a command-line tool that lets you search the npm registry by keywords, and it sorts the results by relevance and by the number of downloads each package has.
Install it using npm:
    [sudo] npm install -g npmsearch

then use it from the command line:
    9147 % npmsearch mysql
    * mysql (6 15862) A node.js driver for mysql. It is written in JavaScript, does not require compiling, and is 100% MIT licensed. by Felix Geisendörfer
    * mongoose (2 28197) Mongoose MongoDB ODM by Guillermo Rauch http://github.com/LearnBoost/mongoose.git
    * patio (10 174) Patio query engine and ORM by Doug Martin git@github.com:c2fo/patio.git
    * mysql-libmysqlclient (5 1019) Binary MySQL bindings for Node.JS by Oleg Efimov https://github.com/Sannis/node-mysql-libmysqlclient.git
    * db-mysql (3 918) MySQL database bindings for Node.JS
    * sql (6 51) sql builder by brianc http://github.com/brianc/node-sql.git
    * sequelize (2 2715) Multi dialect ORM for Node.JS by Sascha Depold

If you want to try it out without installing it, you can try it online, or you can visit the project page on GitHub.
The keyword search it implements is non-trivial: it applies the Porter stemmer to the keywords and expands the set you provide with statistically picked, commonly co-occurring keywords (e.g. mongo expands to mongo mongodb).
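The expansion step can be sketched roughly as follows. This is not npmsearch's actual code. The keyword lists are made up for illustration, the 0.7 threshold is an arbitrary assumption, and the Porter stemming step is left out, so the keywords are assumed to be stemmed already. A keyword b is added to the query when it appears in a large enough share of the packages that also carry a query keyword a.

```javascript
// Hypothetical keyword lists, standing in for the "keywords" field of registry packages.
const corpus = [
  ["mongo", "mongodb", "odm"],
  ["mongo", "mongodb", "database"],
  ["mongo", "mongodb"],
  ["mongo", "driver"],
  ["mysql", "database", "driver"],
  ["mysql", "orm"],
];

// Count how many packages carry each keyword, and how many carry each ordered pair.
function buildCounts(keywordLists) {
  const single = new Map();
  const pair = new Map();
  for (const list of keywordLists) {
    const unique = [...new Set(list)];
    for (const a of unique) {
      single.set(a, (single.get(a) || 0) + 1);
      for (const b of unique) {
        if (a !== b) {
          const key = `${a}|${b}`;
          pair.set(key, (pair.get(key) || 0) + 1);
        }
      }
    }
  }
  return { single, pair };
}

// Add keyword b when count(a, b) / count(a) reaches the (assumed) threshold.
function expandKeywords(keywords, counts, threshold = 0.7) {
  const expanded = new Set(keywords);
  for (const a of keywords) {
    const total = counts.single.get(a) || 0;
    if (total === 0) continue; // unknown keyword: nothing to expand with
    for (const b of counts.single.keys()) {
      const together = counts.pair.get(`${a}|${b}`) || 0;
      if (together / total >= threshold) expanded.add(b);
    }
  }
  return [...expanded];
}

const counts = buildCounts(corpus);
console.log(expandKeywords(["mongo"], counts)); // [ 'mongo', 'mongodb' ]
console.log(expandKeywords(["mysql"], counts)); // [ 'mysql' ]
```

With this toy data, mongodb appears in three of the four packages tagged mongo, so it is added. None of the keywords that accompany mysql is frequent enough, so that query stays as it is.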
Results are sorted by a combined factor which incorporates keyword relevance and "half-lifed" downloads. You can control the importance of each factor in the sorting process using command-line options - and there are many:
- relevance - how big a factor keyword relevance should be, default 2
- downloads - how big a factor the number of downloads should be, default 0.25
- halflife - the half-life of downloads in days, e.g. 7 means downloads that are 7 days old lose half of their value, default 30
- limit - number of results to display, default 7
- freshness - update the database if it is older than "freshness" days, default 1.5
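To make the half-life idea concrete, here is a small sketch of how such a score could be computed. The decay rule follows directly from the description of halflife: a download that is one half-life old counts as half a download. The way the two factors are combined (a weighted sum with log-scaled downloads) is an assumption made for illustration, and so are the `keywordRelevance` and `dailyDownloads` fields. npmsearch's real formula may differ.

```javascript
// Defaults taken from the option list above.
const defaults = { relevance: 2, downloads: 0.25, halflife: 30 };

// A download that is `ageDays` old counts as 0.5^(ageDays / halflife) downloads.
function decayedDownloads(dailyDownloads, halflife) {
  return dailyDownloads.reduce(
    (sum, d) => sum + d.count * Math.pow(0.5, d.ageDays / halflife),
    0
  );
}

// ASSUMPTION: weighted sum of keyword relevance and log-scaled decayed downloads.
// This is one plausible combination, not necessarily the one npmsearch uses.
function score(pkg, opts = defaults) {
  const popularity = Math.log1p(decayedDownloads(pkg.dailyDownloads, opts.halflife));
  return opts.relevance * pkg.keywordRelevance + opts.downloads * popularity;
}

// Hypothetical packages with equal relevance and equal raw download counts.
const fresh = { name: "fresh", keywordRelevance: 1, dailyDownloads: [{ ageDays: 0, count: 100 }] };
const stale = { name: "stale", keywordRelevance: 1, dailyDownloads: [{ ageDays: 30, count: 100 }] };

console.log(decayedDownloads(fresh.dailyDownloads, 30)); // 100
console.log(decayedDownloads(stale.dailyDownloads, 30)); // 50
console.log(score(fresh) > score(stale)); // true
```

The two packages have the same raw download count, but the one whose downloads are a full half-life old ends up with half the effective downloads and therefore ranks lower.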
Have fun!