Wednesday, August 18, 2010

CouchDB: Using List Functions to sort Map/Reduce-Results by Value

I just found out that it is possible to sort the result of Map/Reduce with a list function.

Take a simple example: you want to count all documents, grouped by a field called type. The following map function emits the value of the type field of each document, together with a count of 1:

function(doc) {
  emit(doc.type, 1);
}

To sum up the documents with the same value in the type field, we just need this well-known reduce function:

function(key, values) {
  return sum(values)
}
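
To make the effect of these two functions concrete, here is a minimal sketch that imitates the map and reduce steps in plain JavaScript, outside of CouchDB. The sample documents, the countByType helper and the local sum and emit stand-ins are all assumptions made for illustration; inside CouchDB, emit and sum are provided by the view server.

```javascript
// Plain-JavaScript imitation of the map and reduce steps (not CouchDB itself).
// The sample documents and helper names are made up for illustration.
var docs = [
  { type: "post" },
  { type: "comment" },
  { type: "post" },
  { type: "user" },
  { type: "comment" },
  { type: "post" }
];

// Stand-in for the sum() helper that CouchDB provides to reduce functions.
function sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

// Runs the map step over all docs, groups the emitted values by key,
// then reduces each group, roughly what a query with group=true returns.
function countByType(docs) {
  var groups = {};
  function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
  }
  docs.forEach(function (doc) {
    emit(doc.type, 1); // body of the map function
  });
  // Rows come back ordered by key, not by value.
  return Object.keys(groups).sort().map(function (key) {
    return { key: key, value: sum(groups[key]) }; // body of the reduce function
  });
}

var grouped = countByType(docs);
console.log(JSON.stringify(grouped));
```

The rows are ordered by key (comment, post, user), not by count.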

By default, CouchDB returns the result ordered by key. If you want to order it by the number of occurrences of each type instead, you either have to sort it in your app or use a list function like this:

function(head, req) {
  var row;
  var rows = [];
  // Collect all rows of the view result
  while (row = getRow()) {
    rows.push(row);
  }
  // Sort by value, highest count first
  rows.sort(function(a, b) {
    return b.value - a.value;
  });
  send(JSON.stringify({"rows": rows}));
}
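
If you want to try the sorting logic without a running CouchDB, the following sketch wraps the same loop in a small test harness. The runSortList helper and the local getRow and send stand-ins are hypothetical; they only imitate what CouchDB passes to a list function, assuming getRow returns nothing once the rows are exhausted.

```javascript
// Runs the body of the list function against a fixed array of rows.
// getRow and send are local stand-ins for the functions CouchDB provides.
function runSortList(inputRows) {
  var queue = inputRows.slice();
  var output = "";
  function getRow() {
    return queue.shift(); // undefined once all rows are consumed
  }
  function send(chunk) {
    output += chunk;
  }

  // Same logic as the list function in the post
  var row;
  var rows = [];
  while ((row = getRow())) {
    rows.push(row);
  }
  rows.sort(function (a, b) {
    return b.value - a.value; // descending by value
  });
  send(JSON.stringify({ rows: rows }));

  return output;
}

var sorted = runSortList([
  { key: "comment", value: 2 },
  { key: "post", value: 3 },
  { key: "user", value: 1 }
]);
console.log(sorted);
```

Note that the list function collects every row in memory before sorting, so this approach is best suited to view results of moderate size.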

If you save the list function as sort and the map and reduce functions as a view named count, both in the same design document, you can fetch the sorted result like this:

curl 'http://.../design-doc/_list/sort/count?group=true'
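
Because a design document is JSON, the function sources have to be stored as strings, with double quotes and newlines escaped. One way to avoid doing that by hand is to write the functions as ordinary JavaScript and let JSON.stringify do the escaping. This is only a sketch: the design document name _design/example is a placeholder, and emit, sum, getRow and send are never called here, they only appear in the function source.

```javascript
// Write the functions as normal JavaScript, then serialize their source.
var mapFn = function (doc) {
  emit(doc.type, 1);
};

var reduceFn = function (key, values) {
  return sum(values);
};

var listFn = function (head, req) {
  var row;
  var rows = [];
  while ((row = getRow())) {
    rows.push(row);
  }
  rows.sort(function (a, b) {
    return b.value - a.value;
  });
  send(JSON.stringify({ "rows": rows }));
};

// Design document with one view ("count") and one list function ("sort").
var designDoc = {
  _id: "_design/example", // placeholder name
  views: {
    count: { map: mapFn.toString(), reduce: reduceFn.toString() }
  },
  lists: {
    sort: listFn.toString()
  }
};

// JSON.stringify escapes the inner double quotes and the newlines.
var designDocJson = JSON.stringify(designDoc, null, 2);
console.log(designDocJson);
```

The printed JSON can be saved to a file and uploaded with curl or a similar tool.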

Of course, there are other options for sorting a view result. I didn't find much documentation on this topic, but this thread on Stack Overflow is very informative.

Back to the couch - Cheers!

11 comments:

  1. Good stuff, keep these posts coming!

    Thank you,
    Dennis

  2. This list function is exactly what I was looking for. Unfortunately, when I tried to implement it, I got the following error:

    {"error":"compilation_error","reason":"Expression does not eval to a function...}

    I implemented it exactly as you listed above and tried to access it via browser at the URL you have listed. Any ideas on what is missing/wrong?

    Thanks
    Chris
    csj790@yahoo.com

  3. Hi Chris,
    I guess it's an issue with quoting. Did you implement the list function in your browser using Futon? Then you have to escape the two double quotes.

    Here is a copy and paste version:

    {
    "sort": "function(head, req) {
    var row
    var rows=[]
    while(row = getRow()) {
    rows.push(row)
    }
    rows.sort(function(a,b) {
    return b.value-a.value
    })
    send(JSON.stringify({\"rows\" : rows}))
    }
    "
    }

    BTW, I use CouchDB in version 1.0.1.

    cheers,
    Arbo

  4. I know the rendering is probably not going to work, but I have included a snapshot of my design doc. Based on this doc, I am requesting
    ..._design/example/_list/sort/srcIP?group=True

    I am also using CouchDB 1.0.1

    {
    "_id" : "_design/example",
    "_rev" : "27-514ca602f74c626c4e3cb331520e32b7",

    "views" : {
    "srcIP" : {
    "map" : "function(doc) {
    emit(doc.in_srcIP, 1);
    }",

    "reduce" : "function(key, values, rereduce) {
    return sum(values);
    }"
    },

    "dstIP" : {
    "map" : "function(doc) {
    emit(doc.in_dstIP, 1);
    }",

    "reduce" : "function(key, values, rereduce) {
    return sum(values);
    }"
    }
    },

    "lists" : {
    "sort" : "function(head, req) {
    var row
    var rows=[]
    while(row = getRow()) {
    rows.push(row)
    }
    rows.sort(function(a,b) {
    return b.value-a.value
    })
    send(JSON.stringify({\"rows\" : rows}))
    }"
    }
    }

  5. Ok, I got it. The newlines were not properly quoted. Try to paste the following:

    {
    "sort": "\u000afunction(head, req) { \u000a var row\u000a var rows=[]\u000a while(row = getRow()) { \u000a rows.push(row) \u000a } \u000a rows.sort(function(a,b) { \u000a return b.value-a.value\u000a }) \u000a send(JSON.stringify({\"rows\" : rows}))\u000a}"
    }

    Editing list functions in Futon is not very relaxing. Try using a text editor and curl or something similar.

    cheers,
    Arbo

  6. This comment has been removed by the author.

  7. That worked! Thank you tremendously!

  8. I found the root cause of my problem. When I imported the doc into Couch, it didn't create a \n or line break after each line of the list function. To separate the statements, I had to put a separator (;) after each one, which resulted in the following list function:

    "lists":{
    "sort":"function(head, req) {
    var row;
    var rows=[];
    while(row = getRow()) {
    rows.push(row)
    };
    rows.sort(function(a,b) {
    return b.value-a.value
    });
    send(JSON.stringify({\"rows\" : rows}))
    }"

  9. This comment has been removed by the author.

  10. Exactly what I was looking for!

  11. Thanks for sharing
